ScadaBridge/docs/plans/2026-06-15-stillpending-phase1-implementation.md
Joseph Doherty 9aa1259504 docs(plans): Phase 1 (M1-M4) implementation plan for stillpending.md
Bite-sized TDD plan. M1 (runtime wiring) fully detailed across 10 tasks
after verifying the purge/reconciliation actors already exist and only
need Host wiring + a gRPC pull client + event-logger injection. M2/M3/M4
as right-sized task inventories with files, classification, and AC.
Co-located .tasks.json for executing-plans resume.
2026-06-15 09:32:14 -04:00

# stillpending.md Phase 1 (M1–M4) Implementation Plan
> **For Claude:** REQUIRED SUB-SKILL: Use superpowers-extended-cc:executing-plans to implement this plan task-by-task.
**Goal:** Stabilize ScadaBridge — make the Tier-1 silent gaps actually run, correct the Tier-2 behavioral divergences, and reconcile the Tier-4 doc↔code drift, per `docs/plans/2026-06-15-stillpending-completion-design.md`.
**Architecture:** Risk-first. M1 wires already-implemented-but-never-started central actors + fills the site event-log categories. M2 corrects behavioral gaps with targeted, test-first edits. M3 replaces the fake script "compiler" with a real Roslyn compile + semantic forbidden-API enforcement. M4 is doc-only reconciliation. Each task is independently shippable; spec + code + tests + deploy travel together (CLAUDE.md).
**Tech Stack:** C#/.NET 10, Akka.NET 1.5 (cluster singletons), EF Core 10 (MS SQL + SQLite), gRPC (sitestream.proto), Roslyn (`Microsoft.CodeAnalysis.CSharp.Scripting` — already a referenced package), xUnit + FluentAssertions + NSubstitute/Moq, Blazor Server.
**Build/test commands (repo-wide):**
- Build: `dotnet build ZB.MOM.WW.ScadaBridge.slnx`
- Test one project: `dotnet test tests/<Project>/<Project>.csproj`
- Filter one test: `dotnet test tests/<Project>/<Project>.csproj --filter "FullyQualifiedName~<TestName>"`
- Cluster runtime change: rebuild image with `bash docker/deploy.sh`
---
## Discovery (verified before planning)
- `AuditLogPurgeActor` (`src/ZB.MOM.WW.ScadaBridge.AuditLog/Central/AuditLogPurgeActor.cs`) and `SiteAuditReconciliationActor` (`.../Central/SiteAuditReconciliationActor.cs`) are **fully implemented** — timers, cursors, stalled detection, per-row retry budget, error isolation. They are simply **never instantiated**. M1 wires them; it does not write them.
- `IPullAuditEventsClient` (`.../Central/IPullAuditEventsClient.cs`) has a documented NoOp default and no production gRPC implementation. The reconciliation actor runs harmlessly against the NoOp today.
- Central singletons are created in `AkkaHostedService.RegisterCentralActors` (`src/ZB.MOM.WW.ScadaBridge.Host/Actors/AkkaHostedService.cs:~592`), using the `ClusterSingletonManager` + `ClusterSingletonProxy` pattern (see AuditLogIngest `:460`, SiteCallAudit `:524`). This is the insertion point for M1.1/M1.2.
- `AddAuditLogCentralMaintenance` (`.../AuditLog/ServiceCollectionExtensions.cs:319`) already registers the partition-roll-*forward* hosted service + central health snapshot; it does NOT register the purge/reconciliation **actors** (those need the ActorSystem, so they're Host-wired, not DI).
- `ISiteEventLogger.LogEventAsync(eventType, severity, instanceId, source, message, details?)` (`.../SiteEventLogging/ISiteEventLogger.cs`) is the emit surface. Only `connection` + `script` (error-path) categories are emitted today.
- `SiteCallAuditActor` (`.../SiteCallAudit/SiteCallAuditActor.cs:24-34`) explicitly defers its reconciliation puller + purge scheduler; `SiteCallAuditRepository.PurgeTerminalAsync` (`.../ConfigurationDatabase/Repositories/SiteCallAuditRepository.cs:213`) exists but is never invoked.
**Open risk flagged for M1.1:** whether `sitestream.proto` already exposes a server-streaming/unary `PullAuditEvents` RPC and a site-side handler reading `ISiteAuditQueue.ReadPendingSinceAsync`. Task M1.1 begins by confirming this; if absent, the proto + site handler are added first (sub-tasks M1.1a/b). Do not assume.
---
# Milestone M1 — Runtime wiring (Tier 1 #3, #4, #5, #6)
Wire behavior that exists but never starts, and fill the event-log categories.
### Task M1.0: Confirm proto/site surface for audit pull (spike)
**Classification:** trivial (investigation; no production code)
**Estimated implement time:** ~4 min
**Parallelizable with:** M1.5, M1.6, M1.7, M1.8 (event-logging tasks)
**Files:**
- Read: `src/ZB.MOM.WW.ScadaBridge.Communication/**/sitestream.proto` (or wherever the proto lives — `find . -name "*.proto"`)
- Read: `src/ZB.MOM.WW.ScadaBridge.Commons/Interfaces/Services/ISiteAuditQueue.cs` (`ReadPendingSinceAsync`)
- Read: `src/ZB.MOM.WW.ScadaBridge.Commons/Messages/Integration/` (`PullAuditEventsResponse`)
**Step 1:** `grep -rn "PullAuditEvents\|rpc Pull" --include=*.proto --include=*.cs src` — determine whether a site-side pull RPC + handler exist.
**Step 2:** Record the finding at the top of M1.1 (one of: "RPC exists → client-only", or "RPC missing → add proto + site handler first"). This decides whether M1.1 is 1 task or 3.
**Step 3:** No commit (investigation only); update this plan's M1.1 scope note.
---
### Task M1.1: Production `IPullAuditEventsClient` (gRPC)
**Classification:** high-risk (data contract + cross-cluster gRPC)
**Estimated implement time:** ~5 min (client only; +2 tasks if proto/site handler missing — see M1.0)
**Parallelizable with:** M1.5–M1.8
**Files:**
- Create: `src/ZB.MOM.WW.ScadaBridge.AuditLog/Central/GrpcPullAuditEventsClient.cs`
- Modify: `src/ZB.MOM.WW.ScadaBridge.AuditLog/ServiceCollectionExtensions.cs` (replace the NoOp `IPullAuditEventsClient` binding inside the central path — likely a new `AddAuditLogCentralReconciliationClient` helper to keep the "every Add* is safe from any root" invariant)
- Test: `tests/ZB.MOM.WW.ScadaBridge.AuditLog.Tests/Central/GrpcPullAuditEventsClientTests.cs`
**Approach:** Implement `PullAsync(siteId, sinceUtc, batchSize, ct)` by resolving the per-site channel via the existing `SiteStreamGrpcClientFactory` (mirror `SiteStreamGrpcClient`), calling the site `PullAuditEvents` RPC, and mapping the proto reply into `PullAuditEventsResponse` (`Events` oldest-first + `MoreAvailable`). MUST NOT throw on tolerable transport faults (connection refused, deadline exceeded) — catch and return an empty response (per the interface contract docstring).
**Step 1:** Write failing test: a `PullAsync` against a stubbed site channel returns mapped events ordered oldest-first and surfaces `MoreAvailable`; a connection-refused fault returns an empty response (no throw).
**Step 2:** Run: `dotnet test tests/ZB.MOM.WW.ScadaBridge.AuditLog.Tests/ZB.MOM.WW.ScadaBridge.AuditLog.Tests.csproj --filter "FullyQualifiedName~GrpcPullAuditEventsClient"` → FAIL.
**Step 3:** Implement the client (+ DI helper). If M1.0 found the RPC missing, do M1.1a (add `rpc PullAuditEvents` to the proto + regenerate) and M1.1b (site handler reading `ISiteAuditQueue.ReadPendingSinceAsync`) first.
**Step 4:** Run the filter → PASS; then full project test.
**Step 5:** Commit: `feat(audit): production gRPC IPullAuditEventsClient for site reconciliation`.
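The "must not throw on tolerable transport faults" contract can be sketched as follows. This is a minimal illustration, not the real client: `AuditEventDto`, `SitePullRpc`, and `TolerantPullClient` are hypothetical stand-ins, the real `PullAuditEventsResponse` shape may differ, and it assumes the `Grpc.Core` API surface where a refused connection surfaces as `StatusCode.Unavailable`.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;
using Grpc.Core;

// Hypothetical stand-ins; the real contract types live in Commons and may differ.
public sealed record AuditEventDto(string Id, DateTime OccurredUtc);

public sealed record PullAuditEventsResponse(IReadOnlyList<AuditEventDto> Events, bool MoreAvailable)
{
    public static readonly PullAuditEventsResponse Empty =
        new(Array.Empty<AuditEventDto>(), false);
}

// Hypothetical delegate standing in for the generated site RPC call.
public delegate Task<PullAuditEventsResponse> SitePullRpc(
    string siteId, DateTime sinceUtc, int batchSize, CancellationToken ct);

public sealed class TolerantPullClient
{
    private readonly SitePullRpc _rpc;

    public TolerantPullClient(SitePullRpc rpc) => _rpc = rpc;

    public async Task<PullAuditEventsResponse> PullAsync(
        string siteId, DateTime sinceUtc, int batchSize, CancellationToken ct)
    {
        try
        {
            var reply = await _rpc(siteId, sinceUtc, batchSize, ct).ConfigureAwait(false);

            // Re-sort defensively so callers always see events oldest-first.
            var ordered = reply.Events.OrderBy(e => e.OccurredUtc).ToList();
            return new PullAuditEventsResponse(ordered, reply.MoreAvailable);
        }
        catch (RpcException ex) when (
            ex.StatusCode is StatusCode.Unavailable or StatusCode.DeadlineExceeded)
        {
            // Tolerable transport fault: report "nothing pulled" instead of throwing.
            return PullAuditEventsResponse.Empty;
        }
    }
}
```

Other status codes still propagate in this sketch; whether any of them should also be treated as tolerable is a decision for the interface docstring, not something settled here.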
---
### Task M1.2: Wire `SiteAuditReconciliationActor` + `AuditLogPurgeActor` as central singletons
**Classification:** high-risk (actor model, cluster singleton, runtime)
**Estimated implement time:** ~5 min
**Parallelizable with:** M1.5–M1.8
**Depends on:** M1.1 (real client) — though wiring works against the NoOp too, so M1.2 can land first and M1.1 swaps the binding.
**Files:**
- Modify: `src/ZB.MOM.WW.ScadaBridge.Host/Actors/AkkaHostedService.cs` (in `RegisterCentralActors`, after the SiteCallAudit singleton block ~`:589`)
- Modify: `src/ZB.MOM.WW.ScadaBridge.Host/Program.cs` (ensure `AddAuditLogCentralMaintenance` + the reconciliation client helper are called on the central path — confirm against existing call site)
- Test: `tests/ZB.MOM.WW.ScadaBridge.Host.Tests/` (singleton-registration assertion) and/or `tests/ZB.MOM.WW.ScadaBridge.IntegrationTests/`
**Approach:** Mirror the SiteCallAudit singleton pattern (`:524-573`): build `ClusterSingletonManager.Props(Props.Create(() => new SiteAuditReconciliationActor(...)))` resolving `ISiteEnumerator`, `IPullAuditEventsClient`, root `IServiceProvider`, `IOptions<SiteAuditReconciliationOptions>`, logger from `_serviceProvider`; same for `AuditLogPurgeActor` (resolves `IServiceProvider`, `IOptions<AuditLogPurgeOptions>`, `IOptions<AuditLogOptions>`, logger). Register both on the central role with distinct singleton names (`site-audit-reconciliation`, `audit-log-purge`). Add CoordinatedShutdown graceful-stop hooks mirroring `:550-567`. Verify `SiteAuditReconciliationOptions`/`AuditLogPurgeOptions` are bound in the central composition root (add to `AddAuditLogCentralMaintenance` if not).
**Step 1:** Write failing test asserting the central host registers `audit-log-purge` and `site-audit-reconciliation` singleton managers (resolve actor selection or assert via a test probe in the integration harness).
**Step 2:** Run the test → FAIL.
**Step 3:** Implement the wiring; bind the options sections.
**Step 4:** Run test → PASS; `dotnet build ZB.MOM.WW.ScadaBridge.slnx`.
**Step 5:** Commit: `feat(audit): start AuditLogPurgeActor + SiteAuditReconciliationActor central singletons`.
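A rough sketch of the manager-plus-proxy pattern the approach refers to, using the Akka.NET `Akka.Cluster.Tools.Singleton` API. The helper name `CentralSingletonWiring`, the `-manager`/`-proxy` actor-name suffixes, and the use of `PoisonPill` as the termination message are assumptions for illustration; the existing SiteCallAudit block in `AkkaHostedService` is the authoritative template.

```csharp
using Akka.Actor;
using Akka.Cluster.Tools.Singleton;

public static class CentralSingletonWiring
{
    // Starts a singleton manager on the given role and returns a proxy to it.
    // Callers send messages to the proxy, never to the singleton directly.
    public static IActorRef StartSingleton(
        ActorSystem system, Props singletonProps, string singletonName, string role = "central")
    {
        var managerSettings = ClusterSingletonManagerSettings.Create(system)
            .WithRole(role)
            .WithSingletonName(singletonName);

        system.ActorOf(
            ClusterSingletonManager.Props(singletonProps, PoisonPill.Instance, managerSettings),
            name: $"{singletonName}-manager");

        var proxySettings = ClusterSingletonProxySettings.Create(system)
            .WithRole(role)
            .WithSingletonName(singletonName);

        return system.ActorOf(
            ClusterSingletonProxy.Props($"/user/{singletonName}-manager", proxySettings),
            name: $"{singletonName}-proxy");
    }
}
```

Under these assumptions, the two registrations would be two calls with the names `site-audit-reconciliation` and `audit-log-purge`. The sketch omits the CoordinatedShutdown hooks the task also requires.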
---
### Task M1.3: Site Call Audit — periodic reconciliation pull
**Classification:** high-risk (actor + EF cursor read + cross-cluster)
**Estimated implement time:** ~5 min (may split: cursor-read repo method vs actor scheduler)
**Parallelizable with:** M1.5–M1.8
**Files:**
- Modify: `src/ZB.MOM.WW.ScadaBridge.SiteCallAudit/SiteCallAuditActor.cs` (add a `ReconciliationTick` self-schedule + per-site pull, mirroring `SiteAuditReconciliationActor`)
- Modify: `src/ZB.MOM.WW.ScadaBridge.ConfigurationDatabase/Repositories/SiteCallAuditRepository.cs` (+ interface) — add a "changed-since cursor" read if absent
- Modify: `SiteCallAuditOptions` — add reconciliation interval + batch size
- Test: `tests/ZB.MOM.WW.ScadaBridge.SiteCallAudit.Tests/`
**Approach:** Add a timer-driven reconciliation pull that asks each site for `SiteCallOperational` rows changed since a per-site cursor and upserts them idempotently (`UpsertAsync` is already monotonic). Reuse the gRPC pull surface from M1.1 if it can carry SiteCall operational state; otherwise add a sibling pull. **Sub-decision (record at implement time):** confirm whether `PullAuditEventsResponse` carries SiteCall operational rows (the audit found it does not) — if not, extend the telemetry/pull contract additively or add a dedicated SiteCall pull RPC. Flag if this grows beyond ~300 LOC → split.
**Steps:** TDD as above (failing test: a row dropped from telemetry is back-filled on the next reconciliation tick) → implement → pass → build → commit `feat(sitecallaudit): periodic reconciliation pull back-fills lost telemetry`.
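The idempotent back-fill with a per-site cursor can be pictured with an in-memory sketch. Everything here is hypothetical (`SiteCallRow`, its `Version` field, `ReconciliationStore`); it only illustrates why a monotonic upsert makes a repeated pull harmless, and says nothing about how the real `UpsertAsync` decides ordering.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical row shape; Version is an assumed monotonic counter.
public sealed record SiteCallRow(string CallId, long Version, DateTime ChangedUtc, string State);

public sealed class ReconciliationStore
{
    private readonly Dictionary<string, SiteCallRow> _rows = new();
    private readonly Dictionary<string, DateTime> _cursorBySite = new();

    // The "changed since" value to send on the next pull for this site.
    public DateTime CursorFor(string siteId) =>
        _cursorBySite.TryGetValue(siteId, out var cursor) ? cursor : DateTime.MinValue;

    // Applies one pulled batch and returns how many rows were written.
    // Rows whose version is not newer than the stored one are ignored,
    // so replaying the same batch writes nothing.
    public int ApplyBatch(string siteId, IEnumerable<SiteCallRow> batch)
    {
        var applied = 0;
        var cursor = CursorFor(siteId);

        foreach (var row in batch)
        {
            if (!_rows.TryGetValue(row.CallId, out var existing) || row.Version > existing.Version)
            {
                _rows[row.CallId] = row;
                applied++;
            }

            // Advance the cursor even for ignored rows so they are not re-pulled forever.
            if (row.ChangedUtc > cursor) cursor = row.ChangedUtc;
        }

        _cursorBySite[siteId] = cursor;
        return applied;
    }
}
```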
---
### Task M1.4: Site Call Audit — daily terminal-row purge scheduler
**Classification:** standard (actor scheduler + existing repo method)
**Estimated implement time:** ~3 min
**Parallelizable with:** M1.5–M1.8
**Files:**
- Modify: `src/ZB.MOM.WW.ScadaBridge.SiteCallAudit/SiteCallAuditActor.cs` (add a `PurgeTick` timer invoking `ISiteCallAuditRepository.PurgeTerminalAsync`)
- Modify: `SiteCallAuditOptions` — add purge interval + retention days (default 365)
- Test: `tests/ZB.MOM.WW.ScadaBridge.SiteCallAudit.Tests/`
**Approach:** Mirror `AuditLogPurgeActor`'s `PurgeTick` cadence; resolve the scoped repo per tick; continue-on-error. **Step 1:** failing test: terminal rows older than retention are purged on tick. **Steps 2–5:** implement → pass → build → commit `feat(sitecallaudit): daily terminal-row purge scheduler`.
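A timer-driven, continue-on-error tick might look roughly like the following, using Akka.NET's `IWithTimers`. This is a sketch only: `TerminalRowPurgeScheduler` and the `purgeOlderThanAsync` delegate are hypothetical, and the real change adds the tick to `SiteCallAuditActor` and resolves a scoped `ISiteCallAuditRepository` per tick rather than taking a delegate.

```csharp
using System;
using System.Threading.Tasks;
using Akka.Actor;
using Akka.Event;

public sealed class TerminalRowPurgeScheduler : ReceiveActor, IWithTimers
{
    private sealed class PurgeTick
    {
        public static readonly PurgeTick Instance = new();
    }

    private readonly ILoggingAdapter _log = Context.GetLogger();
    private readonly TimeSpan _interval;

    // Assigned by Akka.NET because the actor implements IWithTimers.
    public ITimerScheduler Timers { get; set; } = null!;

    public TerminalRowPurgeScheduler(
        TimeSpan interval, int retentionDays, Func<DateTime, Task<int>> purgeOlderThanAsync)
    {
        _interval = interval;

        ReceiveAsync<PurgeTick>(async _ =>
        {
            try
            {
                var cutoff = DateTime.UtcNow.AddDays(-retentionDays);
                var purged = await purgeOlderThanAsync(cutoff);
                _log.Info("Purged {0} terminal rows older than {1:O}", purged, cutoff);
            }
            catch (Exception ex)
            {
                // Continue-on-error: log the failure and let the next tick run.
                _log.Warning(ex, "Terminal-row purge tick failed");
            }
        });
    }

    protected override void PreStart() =>
        Timers.StartPeriodicTimer("purge-tick", PurgeTick.Instance, _interval);
}
```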
---
### Task M1.5: Site Event Logging — Alarm events
**Classification:** standard
**Estimated implement time:** ~4 min
**Parallelizable with:** M1.0–M1.4, M1.6, M1.7, M1.8
**Files:**
- Modify: `src/ZB.MOM.WW.ScadaBridge.SiteRuntime/Actors/AlarmActor.cs` and `.../NativeAlarmActor.cs` (inject `ISiteEventLogger`; emit `alarm` events on raise/clear/ack transitions)
- Test: `tests/ZB.MOM.WW.ScadaBridge.SiteRuntime.Tests/`
**Approach:** Add the `ISiteEventLogger` dependency (confirm how SiteRuntime actors receive DI services — via `Props`/factory, mirror `DataConnectionActor`/`ScriptExecutionActor` which already log). Emit `LogEventAsync("alarm", severity, instanceId, source, message, details)` on state transitions. TDD: failing test asserts an alarm transition produces an `alarm` row → implement → pass → build → commit `feat(siteeventlog): emit alarm events`.
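Emitting one `alarm` event per state transition could be sketched as below. The `ISiteEventLogger` declaration here is a local stand-in whose parameter types are assumed (the plan only gives the parameter names), and `AlarmState`, `AlarmTransitionReporter`, and the severity mapping are hypothetical.

```csharp
using System.Threading.Tasks;

// Local stand-in; parameter types are assumptions based on the names in the plan.
public interface ISiteEventLogger
{
    Task LogEventAsync(string eventType, string severity, string? instanceId,
        string source, string message, string? details = null);
}

public enum AlarmState { Normal, Active, Acknowledged }

public sealed class AlarmTransitionReporter
{
    private readonly ISiteEventLogger _logger;
    private AlarmState _state = AlarmState.Normal;

    public AlarmTransitionReporter(ISiteEventLogger logger) => _logger = logger;

    // Logs only real transitions; returns false when the state did not change.
    public async Task<bool> MoveToAsync(AlarmState next, string instanceId, string alarmName)
    {
        if (next == _state) return false;

        var previous = _state;
        _state = next;

        // Assumed mapping: raising an alarm is a Warning, clear/ack are Info.
        var severity = next == AlarmState.Active ? "Warning" : "Info";
        await _logger.LogEventAsync(
            "alarm", severity, instanceId, alarmName, $"{previous} -> {next}");
        return true;
    }
}
```

In a test, a recording fake of the interface is enough to assert that a raise produces exactly one `alarm` row and that a repeated state produces none.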
---
### Task M1.6: Site Event Logging — Deployment + Instance-lifecycle events
**Classification:** standard
**Estimated implement time:** ~4 min
**Parallelizable with:** M1.5, M1.7, M1.8
**Files:**
- Modify: `src/ZB.MOM.WW.ScadaBridge.SiteRuntime/Actors/DeploymentManagerActor.cs` (+ InstanceActor if instance-lifecycle lives there) — emit `deployment` + `instance_lifecycle` events on deploy/enable/disable/delete/apply.
- Test: `tests/ZB.MOM.WW.ScadaBridge.SiteRuntime.Tests/` (or DeploymentManager.Tests)
**Approach + steps:** as M1.5; commit `feat(siteeventlog): emit deployment + instance-lifecycle events`.
---
### Task M1.7: Site Event Logging — Store-and-Forward + Notification events
**Classification:** standard
**Estimated implement time:** ~4 min
**Parallelizable with:** M1.5, M1.6, M1.8
**Files:**
- Modify: Store-and-Forward engine (`src/ZB.MOM.WW.ScadaBridge.StoreAndForward/...StoreAndForwardService.cs`) — emit `store_and_forward` events on buffer/retry/park.
- Modify: site notification forwarding path — emit `notification` events (note: delivery is central; the site event is "forwarded"/"queued").
- Test: relevant `*.Tests`
**Approach + steps:** as M1.5; **add `notification` to the `eventType` doc enum** in `ISiteEventLogger` (the XML doc currently lists 6 categories and omits `notification`). Commit `feat(siteeventlog): emit store-and-forward + notification events`.
---
### Task M1.8: Site Event Logging — script started/completed (Info)
**Classification:** small
**Estimated implement time:** ~3 min
**Parallelizable with:** M1.5–M1.7
**Files:**
- Modify: `src/ZB.MOM.WW.ScadaBridge.SiteRuntime/Actors/ScriptExecutionActor.cs:238-256`, `ScriptActor.cs:368` — add `Info` started/completed events alongside the existing `Error` events.
- Test: `tests/ZB.MOM.WW.ScadaBridge.SiteRuntime.Tests/`
**Steps:** failing test asserts a successful script run emits started + completed Info rows → implement → pass → build → commit `feat(siteeventlog): emit script started/completed events`.
---
### Task M1.9: M1 integration verification + redeploy
**Classification:** standard (no new logic; end-to-end proof)
**Estimated implement time:** ~5 min
**Parallelizable with:** none (gate for M1)
**Depends on:** M1.1–M1.8
**Steps:**
1. `dotnet build ZB.MOM.WW.ScadaBridge.slnx` → 0 errors.
2. `dotnet test` across `AuditLog.Tests`, `SiteCallAudit.Tests`, `SiteRuntime.Tests`, `IntegrationTests` → green (MSSQL-gated tests may skip; note skips).
3. `bash docker/deploy.sh`; confirm cluster healthy (`curl -s localhost:9000/health/ready`).
4. Spot-check logs for "Purged AuditLog partition" / reconciliation ticks / new event-log categories.
5. Commit (if any fixture cleanup): `test(m1): integration verification of audit wiring + event-log categories`.
---
# Milestone M2 — Correctness & behavioral gaps (Tier 2)
> Each task is TDD (write failing behavior test → implement → pass → commit). Tasks are right-sized; where the exact edit spans a method, the implementer reads the cited file:line first (the `Files:` block is the contract). Full code is specified at implement time against the live file — do not pre-fabricate.
### Task M2.1: `Database.CachedWrite` — immediate attempt + synchronous permanent-`Failed`
**Classification:** high-risk (trust-boundary behavior) · **~5 min** · **Parallelizable with:** M2.2–M2.13
**Files:** `src/ZB.MOM.WW.ScadaBridge.ExternalSystemGateway/DatabaseGateway.cs:78-204`; test `tests/ZB.MOM.WW.ScadaBridge.ExternalSystemGateway.Tests/`.
**AC:** mirror `ExternalSystemClient.cs:100-161` — attempt immediately; permanent SQL error throws `PermanentExternalSystemException`-equivalent synchronously (`Failed`), only transient errors buffer. Failing test: a permanent SQL error returns `Failed` synchronously and is NOT enqueued.
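The intended control flow (attempt first, buffer only transient failures, report permanent failures synchronously) can be sketched like this. All names are hypothetical, and the sketch returns an outcome value rather than throwing; whether the real gateway throws a `PermanentExternalSystemException`-equivalent or returns `Failed` is left to the implementation.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public enum WriteOutcome { Succeeded, Buffered, Failed }

// Hypothetical classification exceptions; the real code would classify SQL errors.
public sealed class TransientWriteException : Exception { }
public sealed class PermanentWriteException : Exception { }

public sealed class CachedWriter
{
    private readonly Func<string, Task> _execute;
    private readonly Queue<string> _buffer = new();

    public CachedWriter(Func<string, Task> execute) => _execute = execute;

    public int BufferedCount => _buffer.Count;

    public async Task<WriteOutcome> WriteAsync(string command)
    {
        try
        {
            // Attempt immediately instead of enqueueing unconditionally.
            await _execute(command);
            return WriteOutcome.Succeeded;
        }
        catch (TransientWriteException)
        {
            // Only transient faults are buffered for later retry.
            _buffer.Enqueue(command);
            return WriteOutcome.Buffered;
        }
        catch (PermanentWriteException)
        {
            // Permanent faults are reported synchronously and never enqueued.
            return WriteOutcome.Failed;
        }
    }
}
```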
### Task M2.2: Apply alarm `conditionFilter`
**Classification:** high-risk (DCL + OPC UA filter) · **~5 min** · **Parallelizable with:** others
**Files:** `RealOpcUaClient.cs:242,295` (build `WhereClause` from filter), `DataConnectionActor.cs:1482,1540-1554` (honor filter in routing), `MxGatewayDataConnection.cs:154-167`; tests in DCL test project.
**AC:** a non-null `conditionFilter` mirrors only matching conditions. Failing test: filtered subscription drops non-matching conditions.
### Task M2.3: Per-script execution timeout
**Classification:** standard · **~5 min** · split if model change is large
**Files:** `TemplateScript` entity (+ flattened config type), `SiteRuntimeOptions.cs:31`, `ScriptExecutionActor.cs:100`, `AlarmExecutionActor.cs:66`; tests.
**AC:** a per-script timeout overrides the global default; falls back when unset.
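The override-with-fallback rule is small enough to sketch directly. `ScriptTimeouts` and its members are hypothetical names, and treating a non-positive per-script value as "unset" is an assumption of this sketch, not something the plan specifies.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class ScriptTimeouts
{
    // Per-script value wins when present and positive; otherwise use the global default.
    public static TimeSpan Resolve(TimeSpan? perScript, TimeSpan globalDefault) =>
        perScript is { } value && value > TimeSpan.Zero ? value : globalDefault;

    // Runs the body with a token that is cancelled after the timeout
    // or when the caller's token is cancelled, whichever comes first.
    public static async Task<T> RunWithTimeoutAsync<T>(
        Func<CancellationToken, Task<T>> body, TimeSpan timeout, CancellationToken outer)
    {
        using var cts = CancellationTokenSource.CreateLinkedTokenSource(outer);
        cts.CancelAfter(timeout);
        return await body(cts.Token);
    }
}
```

Cancellation here is cooperative: the body has to observe the token for the timeout to take effect.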
### Task M2.4: Connection-level diff surfaces in deployment diff
**Classification:** standard · **~5 min**
**Files:** `Commons/Types/Flattening/ConfigurationDiff.cs:7-24` (add `ConnectionChanges` slot + `HasChanges`), `DiffService.cs:45-53` (call `ComputeConnectionsDiff` from `ComputeDiff`), CentralUI diff component; tests.
**AC:** a connection endpoint/protocol/failover change appears in the diff and UI.
### Task M2.5: `MachineDataDb` fail-fast
**Classification:** standard · **~3 min**
**Files:** `DatabaseOptions.cs:6-12` (add property), `StartupValidator.cs:60-61` (validate non-empty on central), DbContext only if consumed; tests.
**AC:** central node fails startup with a clear message when `MachineDataDb` is empty.
### Task M2.6: CI grep-guard against `UPDATE/DELETE … AuditLog`
**Classification:** small · **~3 min**
**Files:** new build script (e.g. `tools/check-auditlog-append-only.sh`) + a `.csproj`/CI hook or an MSBuild target; a test that the guard trips on a planted violation.
**AC:** build/CI fails if a data-layer file contains `UPDATE`/`DELETE` against `AuditLog`.
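The plan suggests a shell script for the guard; the same check expressed in C# (for example inside the test that plants a violation) might look like the sketch below. `AuditLogAppendOnlyGuard` and the exact pattern are assumptions, and a plain regex will also match text inside comments or string literals, which may or may not be desirable.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class AuditLogAppendOnlyGuard
{
    // Matches e.g. "UPDATE AuditLog", "UPDATE [dbo].[AuditLog]", "DELETE FROM AuditLog".
    // The trailing \b keeps names such as AuditLogArchive from matching.
    private static readonly Regex Violation = new(
        @"\b(UPDATE|DELETE(\s+FROM)?)\s+(\[?\w+\]?\.)?\[?AuditLog\]?\b",
        RegexOptions.IgnoreCase | RegexOptions.Compiled);

    // Returns 1-based line numbers and text of every offending line.
    public static IReadOnlyList<(int Line, string Text)> FindViolations(string source)
    {
        var hits = new List<(int, string)>();
        var lines = source.Split('\n');
        for (var i = 0; i < lines.Length; i++)
        {
            if (Violation.IsMatch(lines[i]))
                hits.Add((i + 1, lines[i].Trim()));
        }
        return hits;
    }
}
```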
### Task M2.7: LDAP periodic re-query on session refresh
**Classification:** high-risk (security/session) · **~5 min**
**Files:** `JwtTokenService` (`RefreshToken`/`ShouldRefresh`/`IsIdleTimedOut` — wire callers), or an `OnValidatePrincipal` cookie-revalidation hook in `Security/ServiceCollectionExtensions.cs`; tests with adversarial coverage.
**AC:** interactive session roles are re-queried from LDAP at the refresh boundary (never >15 min stale); LDAP failure leaves the active session on prior roles (per spec).
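The refresh-boundary behavior in the AC (re-query at most every 15 minutes, keep prior roles when LDAP is unreachable) can be sketched as follows. `RoleRefresher` and the LDAP query delegate are hypothetical; the real hook would live in `JwtTokenService` or the cookie revalidation event.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public sealed class RoleRefresher
{
    private static readonly TimeSpan MaxStaleness = TimeSpan.FromMinutes(15);
    private readonly Func<string, CancellationToken, Task<IReadOnlyList<string>>> _queryLdap;

    public RoleRefresher(Func<string, CancellationToken, Task<IReadOnlyList<string>>> queryLdap) =>
        _queryLdap = queryLdap;

    // True once the roles are at least MaxStaleness old.
    public static bool ShouldRefresh(DateTime lastQueriedUtc, DateTime nowUtc) =>
        nowUtc - lastQueriedUtc >= MaxStaleness;

    // Returns fresh roles, or the prior roles when the LDAP query fails.
    public async Task<IReadOnlyList<string>> RefreshAsync(
        string user, IReadOnlyList<string> priorRoles, CancellationToken ct)
    {
        try
        {
            return await _queryLdap(user, ct);
        }
        catch (Exception) when (!ct.IsCancellationRequested)
        {
            // LDAP failure leaves the active session on its prior roles.
            return priorRoles;
        }
    }
}
```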
### Tasks M2.8–M2.13: low-severity batch (one task each, `small`/`standard`, mostly parallelizable)
- **M2.8** Return-type compatibility check — `SemanticValidator.cs:62-63,279-287` (wire the built `*ReturnMap` into a comparison).
- **M2.9** Argument *type* compatibility — `SemanticValidator.cs:251-266,390-425` (parse + compare arg types, not just count).
- **M2.10** Native-alarm-source capability validation wired into deploy pipeline — `SemanticValidator.cs:239-245`, `FlatteningPipeline.cs:93,115` (supply `alarmCapableConnectionNames`).
- **M2.11** Binding-completeness as deploy-gating Error (+ "name exists at site") — `ValidationService.cs:504-519`, `ValidationResult.cs:9`.
- **M2.12** Debug snapshot/subscribe error for unknown instance — `DeploymentManagerActor.cs:845-866` (return error reply, not empty snapshot).
- **M2.13** Misc: recursion-limit → site event log (`ScriptRuntimeContext.cs:302-305,464-466`); debug-stream ordering + timestamp-dedup replay (`DebugStreamBridgeActor.cs:89-103`); OPC UA transition field population (`RealOpcUaClient.cs:395-403`); readiness "required singletons" probe (`Program.cs:188-201`); register SiteEventLog active-node purge gate (`SiteEventLogging/ServiceCollectionExtensions.cs:33-37`); consume `FailedWriteCount` in Health Monitoring (`ISiteEventLogger.cs:32-40`); reconcile `StateTransitionValidator` delete-from-`NotDeployed` (`StateTransitionValidator.cs:38-39`). **Split each into its own task at execution time** — they are independent and parallelizable; grouped here only to keep the plan readable.
### Task M2.14: nested `Object`/`List` type validation (Inbound API, from M4 disposition)
**Classification:** standard · **~5 min**
**Files:** `ParameterValidator.cs:109-145`, `ReturnValueValidator.cs:18`; tests.
**AC:** nested field/element types are validated, not just `ValueKind`.
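Validating nested element and field types, rather than only the top-level `ValueKind`, amounts to a recursive walk. The sketch below uses `System.Text.Json`; the `TypeSpec` hierarchy and `NestedTypeValidator` are hypothetical and do not reflect how `ParameterValidator` models declared types.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Text.Json;

// Hypothetical declared-type model.
public abstract record TypeSpec;
public sealed record ScalarSpec(JsonValueKind Kind) : TypeSpec;
public sealed record ListSpec(TypeSpec Element) : TypeSpec;
public sealed record ObjectSpec(IReadOnlyDictionary<string, TypeSpec> Fields) : TypeSpec;

public static class NestedTypeValidator
{
    public static bool IsValid(JsonElement value, TypeSpec spec) => spec switch
    {
        // Scalars: the kind must match exactly.
        ScalarSpec s => value.ValueKind == s.Kind,

        // Lists: must be an array and every element must match the element spec.
        ListSpec l => value.ValueKind == JsonValueKind.Array
            && value.EnumerateArray().All(e => IsValid(e, l.Element)),

        // Objects: every declared field must be present and valid.
        ObjectSpec o => value.ValueKind == JsonValueKind.Object
            && o.Fields.All(f => value.TryGetProperty(f.Key, out var p) && IsValid(p, f.Value)),

        _ => false
    };
}
```

Booleans need extra care in a real implementation because `JsonValueKind` has separate `True` and `False` members; this sketch does not handle that, nor does it reject undeclared extra fields.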
---
# Milestone M3 — Script trust boundary (Tier 1 #1, #2)
### Task M3.1: Real Roslyn compile in `ScriptCompiler.TryCompile`
**Classification:** high-risk (validation gate, security) · **~5 min** (may split: compile vs diagnostics mapping)
**Files:** `src/ZB.MOM.WW.ScadaBridge.TemplateEngine/Validation/ScriptCompiler.cs:56-104`, `.csproj` (confirm `Microsoft.CodeAnalysis.CSharp.Scripting` ref — it's in `Directory.Packages.props`), `ValidationService.cs:128`; tests.
**AC:** semantically-invalid C# (undefined symbol, type error) fails validation with a useful diagnostic; valid scripts pass. Failing test: a script referencing an undefined symbol fails to validate.
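A minimal sketch of a real compile step with the `Microsoft.CodeAnalysis.CSharp.Scripting` package: create the script, call `Compile()`, and keep the error-severity diagnostics. `ScriptCompileCheck` is a hypothetical name, and the references and imports chosen here are assumptions; the real `TryCompile` would need whatever globals type and references the site runtime exposes to scripts.

```csharp
using System.Collections.Generic;
using System.Linq;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp.Scripting;
using Microsoft.CodeAnalysis.Scripting;

public static class ScriptCompileCheck
{
    // Returns one formatted message per compile error; empty means the script compiled.
    public static IReadOnlyList<string> GetErrors(string code)
    {
        var options = ScriptOptions.Default
            .AddReferences(typeof(object).Assembly)
            .AddImports("System");

        var script = CSharpScript.Create(code, options);

        // Compile() binds the script without running it and returns all diagnostics.
        return script.Compile()
            .Where(d => d.Severity == DiagnosticSeverity.Error)
            .Select(d => d.ToString())
            .ToList();
    }
}
```

Each diagnostic string carries the location, the compiler error ID, and the message, which is usually enough to surface as a validation error.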
### Task M3.2: Semantic forbidden-API enforcement (replace substring scan)
**Classification:** high-risk (security boundary) · **~5 min**
**Files:** `ScriptCompiler.cs:14-22,61-72`, coordinate with Site Runtime sandbox; tests with an adversarial corpus.
**AC:** alias / `using static` / `global::` paths to a forbidden API are detected via Roslyn symbol resolution and block deploy. Every script in the adversarial bypass corpus must be rejected at deploy.
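Symbol-based detection, as opposed to a substring scan, can be sketched by resolving every name in the script through the semantic model and comparing the resolved type against a deny-list. `ForbiddenApiScan` and the two deny-listed types are hypothetical; the real list comes from `ScriptCompiler` and the Site Runtime sandbox.

```csharp
using System.Collections.Generic;
using System.Linq;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp.Scripting;
using Microsoft.CodeAnalysis.CSharp.Syntax;
using Microsoft.CodeAnalysis.Scripting;

public static class ForbiddenApiScan
{
    // Hypothetical deny-list, keyed by fully qualified type name.
    private static readonly HashSet<string> ForbiddenTypes = new()
    {
        "System.IO.File",
        "System.IO.Directory",
    };

    public static IReadOnlyList<string> FindForbiddenUses(string code)
    {
        var options = ScriptOptions.Default.AddReferences(typeof(object).Assembly);
        var compilation = CSharpScript.Create(code, options).GetCompilation();
        var hits = new List<string>();

        foreach (var tree in compilation.SyntaxTrees)
        {
            var model = compilation.GetSemanticModel(tree);
            foreach (var name in tree.GetRoot().DescendantNodes().OfType<SimpleNameSyntax>())
            {
                // Resolve the name to a symbol, so aliases and `using static`
                // imports are seen as the type they actually refer to.
                var symbol = model.GetSymbolInfo(name).Symbol;
                var type = symbol as INamedTypeSymbol ?? symbol?.ContainingType;
                var fullName = type?.ToDisplayString();

                if (fullName != null && ForbiddenTypes.Contains(fullName))
                    hits.Add($"{fullName} at {name.GetLocation().GetLineSpan().StartLinePosition}");
            }
        }

        return hits;
    }
}
```

One use can be reported more than once (for the type name and for the member), which is harmless for a deploy gate. The sketch does not cover reflection-based access or names that fail to bind, so it is not a complete sandbox on its own.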
### Task M3.3: Real compile for shared scripts
**Classification:** standard · **~4 min**
**Files:** `SharedScriptService.cs:168-206`; tests. **AC:** shared scripts get the same semantic compile.
### Task M3.4: Fixture cleanup + verification
**Classification:** standard · **~5 min** · **Depends on:** M3.1–M3.3
**Steps:** real compile may flag latent-invalid scripts in existing templates/test fixtures — fix them; run `dotnet test` for TemplateEngine + integration; commit.
---
# Milestone M4 — Doc reconciliation (Tier 4, doc-only, parallelizable)
> All `trivial`/`small`, doc-only, no test impact (except M2.14 which is code, planned in M2). Each is one commit.
### Task M4.1: Config-DB / Commons spec re-architecture
**Files:** `docs/requirements/Component-ConfigurationDatabase.md` (collapsed `AuditLog` schema → canonical + `DetailsJson` + computed cols), `Component-Commons.md` (`AuditEvent` → `ZB.MOM.WW.Audit` package, `ApiKey` retirement, undocumented types/interfaces, `SiteCall` field names). **AC:** spec matches `AuditLogRow.cs` + migrations.
### Task M4.2: Inbound API + Security + Notification spec drift
**Files:** `Component-InboundAPI.md` (Bearer auth, fire-and-forget audit timing, type validation now built per M2.14), `Component-Security.md` (cookie-only session model, role names Administrator/Designer/Deployer/Viewer), `Component-NotificationService.md` / Commons (`NotificationType` is Email-only; Teams future), `AuditKind` vocabulary (`ApiInbound.Completed` → `InboundRequest`/`InboundAuthFailure`; `ExecuteReader` → `DbWrite`).
### Task M4.3: CLI docs
**Files:** `src/ZB.MOM.WW.ScadaBridge.CLI/README.md` + `docs/requirements/Component-CLI.md` — document the `bundle` group; fix README option drift (`api-key create --methods`, `--key-id` vs `--id`, `api-method create --script`, `db-connection` no `--provider`, `set-methods`, scope-rule/health/template option names, `audit query` options). **AC:** every documented command/option matches a registered one.
### Task M4.4: Clear stale "deferred"/"no-op" markers for shipped features
**Files:** comments in `SiteCallAudit/ServiceCollectionExtensions.cs:11-13` (relay shipped), `BundleImporter.cs:28-30`, `AuditingDbCommand.cs:466-467,512` + `ScriptRuntimeContext.cs:1808` (M5 redaction shipped), `AuditLogPage.razor.cs:16-18` (HandleRowSelected wired), Transport design doc §13 (CLI shipped), `requirements-traceability.md` "Pending" rows, and the stale `.tasks.json` notes. **AC:** no comment claims "deferred/not implemented" for a shipped feature.
---
## Native tasks & dependencies
M1 tasks are created as native tasks (CLI-visible). M2/M3/M4 remain under the existing umbrella tasks (#13/#14/#15) and are broken into per-item native tasks at the start of each milestone's execution (especially the M2.8–M2.13 split). Dependency edges: M1.9 ⟵ M1.1–M1.8; M5 (#16) ⟵ M1 (#12), already set.
## Notes / risks carried forward
- **M1.1 proto risk** (M1.0 resolves it) — if the site pull RPC is missing, M1.1 grows by two sub-tasks (proto + site handler).
- **M1.3 contract risk** — `PullAuditEventsResponse` likely does not carry SiteCall operational state; an additive contract extension or a dedicated SiteCall pull may be needed. Confirm at implement time; split if >300 LOC.
- **M3.1 fixture risk** — real compile may surface latent-invalid existing scripts (budgeted as M3.4).
- Keep `git diff` review before each commit; rebuild the image (`bash docker/deploy.sh`) for any M1 runtime change before claiming done.