From 9aa125950434f9b102918bec8dd8f53a1c73979c Mon Sep 17 00:00:00 2001
From: Joseph Doherty
Date: Mon, 15 Jun 2026 09:32:14 -0400
Subject: [PATCH] docs(plans): Phase 1 (M1-M4) implementation plan for stillpending.md

Bite-sized TDD plan. M1 (runtime wiring) fully detailed across 10 tasks
after verifying the purge/reconciliation actors already exist and only
need Host wiring + a gRPC pull client + event-logger injection. M2/M3/M4
as right-sized task inventories with files, classification, and AC.
Co-located .tasks.json for executing-plans resume.
---
 ...6-15-stillpending-phase1-implementation.md | 304 ++++++++++++++++++
 ...ending-phase1-implementation.md.tasks.json |  19 ++
 2 files changed, 323 insertions(+)
 create mode 100644 docs/plans/2026-06-15-stillpending-phase1-implementation.md
 create mode 100644 docs/plans/2026-06-15-stillpending-phase1-implementation.md.tasks.json

diff --git a/docs/plans/2026-06-15-stillpending-phase1-implementation.md b/docs/plans/2026-06-15-stillpending-phase1-implementation.md
new file mode 100644
index 00000000..4f7d8305
--- /dev/null
+++ b/docs/plans/2026-06-15-stillpending-phase1-implementation.md
@@ -0,0 +1,304 @@
+# stillpending.md Phase 1 (M1–M4) Implementation Plan
+
+> **For Claude:** REQUIRED SUB-SKILL: Use superpowers-extended-cc:executing-plans to implement this plan task-by-task.
+
+**Goal:** Stabilize ScadaBridge — make the Tier-1 silent gaps actually run, correct the Tier-2 behavioral divergences, and reconcile the Tier-4 doc↔code drift, per `docs/plans/2026-06-15-stillpending-completion-design.md`.
+
+**Architecture:** Risk-first. M1 wires already-implemented-but-never-started central actors + fills the site event-log categories. M2 corrects behavioral gaps with targeted, test-first edits. M3 replaces the fake script "compiler" with a real Roslyn compile + semantic forbidden-API enforcement. M4 is doc-only reconciliation. Each task is independently shippable; spec + code + tests + deploy travel together (CLAUDE.md).
+
+**Tech Stack:** C#/.NET 10, Akka.NET 1.5 (cluster singletons), EF Core 10 (MS SQL + SQLite), gRPC (sitestream.proto), Roslyn (`Microsoft.CodeAnalysis.CSharp.Scripting` — already a referenced package), xUnit + FluentAssertions + NSubstitute/Moq, Blazor Server.
+
+**Build/test commands (repo-wide):**
+- Build: `dotnet build ZB.MOM.WW.ScadaBridge.slnx`
+- Test one project: `dotnet test tests//.csproj`
+- Filter one test: `dotnet test tests//.csproj --filter "FullyQualifiedName~"`
+- Cluster runtime change: rebuild image with `bash docker/deploy.sh`
+
+---
+
+## Discovery (verified before planning)
+
+- `AuditLogPurgeActor` (`src/ZB.MOM.WW.ScadaBridge.AuditLog/Central/AuditLogPurgeActor.cs`) and `SiteAuditReconciliationActor` (`.../Central/SiteAuditReconciliationActor.cs`) are **fully implemented** — timers, cursors, stalled detection, per-row retry budget, error isolation. They are simply **never instantiated**. M1 wires them; it does not write them.
+- `IPullAuditEventsClient` (`.../Central/IPullAuditEventsClient.cs`) has a documented NoOp default and no production gRPC implementation. The reconciliation actor runs harmlessly against the NoOp today.
+- Central singletons are created in `AkkaHostedService.RegisterCentralActors` (`src/ZB.MOM.WW.ScadaBridge.Host/Actors/AkkaHostedService.cs:~592`), using the `ClusterSingletonManager` + `ClusterSingletonProxy` pattern (see AuditLogIngest `:460`, SiteCallAudit `:524`). This is the insertion point for M1.1/M1.2.
+- `AddAuditLogCentralMaintenance` (`.../AuditLog/ServiceCollectionExtensions.cs:319`) already registers the partition-roll-*forward* hosted service + central health snapshot; it does NOT register the purge/reconciliation **actors** (those need the ActorSystem, so they're Host-wired, not DI).
+- `ISiteEventLogger.LogEventAsync(eventType, severity, instanceId, source, message, details?)` (`.../SiteEventLogging/ISiteEventLogger.cs`) is the emit surface.
Only `connection` + `script` (error-path) categories are emitted today.
+- `SiteCallAuditActor` (`.../SiteCallAudit/SiteCallAuditActor.cs:24-34`) explicitly defers its reconciliation puller + purge scheduler; `SiteCallAuditRepository.PurgeTerminalAsync` (`.../ConfigurationDatabase/Repositories/SiteCallAuditRepository.cs:213`) exists but is never invoked.
+
+**Open risk flagged for M1.1:** whether `sitestream.proto` already exposes a server-streaming/unary `PullAuditEvents` RPC and a site-side handler reading `ISiteAuditQueue.ReadPendingSinceAsync`. Task M1.1 begins by confirming this; if absent, the proto + site handler are added first (sub-tasks M1.1a/b). Do not assume.
+
+---
+
+# Milestone M1 — Runtime wiring (Tier 1 #3, #4, #5, #6)
+
+Wire behavior that exists but never starts, and fill the event-log categories.
+
+### Task M1.0: Confirm proto/site surface for audit pull (spike)
+
+**Classification:** trivial (investigation; no production code)
+**Estimated implement time:** ~4 min
+**Parallelizable with:** M1.5, M1.6, M1.7, M1.8 (event-logging tasks)
+
+**Files:**
+- Read: `src/ZB.MOM.WW.ScadaBridge.Communication/**/sitestream.proto` (or wherever the proto lives — `find . -name "*.proto"`)
+- Read: `src/ZB.MOM.WW.ScadaBridge.Commons/Interfaces/Services/ISiteAuditQueue.cs` (`ReadPendingSinceAsync`)
+- Read: `src/ZB.MOM.WW.ScadaBridge.Commons/Messages/Integration/` (`PullAuditEventsResponse`)
+
+**Step 1:** `grep -rn "PullAuditEvents\|rpc Pull" --include=*.proto --include=*.cs src` — determine whether a site-side pull RPC + handler exist.
+**Step 2:** Record the finding at the top of M1.1 (one of: "RPC exists → client-only", or "RPC missing → add proto + site handler first"). This decides whether M1.1 is 1 task or 3.
+**Step 3:** No commit (investigation only); update this plan's M1.1 scope note.
+
+---
+
+### Task M1.1: Production `IPullAuditEventsClient` (gRPC)
+
+**Classification:** high-risk (data contract + cross-cluster gRPC)
+**Estimated implement time:** ~5 min (client only; +2 tasks if proto/site handler missing — see M1.0)
+**Parallelizable with:** M1.5–M1.8
+
+**Files:**
+- Create: `src/ZB.MOM.WW.ScadaBridge.AuditLog/Central/GrpcPullAuditEventsClient.cs`
+- Modify: `src/ZB.MOM.WW.ScadaBridge.AuditLog/ServiceCollectionExtensions.cs` (replace the NoOp `IPullAuditEventsClient` binding inside the central path — likely a new `AddAuditLogCentralReconciliationClient` helper to keep the "every Add* is safe from any root" invariant)
+- Test: `tests/ZB.MOM.WW.ScadaBridge.AuditLog.Tests/Central/GrpcPullAuditEventsClientTests.cs`
+
+**Approach:** Implement `PullAsync(siteId, sinceUtc, batchSize, ct)` by resolving the per-site channel via the existing `SiteStreamGrpcClientFactory` (mirror `SiteStreamGrpcClient`), calling the site `PullAuditEvents` RPC, and mapping the proto reply into `PullAuditEventsResponse` (`Events` oldest-first + `MoreAvailable`). MUST NOT throw on tolerable transport faults (connection refused, deadline exceeded) — catch and return an empty response (per the interface contract docstring).
+
+**Step 1:** Write failing test: a `PullAsync` against a stubbed site channel returns mapped events ordered oldest-first and surfaces `MoreAvailable`; a connection-refused fault returns an empty response (no throw).
+**Step 2:** Run: `dotnet test tests/ZB.MOM.WW.ScadaBridge.AuditLog.Tests/ZB.MOM.WW.ScadaBridge.AuditLog.Tests.csproj --filter "FullyQualifiedName~GrpcPullAuditEventsClient"` → FAIL.
+**Step 3:** Implement the client (+ DI helper). If M1.0 found the RPC missing, do M1.1a (add `rpc PullAuditEvents` to the proto + regenerate) and M1.1b (site handler reading `ISiteAuditQueue.ReadPendingSinceAsync`) first.
+**Step 4:** Run the filter → PASS; then full project test.
+**Step 5:** Commit: `feat(audit): production gRPC IPullAuditEventsClient for site reconciliation`.
+
+---
+
+### Task M1.2: Wire `SiteAuditReconciliationActor` + `AuditLogPurgeActor` as central singletons
+
+**Classification:** high-risk (actor model, cluster singleton, runtime)
+**Estimated implement time:** ~5 min
+**Parallelizable with:** M1.5–M1.8
+**Depends on:** M1.1 (real client) — though wiring works against the NoOp too, so M1.2 can land first and M1.1 swaps the binding.
+
+**Files:**
+- Modify: `src/ZB.MOM.WW.ScadaBridge.Host/Actors/AkkaHostedService.cs` (in `RegisterCentralActors`, after the SiteCallAudit singleton block ~`:589`)
+- Modify: `src/ZB.MOM.WW.ScadaBridge.Host/Program.cs` (ensure `AddAuditLogCentralMaintenance` + the reconciliation client helper are called on the central path — confirm against existing call site)
+- Test: `tests/ZB.MOM.WW.ScadaBridge.Host.Tests/` (singleton-registration assertion) and/or `tests/ZB.MOM.WW.ScadaBridge.IntegrationTests/`
+
+**Approach:** Mirror the SiteCallAudit singleton pattern (`:524-573`): build `ClusterSingletonManager.Props(Props.Create(() => new SiteAuditReconciliationActor(...)))` resolving `ISiteEnumerator`, `IPullAuditEventsClient`, root `IServiceProvider`, `IOptions`, logger from `_serviceProvider`; same for `AuditLogPurgeActor` (resolves `IServiceProvider`, `IOptions`, `IOptions`, logger). Register both on the central role with distinct singleton names (`site-audit-reconciliation`, `audit-log-purge`). Add CoordinatedShutdown graceful-stop hooks mirroring `:550-567`. Verify `SiteAuditReconciliationOptions`/`AuditLogPurgeOptions` are bound in the central composition root (add to `AddAuditLogCentralMaintenance` if not).
+
+**Step 1:** Write failing test asserting the central host registers `audit-log-purge` and `site-audit-reconciliation` singleton managers (resolve actor selection or assert via a test probe in the integration harness).
+**Step 2:** Run the test → FAIL.
+**Step 3:** Implement the wiring; bind the options sections.
+**Step 4:** Run test → PASS; `dotnet build ZB.MOM.WW.ScadaBridge.slnx`.
+**Step 5:** Commit: `feat(audit): start AuditLogPurgeActor + SiteAuditReconciliationActor central singletons`.
+
+---
+
+### Task M1.3: Site Call Audit — periodic reconciliation pull
+
+**Classification:** high-risk (actor + EF cursor read + cross-cluster)
+**Estimated implement time:** ~5 min (may split: cursor-read repo method vs actor scheduler)
+**Parallelizable with:** M1.5–M1.8
+
+**Files:**
+- Modify: `src/ZB.MOM.WW.ScadaBridge.SiteCallAudit/SiteCallAuditActor.cs` (add a `ReconciliationTick` self-schedule + per-site pull, mirroring `SiteAuditReconciliationActor`)
+- Modify: `src/ZB.MOM.WW.ScadaBridge.ConfigurationDatabase/Repositories/SiteCallAuditRepository.cs` (+ interface) — add a "changed-since cursor" read if absent
+- Modify: `SiteCallAuditOptions` — add reconciliation interval + batch size
+- Test: `tests/ZB.MOM.WW.ScadaBridge.SiteCallAudit.Tests/`
+
+**Approach:** Add a timer-driven reconciliation pull that asks each site for `SiteCallOperational` rows changed since a per-site cursor and upserts them idempotently (`UpsertAsync` is already monotonic). Reuse the gRPC pull surface from M1.1 if it can carry SiteCall operational state; otherwise add a sibling pull. **Sub-decision (record at implement time):** confirm whether `PullAuditEventsResponse` carries SiteCall operational rows (the audit found it does not) — if not, extend the telemetry/pull contract additively or add a dedicated SiteCall pull RPC. Flag if this grows beyond ~300 LOC → split.
+**Steps:** TDD as above (failing test: a row dropped from telemetry is back-filled on the next reconciliation tick) → implement → pass → build → commit `feat(sitecallaudit): periodic reconciliation pull back-fills lost telemetry`.
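+
+The cursor-plus-idempotent-upsert idea in the M1.3 approach can be illustrated with a small in-memory sketch. Everything in it is hypothetical: `SiteCallRow`, `ReconciliationSketch`, and the field names are invented for illustration, and "monotonic" is assumed here to mean that a stored row is only replaced by a strictly newer one. The real `UpsertAsync` rule and the EF-backed cursor read may well differ.
+
+```csharp
+using System;
+using System.Collections.Generic;
+using System.Linq;
+
+// Hypothetical stand-in for a SiteCall operational row; not the real entity.
+public sealed record SiteCallRow(string Id, string SiteId, string Status, DateTime UpdatedUtc);
+
+public sealed class ReconciliationSketch
+{
+    private readonly Dictionary<string, SiteCallRow> _store = new();
+    private readonly Dictionary<string, DateTime> _cursors = new();
+
+    // Per-site cursor: the newest UpdatedUtc seen so far for that site.
+    public DateTime CursorFor(string siteId) =>
+        _cursors.TryGetValue(siteId, out var cursor) ? cursor : DateTime.MinValue;
+
+    // Assumed monotonic rule: keep the stored row unless the incoming one is strictly newer.
+    public bool Upsert(SiteCallRow incoming)
+    {
+        if (_store.TryGetValue(incoming.Id, out var existing) &&
+            existing.UpdatedUtc >= incoming.UpdatedUtc)
+        {
+            return false; // stale or duplicate row: no change
+        }
+
+        _store[incoming.Id] = incoming;
+        return true;
+    }
+
+    // Applies one pulled batch oldest-first and advances the site's cursor.
+    // Returns how many rows actually changed the store.
+    public int ApplyBatch(string siteId, IEnumerable<SiteCallRow> batch)
+    {
+        var applied = 0;
+        var cursor = CursorFor(siteId);
+
+        foreach (var row in batch.OrderBy(r => r.UpdatedUtc))
+        {
+            if (Upsert(row)) applied++;
+            if (row.UpdatedUtc > cursor) cursor = row.UpdatedUtc;
+        }
+
+        _cursors[siteId] = cursor;
+        return applied;
+    }
+}
+```
+
+Under these assumptions, replaying the same batch applies zero rows and leaves the cursor where it was, which is the property that makes a repeated pull after a transport fault harmless.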
+
+---
+
+### Task M1.4: Site Call Audit — daily terminal-row purge scheduler
+
+**Classification:** standard (actor scheduler + existing repo method)
+**Estimated implement time:** ~3 min
+**Parallelizable with:** M1.5–M1.8
+
+**Files:**
+- Modify: `src/ZB.MOM.WW.ScadaBridge.SiteCallAudit/SiteCallAuditActor.cs` (add a `PurgeTick` timer invoking `ISiteCallAuditRepository.PurgeTerminalAsync`)
+- Modify: `SiteCallAuditOptions` — add purge interval + retention days (default 365)
+- Test: `tests/ZB.MOM.WW.ScadaBridge.SiteCallAudit.Tests/`
+
+**Approach:** Mirror `AuditLogPurgeActor`'s `PurgeTick` cadence; resolve the scoped repo per tick; continue-on-error. **Step 1:** failing test — terminal rows older than retention are purged on tick. **Steps 2–5:** implement → pass → build → commit `feat(sitecallaudit): daily terminal-row purge scheduler`.
+
+---
+
+### Task M1.5: Site Event Logging — Alarm events
+
+**Classification:** standard
+**Estimated implement time:** ~4 min
+**Parallelizable with:** M1.0–M1.4, M1.6, M1.7, M1.8
+
+**Files:**
+- Modify: `src/ZB.MOM.WW.ScadaBridge.SiteRuntime/Actors/AlarmActor.cs` and `.../NativeAlarmActor.cs` (inject `ISiteEventLogger`; emit `alarm` events on raise/clear/ack transitions)
+- Test: `tests/ZB.MOM.WW.ScadaBridge.SiteRuntime.Tests/`
+
+**Approach:** Add the `ISiteEventLogger` dependency (confirm how SiteRuntime actors receive DI services — via `Props`/factory, mirror `DataConnectionActor`/`ScriptExecutionActor` which already log). Emit `LogEventAsync("alarm", severity, instanceId, source, message, details)` on state transitions. TDD: failing test asserts an alarm transition produces an `alarm` row → implement → pass → build → commit `feat(siteeventlog): emit alarm events`.
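+
+As a rough illustration of "emit on state transitions" from the M1.5 approach, the sketch below logs one `alarm` event per actual state change and nothing when the state repeats. The `ISiteEventLogger` declared here is a local stand-in: its parameter names follow the signature quoted in Discovery, but the `string` parameter types, the `AlarmState` enum, the `"Warning"`/`"Info"` severities, and `AlarmTransitionEmitter` itself are assumptions, not the real actor code.
+
+```csharp
+#nullable enable
+using System.Collections.Generic;
+using System.Threading.Tasks;
+
+// Local stand-in; parameter names come from the plan, the string types are assumed.
+public interface ISiteEventLogger
+{
+    Task LogEventAsync(string eventType, string severity, string instanceId,
+                       string source, string message, string? details = null);
+}
+
+// Test double that records rows instead of writing to a site event log.
+public sealed class RecordingEventLogger : ISiteEventLogger
+{
+    public List<(string EventType, string Severity, string Message)> Rows { get; } = new();
+
+    public Task LogEventAsync(string eventType, string severity, string instanceId,
+                              string source, string message, string? details = null)
+    {
+        Rows.Add((eventType, severity, message));
+        return Task.CompletedTask;
+    }
+}
+
+// Hypothetical alarm states covering raise / ack / clear.
+public enum AlarmState { Normal, Active, Acknowledged }
+
+public sealed class AlarmTransitionEmitter
+{
+    private readonly ISiteEventLogger _logger;
+    private readonly string _instanceId;
+    private readonly string _source;
+    private AlarmState _state = AlarmState.Normal;
+
+    public AlarmTransitionEmitter(ISiteEventLogger logger, string instanceId, string source)
+    {
+        _logger = logger;
+        _instanceId = instanceId;
+        _source = source;
+    }
+
+    // Emits exactly one "alarm" event when the state changes; repeats are ignored.
+    public async Task ApplyAsync(AlarmState next)
+    {
+        if (next == _state) return;
+
+        var previous = _state;
+        _state = next;
+
+        // Assumed severity mapping: a raise is a warning, ack/clear are informational.
+        var severity = next == AlarmState.Active ? "Warning" : "Info";
+        await _logger.LogEventAsync("alarm", severity, _instanceId, _source,
+            $"Alarm {previous} -> {next}");
+    }
+}
+```
+
+A test in the spirit of the task's TDD step would drive `ApplyAsync(AlarmState.Active)` twice against a `RecordingEventLogger` and expect a single `alarm` row.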
+
+---
+
+### Task M1.6: Site Event Logging — Deployment + Instance-lifecycle events
+
+**Classification:** standard
+**Estimated implement time:** ~4 min
+**Parallelizable with:** M1.5, M1.7, M1.8
+
+**Files:**
+- Modify: `src/ZB.MOM.WW.ScadaBridge.SiteRuntime/Actors/DeploymentManagerActor.cs` (+ InstanceActor if instance-lifecycle lives there) — emit `deployment` + `instance_lifecycle` events on deploy/enable/disable/delete/apply.
+- Test: `tests/ZB.MOM.WW.ScadaBridge.SiteRuntime.Tests/` (or DeploymentManager.Tests)
+
+**Approach + steps:** as M1.5; commit `feat(siteeventlog): emit deployment + instance-lifecycle events`.
+
+---
+
+### Task M1.7: Site Event Logging — Store-and-Forward + Notification events
+
+**Classification:** standard
+**Estimated implement time:** ~4 min
+**Parallelizable with:** M1.5, M1.6, M1.8
+
+**Files:**
+- Modify: Store-and-Forward engine (`src/ZB.MOM.WW.ScadaBridge.StoreAndForward/...StoreAndForwardService.cs`) — emit `store_and_forward` events on buffer/retry/park.
+- Modify: site notification forwarding path — emit `notification` events (note: delivery is central; the site event is "forwarded"/"queued").
+- Test: relevant `*.Tests`
+
+**Approach + steps:** as M1.5; **add `notification` to the `eventType` doc enum** in `ISiteEventLogger` (the XML doc currently lists 6 categories, not notification). Commit `feat(siteeventlog): emit store-and-forward + notification events`.
+
+---
+
+### Task M1.8: Site Event Logging — script started/completed (Info)
+
+**Classification:** small
+**Estimated implement time:** ~3 min
+**Parallelizable with:** M1.5–M1.7
+
+**Files:**
+- Modify: `src/ZB.MOM.WW.ScadaBridge.SiteRuntime/Actors/ScriptExecutionActor.cs:238-256`, `ScriptActor.cs:368` — add `Info` started/completed events alongside the existing `Error` events.
+- Test: `tests/ZB.MOM.WW.ScadaBridge.SiteRuntime.Tests/`
+
+**Steps:** failing test asserts a successful script run emits started + completed Info rows → implement → pass → build → commit `feat(siteeventlog): emit script started/completed events`.
+
+---
+
+### Task M1.9: M1 integration verification + redeploy
+
+**Classification:** standard (no new logic; end-to-end proof)
+**Estimated implement time:** ~5 min
+**Parallelizable with:** none (gate for M1)
+**Depends on:** M1.1–M1.8
+
+**Steps:**
+1. `dotnet build ZB.MOM.WW.ScadaBridge.slnx` → 0 errors.
+2. `dotnet test` across `AuditLog.Tests`, `SiteCallAudit.Tests`, `SiteRuntime.Tests`, `IntegrationTests` → green (MSSQL-gated tests may skip; note skips).
+3. `bash docker/deploy.sh`; confirm cluster healthy (`curl -s localhost:9000/health/ready`).
+4. Spot-check logs for "Purged AuditLog partition" / reconciliation ticks / new event-log categories.
+5. Commit (if any fixture cleanup): `test(m1): integration verification of audit wiring + event-log categories`.
+
+---
+
+# Milestone M2 — Correctness & behavioral gaps (Tier 2)
+
+> Each task is TDD (write failing behavior test → implement → pass → commit). Tasks are right-sized; where the exact edit spans a method, the implementer reads the cited file:line first (the `Files:` block is the contract). Full code is specified at implement time against the live file — do not pre-fabricate.
+
+### Task M2.1: `Database.CachedWrite` — immediate attempt + synchronous permanent-`Failed`
+**Classification:** high-risk (trust-boundary behavior) · **~5 min** · **Parallelizable with:** M2.2–M2.13
+**Files:** `src/ZB.MOM.WW.ScadaBridge.ExternalSystemGateway/DatabaseGateway.cs:78-204`; test `tests/ZB.MOM.WW.ScadaBridge.ExternalSystemGateway.Tests/`.
+**AC:** mirror `ExternalSystemClient.cs:100-161` — attempt immediately; permanent SQL error throws `PermanentExternalSystemException`-equivalent synchronously (`Failed`), only transient errors buffer.
Failing test: a permanent SQL error returns `Failed` synchronously and is NOT enqueued.
+
+### Task M2.2: Apply alarm `conditionFilter`
+**Classification:** high-risk (DCL + OPC UA filter) · **~5 min** · **Parallelizable with:** others
+**Files:** `RealOpcUaClient.cs:242,295` (build `WhereClause` from filter), `DataConnectionActor.cs:1482,1540-1554` (honor filter in routing), `MxGatewayDataConnection.cs:154-167`; tests in DCL test project.
+**AC:** a non-null `conditionFilter` mirrors only matching conditions. Failing test: filtered subscription drops non-matching conditions.
+
+### Task M2.3: Per-script execution timeout
+**Classification:** standard · **~5 min** · split if model change is large
+**Files:** `TemplateScript` entity (+ flattened config type), `SiteRuntimeOptions.cs:31`, `ScriptExecutionActor.cs:100`, `AlarmExecutionActor.cs:66`; tests.
+**AC:** a per-script timeout overrides the global default; falls back when unset.
+
+### Task M2.4: Connection-level diff surfaces in deployment diff
+**Classification:** standard · **~5 min**
+**Files:** `Commons/Types/Flattening/ConfigurationDiff.cs:7-24` (add `ConnectionChanges` slot + `HasChanges`), `DiffService.cs:45-53` (call `ComputeConnectionsDiff` from `ComputeDiff`), CentralUI diff component; tests.
+**AC:** a connection endpoint/protocol/failover change appears in the diff and UI.
+
+### Task M2.5: `MachineDataDb` fail-fast
+**Classification:** standard · **~3 min**
+**Files:** `DatabaseOptions.cs:6-12` (add property), `StartupValidator.cs:60-61` (validate non-empty on central), DbContext only if consumed; tests.
+**AC:** central node fails startup with a clear message when `MachineDataDb` is empty.
+
+### Task M2.6: CI grep-guard against `UPDATE/DELETE … AuditLog`
+**Classification:** small · **~3 min**
+**Files:** new build script (e.g. `tools/check-auditlog-append-only.sh`) + a `.csproj`/CI hook or an MSBuild target; a test that the guard trips on a planted violation.
+**AC:** build/CI fails if a data-layer file contains `UPDATE`/`DELETE` against `AuditLog`.
+
+### Task M2.7: LDAP periodic re-query on session refresh
+**Classification:** high-risk (security/session) · **~5 min**
+**Files:** `JwtTokenService` (`RefreshToken`/`ShouldRefresh`/`IsIdleTimedOut` — wire callers), or an `OnValidatePrincipal` cookie-revalidation hook in `Security/ServiceCollectionExtensions.cs`; tests with adversarial coverage.
+**AC:** interactive session roles are re-queried from LDAP at the refresh boundary (never >15 min stale); LDAP failure leaves the active session on prior roles (per spec).
+
+### Tasks M2.8–M2.13: low-severity batch (one task each, `small`/`standard`, mostly parallelizable)
+- **M2.8** Return-type compatibility check — `SemanticValidator.cs:62-63,279-287` (wire the built `*ReturnMap` into a comparison).
+- **M2.9** Argument *type* compatibility — `SemanticValidator.cs:251-266,390-425` (parse + compare arg types, not just count).
+- **M2.10** Native-alarm-source capability validation wired into deploy pipeline — `SemanticValidator.cs:239-245`, `FlatteningPipeline.cs:93,115` (supply `alarmCapableConnectionNames`).
+- **M2.11** Binding-completeness as deploy-gating Error (+ "name exists at site") — `ValidationService.cs:504-519`, `ValidationResult.cs:9`.
+- **M2.12** Debug snapshot/subscribe error for unknown instance — `DeploymentManagerActor.cs:845-866` (return error reply, not empty snapshot).
+- **M2.13** Misc: recursion-limit → site event log (`ScriptRuntimeContext.cs:302-305,464-466`); debug-stream ordering + timestamp-dedup replay (`DebugStreamBridgeActor.cs:89-103`); OPC UA transition field population (`RealOpcUaClient.cs:395-403`); readiness "required singletons" probe (`Program.cs:188-201`); register SiteEventLog active-node purge gate (`SiteEventLogging/ServiceCollectionExtensions.cs:33-37`); consume `FailedWriteCount` in Health Monitoring (`ISiteEventLogger.cs:32-40`); reconcile `StateTransitionValidator` delete-from-`NotDeployed` (`StateTransitionValidator.cs:38-39`). **Split each into its own task at execution time** — they are independent and parallelizable; grouped here only to keep the plan readable.
+
+### Task M2.14: nested `Object`/`List` type validation (Inbound API, from M4 disposition)
+**Classification:** standard · **~5 min**
+**Files:** `ParameterValidator.cs:109-145`, `ReturnValueValidator.cs:18`; tests.
+**AC:** nested field/element types are validated, not just `ValueKind`.
+
+---
+
+# Milestone M3 — Script trust boundary (Tier 1 #1, #2)
+
+### Task M3.1: Real Roslyn compile in `ScriptCompiler.TryCompile`
+**Classification:** high-risk (validation gate, security) · **~5 min** (may split: compile vs diagnostics mapping)
+**Files:** `src/ZB.MOM.WW.ScadaBridge.TemplateEngine/Validation/ScriptCompiler.cs:56-104`, `.csproj` (confirm `Microsoft.CodeAnalysis.CSharp.Scripting` ref — it's in `Directory.Packages.props`), `ValidationService.cs:128`; tests.
+**AC:** semantically-invalid C# (undefined symbol, type error) fails validation with a useful diagnostic; valid scripts pass. Failing test: a script referencing an undefined symbol fails to validate.
+
+### Task M3.2: Semantic forbidden-API enforcement (replace substring scan)
+**Classification:** high-risk (security boundary) · **~5 min**
+**Files:** `ScriptCompiler.cs:14-22,61-72`, coordinate with Site Runtime sandbox; tests with an adversarial corpus.
+**AC:** alias / `using static` / `global::` paths to a forbidden API are detected via Roslyn symbol resolution and block deploy. Adversarial bypass tests must FAIL to deploy.
+
+### Task M3.3: Real compile for shared scripts
+**Classification:** standard · **~4 min**
+**Files:** `SharedScriptService.cs:168-206`; tests. **AC:** shared scripts get the same semantic compile.
+
+### Task M3.4: Fixture cleanup + verification
+**Classification:** standard · **~5 min** · **Depends on:** M3.1–M3.3
+**Steps:** real compile may flag latent-invalid scripts in existing templates/test fixtures — fix them; run `dotnet test` for TemplateEngine + integration; commit.
+
+---
+
+# Milestone M4 — Doc reconciliation (Tier 4, doc-only, parallelizable)
+
+> All `trivial`/`small`, doc-only, no test impact (except M2.14 which is code, planned in M2). Each is one commit.
+
+### Task M4.1: Config-DB / Commons spec re-architecture
+**Files:** `docs/requirements/Component-ConfigurationDatabase.md` (collapsed `AuditLog` schema → canonical + `DetailsJson` + computed cols), `Component-Commons.md` (`AuditEvent`→`ZB.MOM.WW.Audit` package, `ApiKey` retirement, undocumented types/interfaces, `SiteCall` field names). **AC:** spec matches `AuditLogRow.cs` + migrations.
+
+### Task M4.2: Inbound API + Security + Notification spec drift
+**Files:** `Component-InboundAPI.md` (Bearer auth, fire-and-forget audit timing, type validation now built per M2.14), `Component-Security.md` (cookie-only session model, role names Administrator/Designer/Deployer/Viewer), `Component-NotificationService.md` / Commons (`NotificationType` is Email-only; Teams future), `AuditKind` vocabulary (`ApiInbound.Completed` → `InboundRequest`/`InboundAuthFailure`; `ExecuteReader`→`DbWrite`).
+
+### Task M4.3: CLI docs
+**Files:** `src/ZB.MOM.WW.ScadaBridge.CLI/README.md` + `docs/requirements/Component-CLI.md` — document the `bundle` group; fix README option drift (`api-key create --methods`, `--key-id` vs `--id`, `api-method create --script`, `db-connection` no `--provider`, `set-methods`, scope-rule/health/template option names, `audit query` options). **AC:** every documented command/option matches a registered one.
+
+### Task M4.4: Clear stale "deferred"/"no-op" markers for shipped features
+**Files:** comments in `SiteCallAudit/ServiceCollectionExtensions.cs:11-13` (relay shipped), `BundleImporter.cs:28-30`, `AuditingDbCommand.cs:466-467,512` + `ScriptRuntimeContext.cs:1808` (M5 redaction shipped), `AuditLogPage.razor.cs:16-18` (HandleRowSelected wired), Transport design doc §13 (CLI shipped), `requirements-traceability.md` "Pending" rows, and the stale `.tasks.json` notes. **AC:** no comment claims "deferred/not implemented" for a shipped feature.
+
+---
+
+## Native tasks & dependencies
+
+M1 tasks are created as native tasks (CLI-visible). M2/M3/M4 remain under the existing umbrella tasks (#13/#14/#15) and are broken into per-item native tasks at the start of each milestone's execution (especially the M2.8–M2.13 split). Dependency edges: M1.9 ⟵ M1.1–M1.8; M5 (#16) ⟵ M1 (#12), already set.
+
+## Notes / risks carried forward
+
+- **M1.1 proto risk** (M1.0 resolves it) — if the site pull RPC is missing, M1.1 grows by two sub-tasks (proto + site handler).
+- **M1.3 contract risk** — `PullAuditEventsResponse` likely does not carry SiteCall operational state; an additive contract extension or a dedicated SiteCall pull may be needed. Confirm at implement time; split if >300 LOC.
+- **M3.1 fixture risk** — real compile may surface latent-invalid existing scripts (budgeted as M3.4).
+- Keep `git diff` review before each commit; rebuild the image (`bash docker/deploy.sh`) for any M1 runtime change before claiming done.
diff --git a/docs/plans/2026-06-15-stillpending-phase1-implementation.md.tasks.json b/docs/plans/2026-06-15-stillpending-phase1-implementation.md.tasks.json
new file mode 100644
index 00000000..1283e8a5
--- /dev/null
+++ b/docs/plans/2026-06-15-stillpending-phase1-implementation.md.tasks.json
@@ -0,0 +1,19 @@
+{
+  "planPath": "docs/plans/2026-06-15-stillpending-phase1-implementation.md",
+  "tasks": [
+    {"id": 22, "subject": "M1.0: Confirm proto/site surface for audit pull (spike)", "status": "pending"},
+    {"id": 23, "subject": "M1.1: Production gRPC IPullAuditEventsClient", "status": "pending", "blockedBy": [22]},
+    {"id": 24, "subject": "M1.2: Wire reconciliation + purge actors as central singletons", "status": "pending"},
+    {"id": 25, "subject": "M1.3: SiteCallAudit periodic reconciliation pull", "status": "pending"},
+    {"id": 26, "subject": "M1.4: SiteCallAudit daily terminal-row purge scheduler", "status": "pending"},
+    {"id": 27, "subject": "M1.5: SiteEventLog — emit Alarm events", "status": "pending"},
+    {"id": 28, "subject": "M1.6: SiteEventLog — Deployment + Instance-lifecycle events", "status": "pending"},
+    {"id": 29, "subject": "M1.7: SiteEventLog — Store-and-Forward + Notification events", "status": "pending"},
+    {"id": 30, "subject": "M1.8: SiteEventLog — script started/completed (Info)", "status": "pending"},
+    {"id": 31, "subject": "M1.9: M1 integration verification + redeploy", "status": "pending", "blockedBy": [23, 24, 25, 26, 27, 28, 29, 30]},
+    {"id": 13, "subject": "M2 — Correctness & behavioral gaps (Tier 2) [umbrella; split per-item at execution]", "status": "pending"},
+    {"id": 14, "subject": "M3 — Script trust boundary (Tier 1 #1-#2) [umbrella]", "status": "pending"},
+    {"id": 15, "subject": "M4 — Doc reconciliation (Tier 4) [umbrella]", "status": "pending"}
+  ],
+  "lastUpdated": "2026-06-15"
+}