PR #94 closed the Phase 6.2 dispatch wiring blocker. Update the dashboard: - Status line: "two of three release blockers remain". - Release-blocker #1 section struck through + marked CLOSED with PR link. Remaining Stream C surfaces (Browse / Subscribe / Alarm / Call + finer- grained scope resolution) downgraded to hardening follow-ups — not release-blocking. - Change log: new dated entry. Two remaining blockers: Phase 6.1 Stream D config-cache wiring (task #136) + Phase 6.3 Streams A/C/F redundancy runtime (tasks #145, #147, #150). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
105 lines
8.3 KiB
Markdown
105 lines
8.3 KiB
Markdown
# v2 Release Readiness
|
||
|
||
> **Last updated**: 2026-04-19 (release blocker #1 closed — Phase 6.2 dispatch wiring shipped)
|
||
> **Status**: **NOT YET RELEASE-READY** — two of three release blockers remain (Phase 6.1 Stream D config-cache wiring + Phase 6.3 Streams A/C/F redundancy runtime).
|
||
|
||
This doc is the single view of where v2 stands against its release criteria. Update it whenever a deferred follow-up closes or a new release blocker is discovered.
|
||
|
||
## Release-readiness dashboard
|
||
|
||
| Phase | Shipped | Status |
|
||
|---|---|---|
|
||
| Phase 0 — Rename + entry gate | ✓ | Shipped |
|
||
| Phase 1 — Configuration + Admin scaffold | ✓ | Shipped (some UI items deferred to 6.4) |
|
||
| Phase 2 — Galaxy driver split (Proxy/Host/Shared) | ✓ | Shipped |
|
||
| Phase 3 — OPC UA server + LDAP + security profiles | ✓ | Shipped |
|
||
| Phase 4 — Redundancy scaffold (entities + endpoints) | ✓ | Shipped (runtime closes in 6.3) |
|
||
| Phase 5 — Drivers | ⚠ partial | Galaxy / Modbus / S7 / OpcUaClient shipped; AB CIP / AB Legacy / TwinCAT / FOCAS deferred (task #120) |
|
||
| Phase 6.1 — Resilience & Observability | ✓ | **SHIPPED** (PRs #78–83) |
|
||
| Phase 6.2 — Authorization runtime | ◐ core | **SHIPPED (core)** (PRs #84–88); dispatch wiring + Admin UI deferred |
|
||
| Phase 6.3 — Redundancy runtime | ◐ core | **SHIPPED (core)** (PRs #89–90); coordinator + UA-node wiring + Admin UI + interop deferred |
|
||
| Phase 6.4 — Admin UI completion | ◐ data layer | **SHIPPED (data layer)** (PRs #91–92); Blazor UI + OPC 40010 address-space wiring deferred |
|
||
|
||
**Aggregate test counts:** 906 baseline (pre-Phase-6) → **1159 passing** across Phase 6. One pre-existing Client.CLI `SubscribeCommandTests.Execute_PrintsSubscriptionMessage` flake tracked separately.
|
||
|
||
## Release blockers (must close before v2 GA)
|
||
|
||
Ordered by severity + impact on production fitness.
|
||
|
||
### ~~Security — Phase 6.2 dispatch wiring~~ (task #143 — **CLOSED** 2026-04-19, PR #94)
|
||
|
||
**Closed**. `AuthorizationGate` + `NodeScopeResolver` now thread through `OpcUaApplicationHost → OtOpcUaServer → DriverNodeManager`. `OnReadValue` + `OnWriteValue` + all four HistoryRead paths call `gate.IsAllowed(identity, operation, scope)` before the invoker. Production deployments activate enforcement by constructing `OpcUaApplicationHost` with an `AuthorizationGate(StrictMode: true)` + populating the `NodeAcl` table.
|
||
|
||
Additional Stream C surfaces (not release-blocking, hardening only):
|
||
|
||
- Browse + TranslateBrowsePathsToNodeIds gating with ancestor-visibility logic per `acl-design.md` §Browse.
|
||
- CreateMonitoredItems + TransferSubscriptions gating with per-item `(AuthGenerationId, MembershipVersion)` stamp so revoked grants surface `BadUserAccessDenied` within one publish cycle (decision #153).
|
||
- Alarm Acknowledge / Confirm / Shelve gating.
|
||
- Call (method invocation) gating.
|
||
- Finer-grained scope resolution — current `NodeScopeResolver` returns a flat cluster-level scope. Joining against the live Configuration DB to populate UnsArea / UnsLine / Equipment path is tracked as Stream C.12.
|
||
- 3-user integration matrix covering every operation × allow/deny.
|
||
|
||
These are additional hardening — the three highest-value surfaces (Read / Write / HistoryRead) are now gated, which covers the base-security gap for v2 GA.
|
||
|
||
### Config fallback — Phase 6.1 Stream D wiring (task #136)
|
||
|
||
`ResilientConfigReader` + `GenerationSealedCache` + `StaleConfigFlag` all exist but nothing consumes them. The `NodeBootstrap` path still uses the original single-file `LiteDbConfigCache` via `ILocalConfigCache`; `sp_PublishGeneration` doesn't call `GenerationSealedCache.SealAsync` after commit; the Configuration read services don't wrap queries in `ResilientConfigReader.ReadAsync`.
|
||
|
||
Closing this requires:
|
||
|
||
- `sp_PublishGeneration` (or its EF-side wrapper) calls `SealAsync` after successful commit.
|
||
- DriverInstance enumeration, LdapGroupRoleMapping fetches, cluster + namespace metadata reads route through `ResilientConfigReader.ReadAsync`.
|
||
- Integration test: SQL container kill mid-operation → serves sealed snapshot, `UsingStaleConfig` = true, driver stays Healthy, `/healthz` body reflects the flag.
|
||
|
||
### Redundancy — Phase 6.3 Streams A/C/F (tasks #145, #147, #150)
|
||
|
||
`ServiceLevelCalculator` + `RecoveryStateManager` + `ApplyLeaseRegistry` exist as pure logic. **No code invokes them at runtime.** The OPC UA server still publishes a static `ServiceLevel`; `ServerUriArray` still carries only self; no coordinator reads cluster topology; no peer probing.
|
||
|
||
Closing this requires:
|
||
|
||
- `RedundancyCoordinator` singleton reads `ClusterNode` + peer list at startup (Stream A).
|
||
- `PeerHttpProbeLoop` + `PeerUaProbeLoop` feed the calculator.
|
||
- OPC UA node wiring: `ServiceLevel` becomes a live `BaseDataVariable` on calculator observer output; `ServerUriArray` includes self + peers; `RedundancySupport` static from `RedundancyMode` (Stream C).
|
||
- `sp_PublishGeneration` wraps in `await using var lease = coordinator.BeginApplyLease(...)` so the `PrimaryMidApply` band fires during actual publishes.
|
||
- Client interop matrix validation against Ignition / Kepware / Aveva OI Gateway (Stream F).
|
||
|
||
### Remaining drivers (task #120)
|
||
|
||
AB CIP, AB Legacy, TwinCAT ADS, FOCAS drivers are planned but unshipped. Decision pending on whether these are release-blocking for v2 GA or can slip to a v2.1 follow-up.
|
||
|
||
## Nice-to-haves (not release-blocking)
|
||
|
||
- **Admin UI** — Phase 6.1 Stream E.2/E.3 (`/hosts` column refresh), Phase 6.2 Stream D (`RoleGrantsTab` + `AclsTab` Probe), Phase 6.3 Stream E (`RedundancyTab`), Phase 6.4 Streams A/B UI pieces, Stream C DiffViewer, Stream D `IdentificationFields.razor`. Tasks #134, #144, #149, #153, #155, #156, #157.
|
||
- **Background services** — Phase 6.1 Stream B.4 `ScheduledRecycleScheduler` HostedService (task #137), Phase 6.1 Stream A analyzer (task #135 — Roslyn analyzer asserting every capability surface routes through `CapabilityInvoker`).
|
||
- **Multi-host dispatch** — Phase 6.1 Stream A follow-up (task #135). Currently every driver gets a single pipeline keyed on `driver.DriverInstanceId`; multi-host drivers (Modbus with N PLCs) need per-PLC host resolution so failing PLCs trip per-PLC breakers without poisoning siblings. Decision #144 requires this but we haven't wired it yet.
|
||
|
||
## Running the release-readiness check
|
||
|
||
```bash
|
||
pwsh ./scripts/compliance/phase-6-all.ps1
|
||
```
|
||
|
||
This meta-runner invokes each `phase-6-N-compliance.ps1` script in sequence and reports an aggregate PASS/FAIL. It is the single-command verification that what we claim is shipped still compiles + tests pass + the plan-level invariants are still satisfied.
|
||
|
||
Exit 0 = every phase passes its compliance checks + no test-count regression.
|
||
|
||
## Release-readiness exit criteria
|
||
|
||
v2 GA requires all of the following:
|
||
|
||
- [ ] All four Phase 6.N compliance scripts exit 0.
|
||
- [ ] `dotnet test ZB.MOM.WW.OtOpcUa.slnx` passes with ≤ 1 known-flake failure.
|
||
- [ ] Release blockers listed above all closed (or consciously deferred to v2.1 with a written decision).
|
||
- [ ] Production deployment checklist (separate doc) signed off by Fleet Admin.
|
||
- [ ] At least one end-to-end integration run against the live Galaxy on the dev box succeeds.
|
||
- [ ] OPC UA conformance test (CTT or UA Compliance Test Tool) passes against the live endpoint.
|
||
- [ ] Non-transparent redundancy cutover validated with at least one production client (Ignition 8.3 recommended — see decision #85).
|
||
|
||
## Change log
|
||
|
||
- **2026-04-19** — Release blocker #1 **closed** (PR #94). `AuthorizationGate` wired into `DriverNodeManager` Read / Write / HistoryRead dispatch. Remaining Stream C surfaces (Browse / Subscribe / Alarm / Call + finer-grained scope resolution) downgraded to hardening follow-ups — no longer release-blocking.
|
||
- **2026-04-19** — Phase 6.4 data layer merged (PRs #91–92). Phase 6 core complete. Capstone doc created.
|
||
- **2026-04-19** — Phase 6.3 core merged (PRs #89–90). `ServiceLevelCalculator` + `RecoveryStateManager` + `ApplyLeaseRegistry` land as pure logic; coordinator / UA-node wiring / Admin UI / interop deferred.
|
||
- **2026-04-19** — Phase 6.2 core merged (PRs #84–88). `AuthorizationGate` + `TriePermissionEvaluator` + `LdapGroupRoleMapping` land; dispatch wiring + Admin UI deferred.
|
||
- **2026-04-19** — Phase 6.1 shipped (PRs #78–83). Polly resilience + Tier A/B/C stability + health endpoints + LiteDB generation-sealed cache + Admin `/hosts` data layer all live.
|