# Phase 3C: Deployment Pipeline & Store-and-Forward
**Date**: 2026-03-16
**Status**: Draft
**Prerequisites**: Phase 2 (Template Engine, deployment package contract), Phase 3A (Cluster Infrastructure, Site Runtime skeleton, local SQLite persistence), Phase 3B (Communication Layer, Site Runtime full actor hierarchy, Health Monitoring)
---
## Scope
**Goal**: Complete the deploy-to-site pipeline end-to-end with resilience.
**Components**:
- **Deployment Manager** (full) — Central-side deployment orchestration, instance lifecycle, system-wide artifact deployment
- **Store-and-Forward Engine** (full) — Site-side message buffering, retry, parking, replication, parked message management
**Testable Outcome**: Central validates, flattens, and deploys an instance to a site. The site compiles scripts, creates actors, and reports success. Deployment IDs ensure idempotency, the per-instance operation lock works, and instance lifecycle commands (disable, enable, delete) work. Store-and-forward buffers messages on transient failure, retries them at a fixed interval, and parks them after max retries. Buffer operations replicate asynchronously to the standby node, and parked messages are queryable from central.
---
## Prerequisites
| Prerequisite | Phase | What Must Be Complete |
|---|---|---|
| Template Engine | 2 | Flattening, validation pipeline, revision hash generation, diff calculation, deployment package contract |
| Configuration Database | 1, 2 | Schema, repositories (IDeploymentManagerRepository), IAuditService, optimistic concurrency support |
| Cluster Infrastructure | 3A | Akka.NET cluster with SBR, failover, CoordinatedShutdown |
| Site Runtime | 3A, 3B | Deployment Manager singleton, Instance Actor hierarchy, script compilation, alarm actors, full actor lifecycle |
| Communication Layer | 3B | All 8 message patterns (deployment, lifecycle, artifact deployment, remote queries), correlation IDs, timeouts |
| Health Monitoring | 3B | Metric collection framework (S&F buffer depth will be added as a new metric) |
| Site Event Logging | 3B | Event recording to SQLite (S&F activity events will be added) |
| Security & Auth | 1 | Deployment role with optional site scoping |
---
## Requirements Checklist
Each bullet is extracted from the corresponding section of docs/requirements/HighLevelReqs.md. Items marked with a phase note indicate split-section bullets owned by another phase.
### Section 1.3 — Store-and-Forward Persistence (Site Clusters Only)
- `[1.3-1]` Store-and-forward applies only at site clusters — central does not buffer messages.
- `[1.3-2]` All site-level S&F buffers (external system calls, notifications, cached database writes) are replicated between the two site cluster nodes using application-level replication over Akka.NET remoting.
- `[1.3-3]` Active node persists buffered messages to a local SQLite database and forwards them to the standby node, which maintains its own local SQLite copy.
- `[1.3-4]` On failover, the standby node already has a replicated copy and takes over delivery seamlessly.
- `[1.3-5]` Successfully delivered messages are removed from both nodes' local stores.
- `[1.3-6]` There is no maximum buffer size — messages accumulate until they either succeed or exhaust retries and are parked.
- `[1.3-7]` Retry intervals are fixed (not exponential backoff).
### Section 1.4 — Deployment Behavior
- `[1.4-1]` When central deploys a new configuration to a site instance, the site applies it immediately upon receipt — no local operator confirmation required. *(Phase 3C)*
- `[1.4-2]` If a site loses connectivity to central, it continues operating with its last received deployed configuration. *(Phase 3C — verified via resilience tests)*
- `[1.4-3]` The site reports back to central whether deployment was successfully applied. *(Phase 3C)*
- `[1.4-4]` Pre-deployment validation: before any deployment is sent to a site, the central cluster performs comprehensive validation including flattening, test-compiling scripts, verifying alarm trigger references, verifying script trigger references, and checking data connection binding completeness. *(Phase 3C — orchestration; validation pipeline built in Phase 2)*
**Split-section note**: Section 1.4 is fully covered by Phase 3C (backend pipeline). Phase 6 covers the UI for deployment workflows (diff view, deploy button, status tracking display).
### Section 1.5 — System-Wide Artifact Deployment
- `[1.5-1]` Changes to shared scripts, external system definitions, database connection definitions, and notification lists are not automatically propagated to sites.
- `[1.5-2]` Deployment of system-wide artifacts requires explicit action by a user with the Deployment role.
- `[1.5-3]` The Design role manages the definitions; the Deployment role triggers deployment to sites. A user may hold both roles.
**Split-section note**: Phase 3C covers the backend pipeline for artifact deployment. Phase 6 covers the UI for triggering and monitoring artifact deployment.
### Section 3.8.1 — Instance Lifecycle (Phase 3C portion)
- `[3.8.1-1]` Instances can be in one of two states: enabled or disabled.
- `[3.8.1-2]` Enabled: instance is active — data subscriptions, script triggers, and alarm evaluation are all running.
- `[3.8.1-3]` Disabled: site stops script triggers, data subscriptions (no live data collection), and alarm evaluation. Deployed configuration is retained so instance can be re-enabled without redeployment.
- `[3.8.1-4]` Disabled: store-and-forward messages for a disabled instance continue to drain (deliver pending messages).
- `[3.8.1-5]` Deletion removes the running configuration from the site, stops subscriptions, destroys the Instance Actor and its children.
- `[3.8.1-6]` Store-and-forward messages are not cleared on deletion — they continue to be delivered or can be managed via parked message management.
- `[3.8.1-7]` If the site is unreachable when a delete is triggered, the deletion fails. Central does not mark it as deleted until the site confirms.
- `[3.8.1-8]` Templates cannot be deleted if any instances or child templates reference them.
**Split-section note**: Phase 3C covers the backend for lifecycle commands. Phase 4 covers the UI for disable/enable/delete actions.
### Section 3.9 — Template Deployment & Change Propagation (Phase 3C portion)
- `[3.9-1]` Template changes are not automatically propagated to deployed instances.
- `[3.9-2]` The system maintains two views: deployed configuration (currently running) and template-derived configuration (what it would look like if deployed now).
- `[3.9-3]` Deployment is performed at the individual instance level — an engineer explicitly commands the system to update a specific instance.
- `[3.9-4]` The system must show differences between deployed and template-derived configuration.
- `[3.9-5]` No rollback support is required. The system tracks only the current deployed state, not deployment history.
- `[3.9-6]` Concurrent editing uses last-write-wins model. No pessimistic locking or optimistic concurrency conflict detection on templates.
**Split-section note**: Phase 3C covers `[3.9-1]`, `[3.9-2]` (backend maintenance of two views), `[3.9-3]` (backend deployment pipeline), `[3.9-5]` (no rollback), `[3.9-6]` (last-write-wins — already from Phase 2). Phase 6 covers `[3.9-4]` (diff view UI) and the deployment trigger UI. The diff calculation itself is built in Phase 2; Phase 3C uses it. Phase 3C stores the deployed configuration snapshot that enables diff comparison.
### Section 5.3 — Store-and-Forward for External Calls (Phase 3C portion: engine)
- `[5.3-1]` If an external system is unavailable when a script invokes a method, the message is buffered locally at the site.
- `[5.3-2]` Retry is performed per message — individual failed messages retry independently.
- `[5.3-3]` Each external system definition includes configurable retry settings: max retry count and time between retries (fixed interval, no exponential backoff).
- `[5.3-4]` After max retries are exhausted, the message is parked (dead-lettered) for manual review.
- `[5.3-5]` There is no maximum buffer size — messages accumulate until delivery succeeds or retries are exhausted.
**Split-section note**: Phase 3C builds the S&F engine that handles buffering, retry, and parking. Phase 7 integrates the External System Gateway as a delivery target and implements the error classification (transient vs. permanent).
### Section 5.4 — Parked Message Management (Phase 3C portion: backend)
- `[5.4-1]` Parked messages are stored at the site where they originated.
- `[5.4-2]` Central UI can query sites for parked messages and manage them remotely.
- `[5.4-3]` Operators can retry or discard parked messages from the central UI.
- `[5.4-4]` Parked message management covers external system calls, notifications, and cached database writes.
**Split-section note**: Phase 3C builds the site-side storage, query handler, and retry/discard command handler for parked messages. Phase 6 builds the central UI for parked message management.
### Section 6.4 — Store-and-Forward for Notifications (Phase 3C portion: engine)
- `[6.4-1]` If the email server is unavailable, notifications are buffered locally at the site.
- `[6.4-2]` Follows the same retry pattern as external system calls: configurable max retry count and time between retries (fixed interval).
- `[6.4-3]` After max retries are exhausted, the notification is parked for manual review.
- `[6.4-4]` There is no maximum buffer size for notification messages.
**Split-section note**: Phase 3C builds the S&F engine generically to support all three message categories. Phase 7 integrates the Notification Service as a delivery target.
---
## Design Constraints Checklist
Constraints from CLAUDE.md Key Design Decisions and Component-*.md documents relevant to this phase.
### KDD Constraints
- `[KDD-deploy-6]` Deployment identity: unique deployment ID + revision hash for idempotency.
- `[KDD-deploy-7]` Per-instance operation lock covers all mutating commands (deploy, disable, enable, delete).
- `[KDD-deploy-8]` Site-side apply is all-or-nothing per instance.
- `[KDD-deploy-9]` System-wide artifact version skew across sites is supported.
- `[KDD-deploy-11]` Optimistic concurrency on deployment status records.
- `[KDD-sf-1]` Fixed retry interval, no max buffer size. Only transient failures buffered.
- `[KDD-sf-2]` Async best-effort replication to standby (no ack wait).
- `[KDD-sf-3]` Messages not cleared on instance deletion.
- `[KDD-sf-4]` CachedCall idempotency is the caller's responsibility. *(Documented in Phase 3C; enforced in Phase 7 integration.)*
### Component Design Constraints (from docs/requirements/Component-DeploymentManager.md)
- `[CD-DM-1]` Deployment flow: validate -> flatten -> send -> track. Validation failures stop the pipeline before anything is sent.
- `[CD-DM-2]` Site-side idempotency on deployment ID — duplicate deployment receives "already applied" response.
- `[CD-DM-3]` Sites reject stale configurations — older revision hash than currently applied is rejected.
- `[CD-DM-4]` After central failover or timeout, Deployment Manager queries the site for current deployment state before allowing re-deploy.
- `[CD-DM-5]` Only one mutating operation per instance in-flight at a time. Second operation rejected with "operation in progress" error.
- `[CD-DM-6]` Different instances can proceed in parallel, even at the same site.
- `[CD-DM-7]` State transition matrix: Enabled allows deploy/disable/delete; Disabled allows deploy (enables on apply)/enable/delete; Not-deployed allows deploy only.
- `[CD-DM-8]` System-wide artifact deployment shows per-site result matrix. Successful sites not rolled back if others fail. Failed sites can be retried individually.
- `[CD-DM-9]` Only current deployment status per instance stored (pending, in-progress, success, failed). No deployment history table — audit log captures history.
- `[CD-DM-10]` Deployment scope is individual instance level. Bulk operations decompose into individual instance deployments.
- `[CD-DM-11]` Diff view available before deploying (added/removed/changed members, connection binding changes). *(Diff calculation from Phase 2; orchestration in Phase 3C.)*
- `[CD-DM-12]` Two views maintained: deployed configuration and template-derived configuration.
- `[CD-DM-13]` Deployable artifacts include flattened instance config plus system-wide artifacts (shared scripts, external system defs, DB connection defs, notification lists). System-wide artifact deployment is a separate action.
- `[CD-DM-14]` Site-side apply is all-or-nothing per instance. If any step fails (e.g., script compilation), entire deployment rejected. Previous config remains active and unchanged.
- `[CD-DM-15]` Cross-site version skew for artifacts is supported. Artifacts are self-contained and site-independent.
- `[CD-DM-16]` Disable: stops data subscriptions, script triggers, alarm evaluation. Config retained.
- `[CD-DM-17]` Enable: re-activates a disabled instance.
- `[CD-DM-18]` Delete: removes running config, destroys Instance Actor and children. S&F messages not cleared. Fails if site unreachable — central does not mark deleted until site confirms.
### Component Design Constraints (from docs/requirements/Component-StoreAndForward.md)
- `[CD-SF-1]` Three message categories: external system calls, email notifications, cached database writes.
- `[CD-SF-2]` Retry settings defined on the source entity (external system def, SMTP config, DB connection def), not per-message.
- `[CD-SF-3]` Only transient failures eligible for S&F buffering. Permanent failures (HTTP 4xx) returned to script, not queued.
- `[CD-SF-4]` No maximum buffer size. Bounded only by available disk space.
- `[CD-SF-5]` Active node persists locally and forwards each buffer operation (add, remove, park) to standby asynchronously. No ack wait.
- `[CD-SF-6]` Standby applies operations to its own local SQLite.
- `[CD-SF-7]` On failover, rare cases of duplicate delivery (message delivered but the remove not yet replicated) or missed retries (message added but not yet replicated) can occur. Both are acceptable.
- `[CD-SF-8]` Parked messages remain in SQLite at site. Central queries via Communication Layer.
- `[CD-SF-9]` Operators can retry (move back to retry queue) or discard (remove permanently) parked messages.
- `[CD-SF-10]` Messages not automatically cleared when instance deleted. Pending and parked messages continue to exist.
- `[CD-SF-11]` Message format stores: message ID, category, target, payload, retry count, created at, last attempt at, status (pending/retrying/parked).
- `[CD-SF-12]` Message lifecycle: attempt immediate delivery -> success removes; failure buffers -> retry loop -> success removes + notify standby; max retries exhausted -> park.
### Component Design Constraints (from docs/requirements/Component-SiteRuntime.md — deployment-related)
- `[CD-SR-1]` Deployment handling: receive config -> store in SQLite -> compile scripts -> create/update Instance Actor -> report result.
- `[CD-SR-2]` For redeployments: existing Instance Actor and children stopped, then new Instance Actor created with updated config. Subscriptions re-established.
- `[CD-SR-3]` Disable: stops Instance Actor and children. Retains deployed config in SQLite for re-enablement.
- `[CD-SR-4]` Enable: creates new Instance Actor from stored config (same as startup).
- `[CD-SR-5]` Delete: stops Instance Actor and children, removes deployed config from SQLite. Does not clear S&F messages.
- `[CD-SR-6]` Script compilation failure during deployment rejects entire deployment. No partial state applied. Failure reported to central.
### Component Design Constraints (from docs/requirements/Component-Communication.md — deployment-related)
- `[CD-COM-1]` Deployment pattern: request/response. No buffering at central. Unreachable site = immediate failure.
- `[CD-COM-2]` Instance lifecycle pattern: request/response. Unreachable site = immediate failure.
- `[CD-COM-3]` System-wide artifact pattern: broadcast with per-site acknowledgment.
- `[CD-COM-4]` Deployment timeout: 120 seconds default (script compilation can be slow).
- `[CD-COM-5]` Lifecycle command timeout: 30 seconds.
- `[CD-COM-6]` System-wide artifact timeout: 120 seconds per site.
- `[CD-COM-7]` Application-level correlation: deployments include deployment ID + revision hash; lifecycle commands include command ID.
- `[CD-COM-8]` Remote query pattern for parked messages: request/response with query ID, 30-second timeout.
---
## Work Packages
### WP-1: Deployment Manager — Core Deployment Flow
**Description**: Implement the central-side deployment orchestration pipeline: accept deployment request, call Template Engine for validated+flattened config, send to site via Communication Layer, track status.
**Acceptance Criteria**:
- Deployment request triggers validation -> flatten -> send -> track flow `[CD-DM-1]`
- Validation failures stop the pipeline before sending; errors returned to caller `[CD-DM-1]`, `[1.4-4]`
- Pre-deployment validation invokes Template Engine for flattening, naming collision detection, script compilation, trigger references, connection binding `[1.4-4]`
- Validation does not verify that data source relative paths resolve to real tags on physical devices (runtime concern) `[1.4-4]`
- Successful deployment sends flattened config to site via Communication Layer `[1.4-1]`
- Site applies immediately upon receipt — no operator confirmation `[1.4-1]`
- Site reports success/failure back to central `[1.4-3]`
- Deployment status updated in config DB (pending -> in-progress -> success/failed) `[CD-DM-9]`
- Deployment scope is individual instance level `[CD-DM-10]`, `[3.9-3]`
- Template changes not auto-propagated — explicit deploy required `[3.9-1]`
- No rollback support — only current deployed state tracked `[3.9-5]`
- Uses 120-second deployment timeout `[CD-COM-4]`
- If site unreachable, deployment fails immediately (no central buffering) `[CD-COM-1]`
**Estimated Complexity**: L
**Requirements Traced**: `[1.4-1]`, `[1.4-3]`, `[1.4-4]`, `[3.9-1]`, `[3.9-3]`, `[3.9-5]`, `[CD-DM-1]`, `[CD-DM-9]`, `[CD-DM-10]`, `[CD-COM-1]`, `[CD-COM-4]`
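The sketch below illustrates one possible shape of the validate -> flatten -> send -> track flow. All interface and member names (`ITemplateEngine`, `ISiteChannel`, `IDeploymentStatusStore`) are illustrative assumptions standing in for the Phase 2 Template Engine, the Phase 3B Communication Layer, and `IDeploymentManagerRepository`; only that repository is named in this plan, and its real surface may differ.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Illustrative sketch only; every interface member shown here is an assumption.
public enum DeploymentStatus { Pending, InProgress, Success, Failed }

public sealed record FlattenResult(bool IsValid, IReadOnlyList<string> Errors, string SiteId,
                                   string RevisionHash, byte[] Package);
public sealed record SiteReply(bool Succeeded, string? FailureReason);

public interface ITemplateEngine            // stands in for the Phase 2 validation + flattening pipeline
{
    Task<FlattenResult> ValidateAndFlattenAsync(Guid instanceId, CancellationToken ct);
}

public interface ISiteChannel               // stands in for the Phase 3B Communication Layer (request/response)
{
    Task<SiteReply> SendDeploymentAsync(string siteId, Guid deploymentId, string revisionHash,
                                        byte[] package, TimeSpan timeout, CancellationToken ct);
}

public interface IDeploymentStatusStore     // stands in for IDeploymentManagerRepository
{
    Task SetStatusAsync(Guid instanceId, DeploymentStatus status, Guid deploymentId, CancellationToken ct);
}

public sealed class DeploymentOrchestrator
{
    private readonly ITemplateEngine _engine;
    private readonly ISiteChannel _channel;
    private readonly IDeploymentStatusStore _status;

    public DeploymentOrchestrator(ITemplateEngine engine, ISiteChannel channel, IDeploymentStatusStore status)
    {
        _engine = engine;
        _channel = channel;
        _status = status;
    }

    public async Task<(bool Ok, IReadOnlyList<string> Errors)> DeployAsync(Guid instanceId, CancellationToken ct)
    {
        // 1. Validate + flatten. A validation failure stops the pipeline before anything is sent [CD-DM-1].
        var flattened = await _engine.ValidateAndFlattenAsync(instanceId, ct);
        if (!flattened.IsValid) return (false, flattened.Errors);

        // 2. Assign deployment identity [KDD-deploy-6] and send; an unreachable site fails here,
        //    with no buffering at central [CD-COM-1].
        var deploymentId = Guid.NewGuid();
        await _status.SetStatusAsync(instanceId, DeploymentStatus.InProgress, deploymentId, ct);
        var reply = await _channel.SendDeploymentAsync(flattened.SiteId, deploymentId, flattened.RevisionHash,
                                                       flattened.Package, TimeSpan.FromSeconds(120), ct); // [CD-COM-4]

        // 3. Track the site's reported result [1.4-3]; only the current status is stored [CD-DM-9].
        await _status.SetStatusAsync(instanceId,
            reply.Succeeded ? DeploymentStatus.Success : DeploymentStatus.Failed, deploymentId, ct);
        return (reply.Succeeded,
                reply.Succeeded ? Array.Empty<string>() : new[] { reply.FailureReason ?? "unspecified site failure" });
    }
}
```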
---
### WP-2: Deployment Identity & Idempotency
**Description**: Implement deployment ID generation, revision hash propagation, and idempotent site-side apply.
**Acceptance Criteria**:
- Every deployment assigned a unique deployment ID `[KDD-deploy-6]`
- Deployment includes flattened config's revision hash (from Template Engine) `[KDD-deploy-6]`
- Site-side apply is idempotent on deployment ID — duplicate deployment returns "already applied" `[CD-DM-2]`
- Sites reject stale configurations — older revision hash than currently applied is rejected, site reports current version `[CD-DM-3]`
- After central failover or timeout, Deployment Manager queries site for current deployment state before allowing re-deploy `[CD-DM-4]`
- Deployment messages include deployment ID + revision hash as correlation `[CD-COM-7]`
**Estimated Complexity**: M
**Requirements Traced**: `[KDD-deploy-6]`, `[CD-DM-2]`, `[CD-DM-3]`, `[CD-DM-4]`, `[CD-COM-7]`
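A small sketch of the site-side accept / already-applied / stale decision. How "older" is ordered is not pinned down by this plan, so the monotonic `Revision` counter alongside the hash is an assumption; the rest mirrors `[CD-DM-2]` and `[CD-DM-3]`.

```csharp
using System;

public enum ApplyDecision { Apply, AlreadyApplied, StaleRejected }

public sealed record AppliedState(Guid DeploymentId, string RevisionHash, long Revision);
public sealed record IncomingDeployment(Guid DeploymentId, string RevisionHash, long Revision);

public static class DeploymentIdentity
{
    public static ApplyDecision Decide(AppliedState? current, IncomingDeployment incoming)
    {
        // Idempotency on deployment ID: a duplicate of the already-applied deployment is
        // acknowledged with "already applied" rather than re-applied [CD-DM-2].
        if (current is not null && current.DeploymentId == incoming.DeploymentId)
            return ApplyDecision.AlreadyApplied;

        // Stale configuration: anything older than what is currently applied is rejected,
        // and the site reports its current version back to central [CD-DM-3].
        if (current is not null && incoming.Revision < current.Revision)
            return ApplyDecision.StaleRejected;

        return ApplyDecision.Apply;
    }
}
```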
---
### WP-3: Per-Instance Operation Lock
**Description**: Implement concurrency control ensuring only one mutating operation per instance can be in-flight at a time.
**Acceptance Criteria**:
- Only one mutating operation (deploy, disable, enable, delete) per instance in-flight at a time `[KDD-deploy-7]`, `[CD-DM-5]`
- Second operation on same instance rejected with "operation in progress" error `[CD-DM-5]`
- Different instances can proceed in parallel, even at the same site `[CD-DM-6]`
- Lock released when operation completes (success or failure) or times out
- Lock state does not survive central failover (in-progress operations treated as failed per `[CD-DM-4]`)
**Estimated Complexity**: M
**Requirements Traced**: `[KDD-deploy-7]`, `[CD-DM-5]`, `[CD-DM-6]`
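A minimal in-memory sketch of the per-instance lock, consistent with the Q-P3C-3 recommendation that lock state is not persisted; all names are illustrative.

```csharp
using System;
using System.Collections.Concurrent;

// In-memory, per-instance operation lock sketch. Lost on central failover by design,
// with the site state query resolving any ambiguity afterwards [CD-DM-4].
public sealed class InstanceOperationLock
{
    // Key: instance ID. Value: the mutating operation currently in flight.
    private readonly ConcurrentDictionary<Guid, string> _inFlight = new();

    // Returns false (with the conflicting operation name) if another mutating
    // operation is already in flight for this instance [CD-DM-5].
    public bool TryAcquire(Guid instanceId, string operation, out string? conflictingOperation)
    {
        if (_inFlight.TryAdd(instanceId, operation))
        {
            conflictingOperation = null;
            return true;
        }
        _inFlight.TryGetValue(instanceId, out conflictingOperation);
        return false;
    }

    // Released on completion (success or failure) or timeout. Different instances never
    // contend with each other because the key is the instance ID [CD-DM-6].
    public void Release(Guid instanceId) => _inFlight.TryRemove(instanceId, out _);
}
```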
---
### WP-4: State Transition Matrix & Deployment Status
**Description**: Implement the allowed state transitions for instance operations and deployment status persistence with optimistic concurrency.
**Acceptance Criteria**:
- State transition matrix enforced: `[CD-DM-7]`
  - Enabled: allows deploy, disable, delete. Rejects enable (already enabled).
  - Disabled: allows deploy (enables on apply), enable, delete. Rejects disable (already disabled).
  - Not-deployed: allows deploy only. Rejects disable, enable, delete.
- Invalid state transitions return clear error messages
- Only current deployment status per instance stored (pending, in-progress, success, failed) `[CD-DM-9]`
- No deployment history table — audit log captures history via IAuditService `[CD-DM-9]`
- Optimistic concurrency on deployment status records `[KDD-deploy-11]`
- All deployment actions logged via IAuditService (who, what, when, result)
**Estimated Complexity**: M
**Requirements Traced**: `[CD-DM-7]`, `[CD-DM-9]`, `[KDD-deploy-11]`, `[3.8.1-1]`, `[3.8.1-2]`
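The matrix above expressed as a small pure function (a sketch; enum and method names are illustrative):

```csharp
public enum InstanceState { NotDeployed, Disabled, Enabled }
public enum InstanceOperation { Deploy, Disable, Enable, Delete }

public static class StateTransitions
{
    // Encodes the [CD-DM-7] matrix: true if the operation is allowed in the given state.
    public static bool IsAllowed(InstanceState state, InstanceOperation op) => (state, op) switch
    {
        (InstanceState.Enabled,     InstanceOperation.Deploy)  => true,
        (InstanceState.Enabled,     InstanceOperation.Disable) => true,
        (InstanceState.Enabled,     InstanceOperation.Delete)  => true,
        (InstanceState.Disabled,    InstanceOperation.Deploy)  => true,  // enables on apply
        (InstanceState.Disabled,    InstanceOperation.Enable)  => true,
        (InstanceState.Disabled,    InstanceOperation.Delete)  => true,
        (InstanceState.NotDeployed, InstanceOperation.Deploy)  => true,
        _ => false  // enable-on-enabled, disable-on-disabled, and anything but deploy on not-deployed
    };
}
```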
---
### WP-5: Site-Side Apply Atomicity
**Description**: Implement all-or-nothing deployment application at the site.
**Acceptance Criteria**:
- Site stores new config, compiles all scripts, creates/updates Instance Actor as single operation `[KDD-deploy-8]`, `[CD-DM-14]`
- If any step fails (e.g., script compilation), entire deployment for that instance rejected `[CD-DM-14]`, `[CD-SR-6]`
- Previous configuration remains active and unchanged on failure `[CD-DM-14]`
- Site reports specific failure reason (e.g., compilation error details) back to central `[CD-SR-6]`
- For redeployments: existing Instance Actor and children stopped, then new Instance Actor created with updated config `[CD-SR-2]`
- Subscriptions re-established after redeployment `[CD-SR-2]`
- Site continues operating with last deployed config if connectivity to central lost `[1.4-2]`
- Deployment handling follows: receive -> store SQLite -> compile -> create/update actor -> report `[CD-SR-1]`
**Estimated Complexity**: L
**Requirements Traced**: `[KDD-deploy-8]`, `[CD-DM-14]`, `[CD-SR-1]`, `[CD-SR-2]`, `[CD-SR-6]`, `[1.4-2]`
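A sketch of one way to keep the site-side apply all-or-nothing: the incoming config is staged and every script is compiled before the active config or the Instance Actor is touched, so a failure leaves the previous configuration running. The staged-copy approach and every interface shown are assumptions.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Illustrative only; IConfigStore, IScriptCompiler, and IInstanceActorHost are assumed shapes.
public interface IConfigStore
{
    Task SaveStagedAsync(Guid instanceId, byte[] config, CancellationToken ct);
    Task PromoteStagedAsync(Guid instanceId, CancellationToken ct);
}

public interface IScriptCompiler
{
    Task<(bool Ok, string? Error)> CompileAllAsync(byte[] config, CancellationToken ct);
}

public interface IInstanceActorHost
{
    Task RestartWithConfigAsync(Guid instanceId, byte[] config, CancellationToken ct);
}

public sealed class SiteDeploymentApplier
{
    private readonly IConfigStore _store;
    private readonly IScriptCompiler _compiler;
    private readonly IInstanceActorHost _actors;

    public SiteDeploymentApplier(IConfigStore store, IScriptCompiler compiler, IInstanceActorHost actors)
    {
        _store = store;
        _compiler = compiler;
        _actors = actors;
    }

    public async Task<(bool Applied, string? FailureReason)> ApplyAsync(Guid instanceId, byte[] config, CancellationToken ct)
    {
        // 1. Store the incoming config as a staged copy, leaving the currently deployed
        //    config untouched (staging is an assumption made to preserve atomicity) [CD-SR-1].
        await _store.SaveStagedAsync(instanceId, config, ct);

        // 2. Compile every script. A failure rejects the entire deployment; the previous
        //    configuration is still the active one [CD-DM-14], [CD-SR-6].
        var compile = await _compiler.CompileAllAsync(config, ct);
        if (!compile.Ok) return (false, $"script compilation failed: {compile.Error}");

        // 3. Promote the staged config and restart the Instance Actor hierarchy with it;
        //    subscriptions are re-established by the new actors [CD-SR-2].
        await _store.PromoteStagedAsync(instanceId, ct);
        await _actors.RestartWithConfigAsync(instanceId, config, ct);
        return (true, null);
    }
}
```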
---
### WP-6: Instance Lifecycle Commands
**Description**: Implement disable, enable, and delete commands sent from central to site.
**Acceptance Criteria**:
- **Disable**: site stops script triggers, data subscriptions, and alarm evaluation `[3.8.1-3]`, `[CD-DM-16]`
- Disable retains deployed configuration for re-enablement without redeployment `[3.8.1-3]`, `[CD-DM-16]`, `[CD-SR-3]`
- Disable: S&F messages for disabled instance continue to drain `[3.8.1-4]`
- **Enable**: re-activates a disabled instance by creating a new Instance Actor from stored config, restoring data subscriptions, script triggers, and alarm evaluation `[CD-DM-17]`, `[CD-SR-4]`
- Disable and enable commands fail immediately if the site is unreachable (no buffering, consistent with deployment behavior) `[CD-COM-2]`
- **Delete**: removes running config from site, stops subscriptions, destroys Instance Actor and children `[3.8.1-5]`, `[CD-DM-18]`, `[CD-SR-5]`
- Delete: S&F messages are not cleared `[3.8.1-6]`, `[CD-DM-18]`, `[CD-SR-5]`, `[KDD-sf-3]`
- Delete fails if site unreachable — central does not mark deleted until site confirms `[3.8.1-7]`, `[CD-DM-18]`
- Templates cannot be deleted if instances or child templates reference them `[3.8.1-8]`
- Lifecycle commands use request/response pattern with 30s timeout `[CD-COM-2]`, `[CD-COM-5]`
- Lifecycle commands include command ID for deduplication (duplicate commands recognized and not re-applied) `[CD-COM-7]`
**Estimated Complexity**: L
**Requirements Traced**: `[3.8.1-1]` through `[3.8.1-8]`, `[KDD-sf-3]`, `[CD-DM-16]`, `[CD-DM-17]`, `[CD-DM-18]`, `[CD-SR-3]`, `[CD-SR-4]`, `[CD-SR-5]`, `[CD-COM-2]`, `[CD-COM-5]`, `[CD-COM-7]`
---
### WP-7: System-Wide Artifact Deployment
**Description**: Implement deployment of shared scripts, external system definitions, database connection definitions, and notification lists to all sites.
**Acceptance Criteria**:
- Changes not automatically propagated to sites `[1.5-1]`
- Deployment requires explicit action by a user with Deployment role `[1.5-2]`
- Design role manages definitions; Deployment role triggers deployment `[1.5-3]`
- Broadcast pattern with per-site acknowledgment `[CD-COM-3]`
- Per-site result matrix — each site reports independently `[CD-DM-8]`
- Successful sites not rolled back if other sites fail `[CD-DM-8]`
- Failed sites can be retried individually `[CD-DM-8]`
- 120-second timeout per site `[CD-COM-6]`
- Cross-site version skew supported — sites can run different artifact versions `[KDD-deploy-9]`, `[CD-DM-15]`
- Artifacts are self-contained and site-independent `[CD-DM-15]`
- System-wide artifact deployment is a separate action from instance deployment `[CD-DM-13]`
- Shared scripts undergo pre-compilation validation (syntax/structural correctness) before deployment to sites
- All artifact deployment actions logged via IAuditService
**Estimated Complexity**: L
**Requirements Traced**: `[1.5-1]`, `[1.5-2]`, `[1.5-3]`, `[KDD-deploy-9]`, `[CD-DM-8]`, `[CD-DM-13]`, `[CD-DM-15]`, `[CD-COM-3]`, `[CD-COM-6]`
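A sketch of the broadcast-with-per-site-results shape: each site is attempted independently, successes are never rolled back, and the returned matrix lets failed sites be retried individually. The delegate standing in for the Communication Layer's per-site send is an assumption.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public sealed record SiteArtifactResult(string SiteId, bool Succeeded, string? Error);

public static class ArtifactBroadcast
{
    // deployToSite stands in for the Communication Layer's per-site artifact send
    // (broadcast with per-site acknowledgment, 120s timeout per site) [CD-COM-3], [CD-COM-6].
    public static async Task<IReadOnlyList<SiteArtifactResult>> DeployToAllAsync(
        IEnumerable<string> siteIds,
        Func<string, CancellationToken, Task> deployToSite,
        CancellationToken ct)
    {
        var attempts = siteIds.Select(async siteId =>
        {
            try
            {
                await deployToSite(siteId, ct);
                return new SiteArtifactResult(siteId, true, null);
            }
            catch (Exception ex)
            {
                // A failed site does not roll back the ones that succeeded; it simply shows
                // up as failed in the result matrix and can be retried on its own [CD-DM-8].
                return new SiteArtifactResult(siteId, false, ex.Message);
            }
        });

        return await Task.WhenAll(attempts);
    }
}
```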
---
### WP-8: Deployed vs. Template-Derived State Management
**Description**: Implement storage and retrieval of deployed configuration snapshots, enabling comparison with template-derived configs.
**Acceptance Criteria**:
- System maintains two views per instance: deployed configuration and template-derived configuration `[3.9-2]`, `[CD-DM-12]`
- Deployed configuration updated on successful deployment `[CD-DM-12]`
- Template-derived configuration computed on demand from current template state (uses Phase 2 flattening)
- Diff can be computed between deployed and template-derived (uses Phase 2 diff calculation) `[CD-DM-11]`
- Diff shows added/removed/changed members and connection binding changes `[CD-DM-11]`
- Staleness detectable via revision hash comparison `[3.9-4]`
**Estimated Complexity**: M
**Requirements Traced**: `[3.9-2]`, `[3.9-4]`, `[CD-DM-11]`, `[CD-DM-12]`
---
### WP-9: S&F SQLite Persistence & Message Format
**Description**: Implement the SQLite schema and data access layer for store-and-forward message buffering at site nodes.
**Acceptance Criteria**:
- Buffered messages persisted to local SQLite on each site node `[1.3-3]`
- Message format stores: message ID, category, target, payload, retry count, created at, last attempt at, status (pending/retrying/parked) `[CD-SF-11]`
- Three message categories supported: external system calls, email notifications, cached database writes `[CD-SF-1]`
- No maximum buffer size — messages accumulate until delivery or parking `[1.3-6]`, `[CD-SF-4]`
- Central does not buffer messages (S&F is site-only) `[1.3-1]`
- All S&F timestamps are UTC
**Estimated Complexity**: M
**Requirements Traced**: `[1.3-1]`, `[1.3-3]`, `[1.3-6]`, `[CD-SF-1]`, `[CD-SF-4]`, `[CD-SF-11]`
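One possible SQLite shape for the `[CD-SF-11]` message format, created with Microsoft.Data.Sqlite. The table and column names are assumptions; only the set of fields comes from the constraint.

```csharp
using Microsoft.Data.Sqlite;

public static class StoreAndForwardSchema
{
    // Columns mirror [CD-SF-11]: message ID, category, target, payload, retry count,
    // created at, last attempt at, status. Timestamps stored as UTC ISO-8601 text.
    private const string CreateTableSql = @"
        CREATE TABLE IF NOT EXISTS sf_message (
            message_id          TEXT PRIMARY KEY,
            category            TEXT NOT NULL,   -- external_call | notification | cached_db_write [CD-SF-1]
            target              TEXT NOT NULL,   -- external system / SMTP / DB connection identifier
            payload             BLOB NOT NULL,
            retry_count         INTEGER NOT NULL DEFAULT 0,
            created_at_utc      TEXT NOT NULL,
            last_attempt_at_utc TEXT,
            status              TEXT NOT NULL    -- pending | retrying | parked
        );
        CREATE INDEX IF NOT EXISTS ix_sf_message_status ON sf_message(status);";

    public static void EnsureCreated(string dbPath)
    {
        using var connection = new SqliteConnection($"Data Source={dbPath}");
        connection.Open();
        using var command = connection.CreateCommand();
        command.CommandText = CreateTableSql;
        command.ExecuteNonQuery();  // no row cap anywhere: buffer size bounded only by disk [CD-SF-4]
    }
}
```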
---
### WP-10: S&F Retry Engine
**Description**: Implement the fixed-interval retry loop with per-source-entity retry settings and transient-only buffering.
**Acceptance Criteria**:
- Message lifecycle: attempt immediate delivery -> failure buffers -> retry loop -> success removes; max retries -> park `[CD-SF-12]`
- Retry is per-message — individual messages retry independently `[5.3-2]`
- Fixed retry interval (not exponential backoff) `[1.3-7]`, `[KDD-sf-1]`
- Retry settings defined on the source entity (external system def, SMTP config, DB connection def), not per-message `[CD-SF-2]`
- External system definitions include max retry count and time between retries `[5.3-3]`
- Notification config includes max retry count and time between retries `[6.4-2]`
- After max retries exhausted, message is parked (dead-lettered) `[5.3-4]`, `[6.4-3]`
- Only transient failures eligible for buffering. Permanent failures returned to caller, not queued `[KDD-sf-1]`, `[CD-SF-3]`
- No maximum buffer size `[5.3-5]`, `[6.4-4]`, `[KDD-sf-1]`
- Messages for external calls buffered locally when system unavailable `[5.3-1]`
- Notifications buffered when email server unavailable `[6.4-1]`
- Successfully delivered messages removed from local store `[1.3-5]`
**Estimated Complexity**: L
**Requirements Traced**: `[1.3-5]`, `[1.3-7]`, `[5.3-1]` through `[5.3-5]`, `[6.4-1]` through `[6.4-4]`, `[KDD-sf-1]`, `[CD-SF-2]`, `[CD-SF-3]`, `[CD-SF-12]`
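A sketch of the per-message retry decision with fixed-interval timing and per-source-entity settings; type and member names are illustrative.

```csharp
using System;

public enum RetryAction { Wait, AttemptDelivery, Park }

// Retry settings come from the source entity (external system def, SMTP config,
// DB connection def), never from the message itself [CD-SF-2].
public sealed record RetrySettings(int MaxRetryCount, TimeSpan FixedInterval);

public static class RetryPolicy
{
    // Pure decision for one buffered message [CD-SF-12]: park once retries are exhausted,
    // otherwise attempt again only after the fixed interval has elapsed [1.3-7], [KDD-sf-1].
    public static RetryAction Decide(int retryCount, DateTime? lastAttemptUtc, DateTime nowUtc, RetrySettings settings)
    {
        if (retryCount >= settings.MaxRetryCount)
            return RetryAction.Park;                      // [5.3-4], [6.4-3]

        if (lastAttemptUtc is null)
            return RetryAction.AttemptDelivery;           // first delivery attempt

        return nowUtc - lastAttemptUtc.Value >= settings.FixedInterval
            ? RetryAction.AttemptDelivery                 // fixed interval, no exponential backoff
            : RetryAction.Wait;
    }
}
```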
---
### WP-11: S&F Async Replication to Standby
**Description**: Implement application-level replication of buffer operations from active to standby node.
**Acceptance Criteria**:
- All S&F buffers replicated between two site cluster nodes via application-level replication over Akka.NET remoting `[1.3-2]`
- Active node forwards each buffer operation (add, remove, park) to standby asynchronously `[CD-SF-5]`, `[KDD-sf-2]`
- Active node does not wait for standby acknowledgment (no ack wait) `[KDD-sf-2]`, `[CD-SF-5]`
- Standby applies operations to its own local SQLite `[CD-SF-6]`
- On failover, standby takes over delivery from its replicated copy `[1.3-4]`. Note: per `[CD-SF-7]`, the async replication design means the copy is near-complete — rare duplicate deliveries or missed retries are acceptable trade-offs for the latency benefit.
- Duplicate deliveries and missed retries accepted as trade-offs for async replication `[CD-SF-7]`
- Successfully delivered messages removed from both nodes' stores `[1.3-5]`
**Estimated Complexity**: L
**Requirements Traced**: `[1.3-2]`, `[1.3-4]`, `[1.3-5]`, `[KDD-sf-2]`, `[CD-SF-5]`, `[CD-SF-6]`, `[CD-SF-7]`
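A sketch of the fire-and-forget replication step on the active node using Akka.NET `Tell` (no `Ask`, no acknowledgment wait). The actor and message shapes are assumptions.

```csharp
using Akka.Actor;

// Replication message shape is an assumption; only the add/remove/park operations come from [CD-SF-5].
public sealed record BufferOp(string MessageId, string Kind /* add | remove | park */, byte[]? Payload);

public sealed class SfBufferActor : ReceiveActor
{
    private readonly IActorRef _standbyReplica;  // resolved elsewhere via Akka.NET remoting [1.3-2]

    public SfBufferActor(IActorRef standbyReplica)
    {
        _standbyReplica = standbyReplica;

        // Every buffer mutation on the active node is applied locally and then forwarded
        // to the standby without waiting for an acknowledgment.
        Receive<BufferOp>(op =>
        {
            PersistLocally(op);          // local SQLite write on the active node [1.3-3]
            _standbyReplica.Tell(op);    // fire-and-forget over remoting [KDD-sf-2], [CD-SF-5]
        });
    }

    private static void PersistLocally(BufferOp op)
    {
        // SQLite write elided; the WP-9 sketch shows one possible table shape.
    }
}
```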
---
### WP-12: Parked Message Management
**Description**: Implement site-side parked message storage, query handling, and retry/discard commands accessible from central.
**Acceptance Criteria**:
- Parked messages stored at the site in SQLite `[5.4-1]`, `[CD-SF-8]`
- Central can query sites for parked messages via Communication Layer `[5.4-2]`, `[CD-SF-8]`
- Operators can retry a parked message (moves back to retry queue) `[5.4-3]`, `[CD-SF-9]`
- Operators can discard a parked message (removes permanently) `[5.4-3]`, `[CD-SF-9]`
- Management covers all three categories: external system calls, notifications, cached database writes `[5.4-4]`
- Remote query uses request/response pattern with query ID, 30s timeout `[CD-COM-8]`
- Messages not automatically cleared when instance deleted `[CD-SF-10]`, `[KDD-sf-3]`, `[3.8.1-6]`
- Pending and parked messages continue to exist after instance deletion `[CD-SF-10]`
**Estimated Complexity**: M
**Requirements Traced**: `[5.4-1]` through `[5.4-4]`, `[KDD-sf-3]`, `[CD-SF-8]`, `[CD-SF-9]`, `[CD-SF-10]`, `[CD-COM-8]`, `[3.8.1-6]`
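A sketch of the site-side retry and discard commands against the table shape assumed in the WP-9 sketch: retry moves the row back to pending (resetting the retry count, which is itself an assumption), discard deletes it permanently.

```csharp
using Microsoft.Data.Sqlite;

public static class ParkedMessageCommands
{
    // Retry: move the parked message back into the retry queue [CD-SF-9].
    public static int Retry(SqliteConnection connection, string messageId)
    {
        using var cmd = connection.CreateCommand();
        cmd.CommandText = @"UPDATE sf_message
                            SET status = 'pending', retry_count = 0, last_attempt_at_utc = NULL
                            WHERE message_id = $id AND status = 'parked';";
        cmd.Parameters.AddWithValue("$id", messageId);
        return cmd.ExecuteNonQuery();  // 0 rows means the message was not found or not parked
    }

    // Discard: remove the parked message permanently [CD-SF-9].
    public static int Discard(SqliteConnection connection, string messageId)
    {
        using var cmd = connection.CreateCommand();
        cmd.CommandText = "DELETE FROM sf_message WHERE message_id = $id AND status = 'parked';";
        cmd.Parameters.AddWithValue("$id", messageId);
        return cmd.ExecuteNonQuery();
    }
}
```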
---
### WP-13: S&F Messages Survive Instance Deletion
**Description**: Ensure store-and-forward messages are preserved when an instance is deleted.
**Acceptance Criteria**:
- S&F messages not cleared on instance deletion `[3.8.1-6]`, `[KDD-sf-3]`, `[CD-SF-10]`
- Pending messages continue retry delivery after instance deletion
- Parked messages remain queryable and manageable from central after instance deletion
- S&F messages for disabled instances continue to drain `[3.8.1-4]`
**Estimated Complexity**: S
**Requirements Traced**: `[3.8.1-4]`, `[3.8.1-6]`, `[KDD-sf-3]`, `[CD-SF-10]`
---
### WP-14: S&F Health Metrics & Event Logging Integration
**Description**: Integrate S&F buffer depth as a health metric and log S&F activity to site event log.
**Acceptance Criteria**:
- S&F buffer depth reported as health metric (broken down by category) — integrates with Phase 3B Health Monitoring
- S&F activity logged to site event log: message queued, delivered, retried, parked (per docs/requirements/Component-StoreAndForward.md Dependencies)
- S&F buffer depth visible in health reports sent to central
**Estimated Complexity**: S
**Requirements Traced**: `[CD-SF-1]` (categories), docs/requirements/Component-StoreAndForward.md Dependencies (Site Event Logging, Health Monitoring)
---
### WP-15: CachedCall Idempotency Documentation
**Description**: Document that CachedCall idempotency is the caller's responsibility.
**Acceptance Criteria**:
- Script API documentation clearly states that `ExternalSystem.CachedCall()` idempotency is the caller's responsibility `[KDD-sf-4]`
- S&F engine makes no idempotency guarantees — duplicate delivery possible (especially on failover) `[CD-SF-7]`
**Estimated Complexity**: S
**Requirements Traced**: `[KDD-sf-4]`, `[CD-SF-7]`
---
### WP-16: Deployment Manager — Concurrent Template Editing Semantics
**Description**: Ensure last-write-wins semantics for template editing do not conflict with deployment pipeline.
**Acceptance Criteria**:
- Last-write-wins for concurrent template editing — no pessimistic locking or optimistic concurrency on templates `[3.9-6]`
- Deployment uses optimistic concurrency on deployment status records only `[KDD-deploy-11]`
- Template state at time of deployment is captured in the flattened config and revision hash
**Estimated Complexity**: S
**Requirements Traced**: `[3.9-6]`, `[KDD-deploy-11]`
---
## Test Strategy
### Unit Tests
| Area | Tests |
|------|-------|
| Deployment flow | Validate -> flatten -> send pipeline; validation failure stops pipeline |
| Deployment identity | Deployment ID generation uniqueness; revision hash propagation |
| Operation lock | Concurrent requests on same instance rejected; different instances proceed in parallel; lock released on completion/timeout |
| State transitions | All valid transitions succeed; all invalid transitions rejected with correct error messages |
| Deployment status | CRUD with optimistic concurrency; concurrent updates handled correctly |
| S&F message format | Serialization/deserialization of all three categories; all fields stored correctly |
| S&F retry logic | Fixed interval timing; per-source-entity settings respected; max retries triggers parking; transient-only filter |
| Parked message ops | Retry moves to queue; discard removes; query returns correct results |
| Template deletion constraint | Templates with instance references cannot be deleted; templates with child template references cannot be deleted |
### Integration Tests
| Area | Tests |
|------|-------|
| End-to-end deploy | Central sends deployment -> site compiles -> actors created -> success reported -> status updated |
| Deploy with validation failure | Template with compilation error -> deployment blocked before send |
| Idempotent deploy | Same deployment ID sent twice -> second returns "already applied" |
| Stale config rejection | Older revision hash sent -> site rejects with current version |
| Lifecycle commands | Disable -> verify subscriptions stopped and config retained; Enable -> verify instance re-activates; Delete -> verify actors destroyed and config removed |
| S&F buffer and retry | Submit message -> delivery fails -> buffered -> retry succeeds -> message removed |
| S&F parking | Submit message -> delivery fails -> max retries -> message parked |
| S&F replication | Buffer message on active -> verify replicated to standby SQLite |
| Parked message remote query | Central queries site for parked messages -> correct results returned |
| Parked message retry/discard | Central retries parked message -> moves to queue; Central discards -> removed |
| System-wide artifact deploy | Deploy shared scripts to multiple sites -> per-site status tracked |
| S&F survives deletion | Delete instance -> verify S&F messages still exist and deliver |
| S&F drains on disable | Disable instance -> verify pending S&F messages continue delivery |
### Negative Tests
| Requirement | Test |
|-------------|------|
| `[1.3-1]` Central does not buffer | Verify no S&F infrastructure exists on central; central deployment to unreachable site fails immediately |
| `[1.3-6]` No max buffer | Submit messages continuously -> verify no rejection based on count |
| `[3.8.1-7]` Delete fails if unreachable | Attempt delete when site offline -> verify failure; verify central does not mark as deleted |
| `[3.8.1-8]` Template deletion constraint | Attempt to delete template with active instances -> verify rejection |
| `[3.9-1]` No auto-propagation | Change template -> verify deployed instance unaffected |
| `[3.9-5]` No rollback | Verify no rollback mechanism exists; only current deployed state tracked |
| `[CD-DM-5]` Operation lock rejects | Send two concurrent deploys for same instance -> verify second rejected |
| `[CD-DM-7]` Invalid transitions | Attempt enable on already-enabled instance -> verify rejection; attempt disable on not-deployed -> verify rejection |
| `[CD-SF-3]` Permanent failures not buffered | Submit message with permanent failure classification -> verify not buffered, error returned to caller |
| `[KDD-sf-3]` Messages survive deletion | Delete instance -> verify S&F messages not cleared |
### Failover & Resilience Tests
| Scenario | Test |
|----------|------|
| Mid-deploy central failover | Deploy in progress -> kill central active -> verify deployment treated as failed -> re-query site state -> re-deploy succeeds |
| Mid-deploy site failover | Deploy in progress -> kill site active -> verify deployment times out or fails -> re-deploy to new active succeeds |
| Timeout + reconciliation | Deploy sent -> site applies but response lost -> central times out -> central queries site state -> finds "already applied" -> updates status |
| S&F buffer takeover | Buffer messages on active -> kill active -> standby takes over -> verify messages delivered from replicated copy |
| S&F replication gap | Buffer message -> immediately kill active (before replication) -> verify standby handles gap gracefully (missed message, no crash) |
| Site offline then online | Deploy to offline site -> fails -> site comes online -> re-deploy succeeds |
| System-wide artifact partial failure | Deploy artifacts to 3 sites, 1 offline -> verify 2 succeed -> retry failed site when online |
---
## Verification Gate
Phase 3C is complete when **all** of the following pass:
1. **Deployment pipeline end-to-end**: Central validates, flattens, sends, site compiles, creates actors, reports success. Status tracked in config DB.
2. **Idempotency**: Duplicate deployment ID returns "already applied." Stale revision hash rejected.
3. **Operation lock**: Concurrent operations on same instance rejected; parallel operations on different instances succeed.
4. **State transitions**: All valid transitions work; all invalid transitions rejected.
5. **Site-side atomicity**: Compilation failure rejects entire deployment; previous config unchanged.
6. **Lifecycle commands**: Disable/enable/delete work correctly with proper state effects.
7. **S&F buffering**: Messages buffered on transient failure, retried at fixed interval, parked after max retries.
8. **S&F replication**: Buffer operations replicated to standby; failover resumes delivery.
9. **Parked message management**: Central can query, retry, and discard parked messages at sites.
10. **S&F survival**: Messages persist through instance deletion and continue delivery.
11. **System-wide artifacts**: Deployed to all sites with per-site status; version skew tolerated.
12. **Resilience**: Mid-deploy failover, timeout+reconciliation, and S&F takeover tests pass.
13. **Audit logging**: All deployment and lifecycle actions recorded via IAuditService.
14. **All unit, integration, negative, and failover tests pass.**
---
## Open Questions
| # | Question | Context | Impact | Status |
|---|----------|---------|--------|--------|
| Q-P3C-1 | Should S&F retry timers be reset on failover or continue from the last known retry timestamp? | On failover, the new active node loads buffer from SQLite. Messages have `last_attempt_at` timestamps. Should retry timing continue relative to `last_attempt_at` or reset to "now"? | Affects retry behavior immediately after failover. Recommend: continue from `last_attempt_at` to avoid burst retries. | Open |
| Q-P3C-2 | What is the maximum number of parked messages returned in a single remote query? | Communication Layer pattern 8 uses 30s timeout. Very large parked message sets may need pagination. | Recommend: paginated query (e.g., 100 per page) consistent with Site Event Logging pagination pattern. | Open |
| Q-P3C-3 | Should the per-instance operation lock be in-memory (lost on central failover) or persisted? | In-memory is simpler and consistent with "in-progress deployments treated as failed on failover." Persisted lock could cause orphan locks. | Recommend: in-memory. On failover, all locks released. Site state query resolves any ambiguity. | Open |
---
## Orphan Check Result
### Forward Check (Requirements -> Work Packages)
Every item in the Requirements Checklist and Design Constraints Checklist was walked. Results:
| Checklist Item | Mapped To | Verified |
|---|---|---|
| `[1.3-1]` through `[1.3-7]` | WP-9, WP-10, WP-11 | Yes |
| `[1.4-1]` through `[1.4-4]` | WP-1, WP-5 | Yes |
| `[1.5-1]` through `[1.5-3]` | WP-7 | Yes |
| `[3.8.1-1]` through `[3.8.1-8]` | WP-4, WP-6, WP-12, WP-13 | Yes |
| `[3.9-1]`, `[3.9-2]`, `[3.9-3]`, `[3.9-5]`, `[3.9-6]` | WP-1, WP-8, WP-16 | Yes |
| `[3.9-4]` | WP-8 (staleness detection); diff UI deferred to Phase 6 | Yes |
| `[5.3-1]` through `[5.3-5]` | WP-10 | Yes |
| `[5.4-1]` through `[5.4-4]` | WP-12 | Yes |
| `[6.4-1]` through `[6.4-4]` | WP-10 | Yes |
| `[KDD-deploy-6]` | WP-2 | Yes |
| `[KDD-deploy-7]` | WP-3 | Yes |
| `[KDD-deploy-8]` | WP-5 | Yes |
| `[KDD-deploy-9]` | WP-7 | Yes |
| `[KDD-deploy-11]` | WP-4, WP-16 | Yes |
| `[KDD-sf-1]` | WP-10 | Yes |
| `[KDD-sf-2]` | WP-11 | Yes |
| `[KDD-sf-3]` | WP-6, WP-12, WP-13 | Yes |
| `[KDD-sf-4]` | WP-15 | Yes |
| `[CD-DM-1]` through `[CD-DM-18]` | WP-1 through WP-8 | Yes |
| `[CD-SF-1]` through `[CD-SF-12]` | WP-9 through WP-14 | Yes |
| `[CD-SR-1]` through `[CD-SR-6]` | WP-5, WP-6 | Yes |
| `[CD-COM-1]` through `[CD-COM-8]` | WP-1, WP-2, WP-6, WP-7, WP-12 | Yes |
**Forward check result: PASS — no orphan requirements.**
### Reverse Check (Work Packages -> Requirements)
Every work package traces to at least one requirement or design constraint:
| Work Package | Traces To |
|---|---|
| WP-1 | `[1.4-1]`, `[1.4-3]`, `[1.4-4]`, `[3.9-1]`, `[3.9-3]`, `[3.9-5]`, `[CD-DM-1]`, `[CD-DM-9]`, `[CD-DM-10]`, `[CD-COM-1]`, `[CD-COM-4]` |
| WP-2 | `[KDD-deploy-6]`, `[CD-DM-2]`, `[CD-DM-3]`, `[CD-DM-4]`, `[CD-COM-7]` |
| WP-3 | `[KDD-deploy-7]`, `[CD-DM-5]`, `[CD-DM-6]` |
| WP-4 | `[CD-DM-7]`, `[CD-DM-9]`, `[KDD-deploy-11]`, `[3.8.1-1]`, `[3.8.1-2]` |
| WP-5 | `[KDD-deploy-8]`, `[CD-DM-14]`, `[CD-SR-1]`, `[CD-SR-2]`, `[CD-SR-6]`, `[1.4-2]` |
| WP-6 | `[3.8.1-1]` through `[3.8.1-8]`, `[KDD-sf-3]`, `[CD-DM-16]` through `[CD-DM-18]`, `[CD-SR-3]` through `[CD-SR-5]`, `[CD-COM-2]`, `[CD-COM-5]`, `[CD-COM-7]` |
| WP-7 | `[1.5-1]` through `[1.5-3]`, `[KDD-deploy-9]`, `[CD-DM-8]`, `[CD-DM-13]`, `[CD-DM-15]`, `[CD-COM-3]`, `[CD-COM-6]` |
| WP-8 | `[3.9-2]`, `[3.9-4]`, `[CD-DM-11]`, `[CD-DM-12]` |
| WP-9 | `[1.3-1]`, `[1.3-3]`, `[1.3-6]`, `[CD-SF-1]`, `[CD-SF-4]`, `[CD-SF-11]` |
| WP-10 | `[1.3-5]`, `[1.3-7]`, `[5.3-1]` through `[5.3-5]`, `[6.4-1]` through `[6.4-4]`, `[KDD-sf-1]`, `[CD-SF-2]`, `[CD-SF-3]`, `[CD-SF-12]` |
| WP-11 | `[1.3-2]`, `[1.3-4]`, `[1.3-5]`, `[KDD-sf-2]`, `[CD-SF-5]`, `[CD-SF-6]`, `[CD-SF-7]` |
| WP-12 | `[5.4-1]` through `[5.4-4]`, `[KDD-sf-3]`, `[CD-SF-8]`, `[CD-SF-9]`, `[CD-SF-10]`, `[CD-COM-8]`, `[3.8.1-6]` |
| WP-13 | `[3.8.1-4]`, `[3.8.1-6]`, `[KDD-sf-3]`, `[CD-SF-10]` |
| WP-14 | `[CD-SF-1]`, docs/requirements/Component-StoreAndForward.md Dependencies |
| WP-15 | `[KDD-sf-4]`, `[CD-SF-7]` |
| WP-16 | `[3.9-6]`, `[KDD-deploy-11]` |
**Reverse check result: PASS — no untraceable work packages.**
### Split-Section Check
| Section | Phase 3C Covers | Other Phase Covers | Gap? |
|---|---|---|---|
| 1.4 | `[1.4-1]` through `[1.4-4]` (all bullets — backend pipeline) | Phase 6: deployment UI triggers and status display | No gap |
| 1.5 | `[1.5-1]` through `[1.5-3]` (all bullets — backend pipeline) | Phase 6: artifact deployment UI | No gap |
| 3.8.1 | `[3.8.1-1]` through `[3.8.1-8]` (all bullets — backend commands) | Phase 4: lifecycle command UI | No gap |
| 3.9 | `[3.9-1]`, `[3.9-2]`, `[3.9-3]`, `[3.9-5]`, `[3.9-6]` | Phase 6: `[3.9-4]` (diff view UI), deployment trigger UI | No gap |
| 5.3 | `[5.3-1]` through `[5.3-5]` (S&F engine) | Phase 7: External System Gateway delivery integration, error classification | No gap |
| 5.4 | `[5.4-1]` through `[5.4-4]` (backend query/command handling) | Phase 6: parked message management UI | No gap |
| 6.4 | `[6.4-1]` through `[6.4-4]` (S&F engine) | Phase 7: Notification Service delivery integration | No gap |
**Split-section check result: PASS — no unowned bullets.**
### Negative Requirement Check
| Negative Requirement | Acceptance Criterion | Adequate? |
|---|---|---|
| `[1.3-1]` Central does not buffer | Test verifies no S&F infrastructure on central; unreachable site = immediate failure | Yes |
| `[1.3-6]` No maximum buffer size | Test submits messages continuously, verifies no count-based rejection | Yes |
| `[3.8.1-6]` S&F messages not cleared on deletion | Test deletes instance, verifies messages still exist and deliver | Yes |
| `[3.8.1-7]` Delete fails if unreachable | Test attempts delete to offline site, verifies failure and central status unchanged | Yes |
| `[3.8.1-8]` Templates cannot be deleted with references | Test attempts deletion of referenced template, verifies rejection | Yes |
| `[3.9-1]` Changes not auto-propagated | Test changes template, verifies deployed instance unchanged | Yes |
| `[3.9-5]` No rollback | Verifies no rollback mechanism; only current state tracked | Yes |
| `[CD-SF-3]` Permanent failures not buffered | Test submits permanent failure, verifies not queued | Yes |
**Negative requirement check result: PASS — all prohibitions have verification criteria.**
---
## Codex MCP Verification
**Model**: gpt-5.4
**Result**: Pass with corrections
### Step 1 — Requirements Coverage Review
Codex identified 10 findings. Disposition:
| # | Finding | Disposition |
|---|---------|-------------|
| 1 | Naming collision detection and device tag resolution exclusion missing from WP-1 | **Corrected** — added naming collision detection to WP-1 acceptance criteria; added explicit exclusion of device tag resolution. |
| 2 | Shared script pre-compilation validation missing from WP-7 | **Corrected** — added shared script validation acceptance criterion to WP-7. |
| 3 | Role overlap (user may hold both Design+Deployment) not verified | **Dismissed** — this is a Phase 1 Security & Auth concern. Phase 3C assumes the auth model works correctly. Role overlap is tested in Phase 1 integration tests. |
| 4 | WP-4 traces [3.8.1-2] but doesn't verify runtime activation | **Dismissed** — WP-4 owns the state transition matrix. Runtime behavior of "enabled" (subscriptions, triggers, alarms running) is the responsibility of Phase 3B Site Runtime, which creates Instance Actors with full initialization. WP-6 verifies enable recreates the actor. |
| 5 | Enable flow underspecified (should verify actor recreation with subscriptions) | **Corrected** — expanded WP-6 enable criteria to explicitly verify actor creation, subscription restoration, script triggers, and alarm evaluation. |
| 6 | Command ID described as "correlation" but source says "deduplication" | **Corrected** — changed wording to "deduplication" with acceptance criterion that duplicate commands are recognized and not re-applied. |
| 7 | Disable/enable unreachable failure not explicitly covered | **Corrected** — added acceptance criterion that disable and enable fail immediately if site unreachable. |
| 8 | Diff "show" requirement only partially verified (compute, not expose) | **Dismissed** — Phase 3C provides the backend API for diff computation and staleness detection. The "show" (UI) aspect is explicitly deferred to Phase 6 per the split-section note. WP-8 correctly scopes to backend. |
| 9 | Parked message management UI not verified | **Dismissed** — same as #8. Phase 3C builds the site-side backend (query handler, retry/discard commands). Phase 6 builds the central UI. Split documented in plan. |
| 10 | "near-complete copy" weakens HighLevelReqs "seamless" wording | **Corrected** — updated WP-11 to reference [1.3-4] for the seamless takeover requirement, with a note that [CD-SF-7] acknowledges the async replication trade-off (rare duplicates/misses). The component design explicitly documents this as an acceptable trade-off; HighLevelReqs 1.3 bullet 4 does not preclude it since "seamlessly" refers to the takeover process, not data completeness. |
### Step 2 — Negative Requirement Review
Not submitted separately; negative requirements were included in Step 1 review. All negative requirements have adequate acceptance criteria per the orphan check.
### Step 3 — Split-Section Gap Review
Not submitted separately; split sections were documented in the plan and reviewed in Step 1. No gaps identified.