# Phase 6: Deployment Operations & Troubleshooting UI

**Date**: 2026-03-16
**Status**: Plan complete
**Goal**: Complete the operational loop — deploy, diagnose, troubleshoot from central.

---

## Scope

**Components**: Central UI (deployment + troubleshooting workflows)

**Features**:
- Staleness indicators (revision hash comparison)
- Diff view (added/removed/changed)
- Deploy with pre-validation gating
- Deployment status tracking (live SignalR)
- System-wide artifact deployment with per-site status matrix
- Debug view (instance selection, snapshot + live stream via SignalR)
- Site event log viewer (remote query with filters, pagination, keyword search)
- Parked message management (query, retry, discard)
- Audit log viewer (query with filters)

---

## Prerequisites

| Phase | What must be complete |
|-------|-----------------------|
| Phase 1 | Central UI Blazor Server shell, login, route protection, Security & Auth, Configuration Database, IAuditService |
| Phase 2 | Template Engine: flattening, diff calculation, validation, revision hashing |
| Phase 3A | Cluster Infrastructure, Site Runtime Deployment Manager singleton |
| Phase 3B | Communication Layer (all 8 patterns), Health Monitoring, Site Event Logging, site-wide Akka stream |
| Phase 3C | Deployment Manager (full pipeline), Store-and-Forward Engine (full) |
| Phase 4 | Operator/Admin UI: health dashboard, instance list, deployment status view (basic) |
| Phase 5 | Design-time authoring UI (templates, instances, definitions) |

---

## Requirements Checklist

### Section 1.4 — Deployment Behavior (UI portion)

- [ ] `[1.4-1-ui]` Site applies config immediately upon receipt — deployment status reflects this (no confirmation step in UI)
- [ ] `[1.4-3-ui]` Site reports back success/failure — UI shows deployment result
- [ ] `[1.4-4-ui]` Pre-deployment validation runs before deployment — UI displays validation errors and blocks deployment

### Section 1.5 — System-Wide Artifact Deployment (UI portion)

- [ ] `[1.5-1-ui]` Changes not automatically propagated — UI shows separate "Deploy Artifacts" action
- [ ] `[1.5-2-ui]` Deployment requires explicit action by Deployment role — UI enforces role check
- [ ] `[1.5-3-ui]` Design role manages definitions; Deployment role triggers deployment — clear separation in UI

### Section 3.9 — Template Deployment & Change Propagation (UI portion)

- [ ] `[3.9-1-ui]` Template changes not automatically propagated — staleness indicators show which instances are out of date
- [ ] `[3.9-2-ui]` Two views: deployed vs. template-derived — UI enables comparison
- [ ] `[3.9-3-ui]` Deployment at individual instance level — UI provides per-instance deploy action
- [ ] `[3.9-4-ui]` Show differences between deployed and template-derived config — diff view
- [ ] `[3.9-5-ui]` No rollback — UI does not offer rollback action

### Section 5.4 — Parked Message Management (UI portion)

- [ ] `[5.4-1-ui]` Parked messages stored at site — UI queries sites remotely
- [ ] `[5.4-2-ui]` Central UI can query sites for parked messages — query UI
- [ ] `[5.4-3-ui]` Operators can retry or discard parked messages — action buttons
- [ ] `[5.4-4-ui]` Covers external system calls, notifications, and cached database writes — all three categories shown

### Section 8 — Central UI (deployment + troubleshooting workflows, Phase 6 owns)

- [ ] `[8-deploy-1]` Deployment: View diffs between deployed and current template-derived configurations
- [ ] `[8-deploy-2]` Deployment: Deploy updates to individual instances
- [ ] `[8-deploy-3]` Deployment: Filter instances by area
- [ ] `[8-deploy-4]` Deployment: Pre-deployment validation runs automatically — errors block deployment
- [ ] `[8-deploy-5]` System-Wide Artifact Deployment: explicitly deploy shared scripts, external system definitions, DB connection definitions, notification lists to all sites
- [ ] `[8-deploy-6]` Deployment Status Monitoring: Track deployment success/failure at site level
- [ ] `[8-deploy-7]` Parked Message Management: Query sites, view details, retry or discard
- [ ] `[8-deploy-8]` Site Event Log Viewer: Query and view operational event logs from sites

### Section 8.1 — Debug View

- [ ] `[8.1-1]` Subscribe-on-demand — central subscribes to site-wide Akka stream filtered by instance
- [ ] `[8.1-2]` Site provides initial snapshot of all current attribute values and alarm states
- [ ] `[8.1-3]` Attribute value stream: [InstanceUniqueName].[AttributePath].[AttributeName], value, quality, timestamp
- [ ] `[8.1-4]` Alarm state stream: [InstanceUniqueName].[AlarmName], state (active/normal), priority, timestamp
- [ ] `[8.1-5]` Stream continues until engineer closes debug view — central unsubscribes
- [ ] `[8.1-6]` No attribute/alarm selection — always shows all for the instance
- [ ] `[8.1-7]` No special concurrency limits required

### Section 10.1–10.3 — Audit Log (UI portion)

- [ ] `[10.1-ui]` Audit logs stored in config DB — UI queries config DB
- [ ] `[10.2-ui]` All system-modifying actions logged — viewer covers all categories
- [ ] `[10.3-ui]` Each entry: who, what (action, entity type, entity ID, entity name), when, state (JSON after-state) — UI displays all fields
- [ ] `[10.3-2-ui]` Change history reconstructed by comparing consecutive entries — UI shows before/after by comparing entries

### Section 12.3 — Central Access to Event Logs

- [ ] `[12.3-1]` Central UI can query site event logs remotely via Communication Layer
- [ ] `[12.3-2]` Queries support filtering by event type, time range, instance, severity, keyword search
- [ ] `[12.3-3]` Results are paginated (default 500 per page) with continuation token

---

## Design Constraints Checklist

| ID | Constraint | Source | Mapped WP |
|----|-----------|--------|-----------|
| KDD-ui-1 | Blazor Server (ASP.NET Core + SignalR), Bootstrap, clean corporate design | CLAUDE.md | All WPs |
| KDD-ui-2 | Real-time push for debug view, health dashboard, deployment status | CLAUDE.md | WP-4, WP-6 |
| KDD-deploy-5 | Flattened configs include revision hash for staleness detection | CLAUDE.md | WP-1 |
| KDD-deploy-9 | System-wide artifact version skew across sites supported | CLAUDE.md | WP-5 |
| KDD-deploy-11 | Optimistic concurrency on deployment status records | CLAUDE.md | WP-4 |
| CD-DM-1 | Diff shows added/removed/changed attributes, alarms, scripts, connection binding changes | Component-DeploymentManager | WP-2 |
| CD-DM-2 | Per-site result matrix for system-wide artifact deployment; successful sites not rolled back | Component-DeploymentManager | WP-5 |
| CD-DM-3 | Retry failed sites individually after system-wide artifact deployment | Component-DeploymentManager | WP-5 |
| CD-DM-4 | Central UI indicates which sites have pending artifact updates | Component-DeploymentManager | WP-5 |
| CD-COMM-1 | Debug streams lost on failover — must be re-opened by user | Component-Communication | WP-6 |
| CD-COMM-2 | Debug view: subscribe → snapshot → stream → unsubscribe pattern | Component-Communication | WP-6 |
| CD-SEL-1 | Event log queries paginated with continuation token (500/page default) | Component-SiteEventLogging | WP-7 |
| CD-SEL-2 | Keyword search on message and source fields (SQLite LIKE) | Component-SiteEventLogging | WP-7 |
| CD-SEL-3 | Event log filters: event type, time range, instance ID, severity | Component-SiteEventLogging | WP-7 |
| CD-SF-1 | Parked message details: target, payload, retry count, timestamps | Component-StoreAndForward | WP-8 |
| CD-AUD-1 | Audit log filter: user, entity type, action type, time range | Component-CentralUI | WP-9 |
| CD-AUD-2 | Before/after state by comparing consecutive entries | Component-CentralUI | WP-9 |

---

## Work Packages

### WP-1: Staleness Indicators (S)

**Description**: Show which instances have out-of-date deployed configurations by comparing revision hashes.
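
The hash comparison described above can be sketched as follows. This is an illustrative Python sketch only, not the project's implementation (which would live in the .NET stack and reuse the Phase 2 revision hashing). The canonical-JSON plus SHA-256 scheme and the names `revision_hash` and `is_stale` are assumptions made for the example.

```python
import hashlib
import json


def revision_hash(flattened_config: dict) -> str:
    """Hash a flattened config deterministically.

    Assumption: canonical JSON (sorted keys, fixed separators) + SHA-256.
    The real scheme is whatever the Template Engine defines.
    """
    canonical = json.dumps(flattened_config, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def is_stale(deployed_hash: str, current_hash: str) -> bool:
    """An instance is stale when the deployed hash differs from the current one.

    No diff is computed here; only the two hashes are compared.
    """
    return deployed_hash != current_hash


deployed = {"attributes": {"Speed": 10}, "alarms": []}
current = {"attributes": {"Speed": 11}, "alarms": []}
print(is_stale(revision_hash(deployed), revision_hash(current)))
```

Because only two short strings are compared per instance, the instance list can flag staleness without loading or diffing either configuration.
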

**Acceptance Criteria**:
- Instance list shows staleness indicator (e.g., icon/badge) when deployed revision hash differs from current template-derived revision hash (`[3.9-1-ui]`, `KDD-deploy-5`)
- Two views accessible: deployed configuration and template-derived configuration (`[3.9-2-ui]`)
- Staleness detection does not require a full diff — uses revision hash comparison only (`KDD-deploy-5`)
- Filter/sort by staleness state

**Complexity**: S
**Traces**: `[3.9-1-ui]`, `[3.9-2-ui]`, `[3.9-5-ui]`, KDD-deploy-5

---

### WP-2: Diff View (M)

**Description**: Display differences between the deployed configuration and the current template-derived configuration.

**Acceptance Criteria**:
- Diff view shows added, removed, and changed members (attributes, alarms, scripts) (`[3.9-4-ui]`, `[8-deploy-1]`, `CD-DM-1`)
- Connection binding changes shown in diff (`CD-DM-1`)
- Clear visual distinction between additions (new members), removals, and modifications
- Diff calculated on demand when user views it

**Complexity**: M
**Traces**: `[3.9-4-ui]`, `[8-deploy-1]`, CD-DM-1

---

### WP-3: Deploy with Pre-Validation Gating (M)

**Description**: Deploy action on individual instances that automatically runs pre-deployment validation and blocks on errors.
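
The gating behavior described above amounts to "validate first, send only if there are no errors". A minimal Python sketch of that control flow follows. It is illustrative only: `deploy_with_gating`, `DeployResult`, and the `validate` and `send` callbacks are hypothetical names standing in for the real validation and Deployment Manager calls.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class DeployResult:
    sent: bool                      # True only if the deployment was dispatched
    errors: List[str] = field(default_factory=list)


def deploy_with_gating(
    instance_id: str,
    validate: Callable[[str], List[str]],   # hypothetical: returns validation errors
    send: Callable[[str], None],            # hypothetical: dispatches the deployment
) -> DeployResult:
    """Run validation automatically and block the deployment on any error."""
    errors = validate(instance_id)
    if errors:
        # Blocked: nothing is sent; the UI would display these errors.
        return DeployResult(sent=False, errors=list(errors))
    send(instance_id)
    return DeployResult(sent=True)


dispatched: List[str] = []
result = deploy_with_gating("pump-1", lambda _id: ["unbound connection"], dispatched.append)
print(result.sent, result.errors, dispatched)
```

Keeping validation inside the deploy call, rather than as a separate user step, means there is no code path that dispatches an unvalidated configuration.
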

**Acceptance Criteria**:
- Deploy action available per instance (`[3.9-3-ui]`, `[8-deploy-2]`)
- Pre-deployment validation runs automatically before deployment is sent (`[1.4-4-ui]`, `[8-deploy-4]`)
- Validation errors displayed clearly and block the deployment
- Filter instances by site, area, template (`[8-deploy-3]`)
- Site applies config immediately — no confirmation step shown in UI for site side (`[1.4-1-ui]`)
- No rollback action offered (`[3.9-5-ui]`)
- Deployment role required

**Complexity**: M
**Traces**: `[1.4-1-ui]`, `[1.4-4-ui]`, `[3.9-3-ui]`, `[3.9-5-ui]`, `[8-deploy-2]`, `[8-deploy-3]`, `[8-deploy-4]`

---

### WP-4: Deployment Status Tracking (Live SignalR) (M)

**Description**: Real-time deployment status updates pushed to the UI via SignalR.

**Acceptance Criteria**:
- Deployment status (pending, in-progress, success, failed) updates in real-time via SignalR push (`KDD-ui-2`, `[8-deploy-6]`)
- Site reports success/failure — UI reflects result (`[1.4-3-ui]`)
- Optimistic concurrency on status records handled gracefully (`KDD-deploy-11`)
- Status shown per instance with timestamp
- No manual refresh required

**Complexity**: M
**Traces**: `[1.4-3-ui]`, `[8-deploy-6]`, KDD-ui-2, KDD-deploy-11

---

### WP-5: System-Wide Artifact Deployment with Per-Site Status Matrix (L)

**Description**: UI for deploying shared scripts, external system definitions, DB connection definitions, and notification lists to all sites.
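
The per-site status matrix and individual retry listed for this work package (CD-DM-2, CD-DM-3) can be modeled as a simple site-to-status map. The Python sketch below is illustrative only. `deploy_artifacts`, `retry_failed`, and the `deploy_to_site` callback are hypothetical names, and the status strings are assumptions.

```python
from typing import Callable, Dict, Iterable


def deploy_artifacts(
    sites: Iterable[str],
    deploy_to_site: Callable[[str], None],  # hypothetical: raises on failure
) -> Dict[str, str]:
    """Deploy to every site independently and record a per-site result."""
    matrix: Dict[str, str] = {}
    for site in sites:
        try:
            deploy_to_site(site)
            matrix[site] = "success"
        except Exception as exc:
            # A failure at one site does not undo or stop the others.
            matrix[site] = f"failed: {exc}"
    return matrix


def retry_failed(
    matrix: Dict[str, str],
    deploy_to_site: Callable[[str], None],
) -> Dict[str, str]:
    """Retry only the sites that failed; successful sites are left untouched."""
    failed = [site for site, status in matrix.items() if status != "success"]
    updated = dict(matrix)
    updated.update(deploy_artifacts(failed, deploy_to_site))
    return updated
```

Since successful sites are never rolled back, sites may temporarily run different artifact versions; the matrix is what makes that skew visible per site.
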

**Acceptance Criteria**:
- Separate "Deploy Artifacts" action — not automatically triggered when definitions change (`[1.5-1-ui]`, `[8-deploy-5]`)
- Deployment role required (`[1.5-2-ui]`)
- Design role manages definitions; Deployment role triggers deployment — clear separation (`[1.5-3-ui]`)
- Per-site status matrix showing success/failure for each site (`CD-DM-2`)
- Successful sites not rolled back if others fail (`CD-DM-2`)
- Individual site retry for failed sites (`CD-DM-3`)
- UI indicates which sites have pending artifact updates (`CD-DM-4`)
- Cross-site version skew supported — display shows version status per site (`KDD-deploy-9`)

**Complexity**: L
**Traces**: `[1.5-1-ui]`–`[1.5-3-ui]`, `[8-deploy-5]`, KDD-deploy-9, CD-DM-2, CD-DM-3, CD-DM-4

---

### WP-6: Debug View (L)

**Description**: On-demand real-time view of a specific instance's attribute values and alarm states streamed via SignalR.

**Acceptance Criteria**:
- Select a deployed instance and open debug view (`[8.1-1]`)
- Initial snapshot of all current attribute values and alarm states received from site (`[8.1-2]`)
- Attribute value stream formatted as `[InstanceUniqueName].[AttributePath].[AttributeName]`, value, quality, timestamp (`[8.1-3]`)
- Alarm state stream formatted as `[InstanceUniqueName].[AlarmName]`, state, priority, timestamp (`[8.1-4]`)
- Live updates pushed via SignalR — no polling (`KDD-ui-2`)
- Stream continues until user closes the debug view; central unsubscribes on close (`[8.1-5]`)
- All attributes and alarms shown — no selection filtering (`[8.1-6]`)
- No concurrency limits enforced (`[8.1-7]`)
- On failover, debug stream is lost; user must re-open (`CD-COMM-1`)
- Subscribe → snapshot → stream → unsubscribe lifecycle (`CD-COMM-2`)
- Deployment role required

**Complexity**: L
**Traces**: `[8.1-1]`–`[8.1-7]`, KDD-ui-2, CD-COMM-1, CD-COMM-2

---

### WP-7: Site Event Log Viewer (M)

**Description**: UI for querying and viewing operational event logs from site clusters remotely.
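
The query shape required for this viewer (filters, keyword search on message and source, 500 results per page with a continuation token, per `[12.3-2]`, `[12.3-3]`, and CD-SEL-1/2) can be sketched in Python as below. This is an illustration over an in-memory list, not the site-side SQLite query. The `Event` fields, the `query_events` name, and the use of the last returned event ID as the continuation token are all assumptions.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

DEFAULT_PAGE_SIZE = 500  # default page size stated in the plan


@dataclass
class Event:
    id: int          # assumed monotonically increasing per site
    severity: str
    source: str
    message: str


def query_events(
    events: List[Event],
    severity: Optional[str] = None,
    keyword: Optional[str] = None,
    token: Optional[int] = None,
    page_size: int = DEFAULT_PAGE_SIZE,
) -> Tuple[List[Event], Optional[int]]:
    """Return one page of matches and a continuation token (None when done)."""
    last_id = -1 if token is None else token
    needle = keyword.lower() if keyword else None
    matches = [
        e
        for e in sorted(events, key=lambda e: e.id)
        if e.id > last_id
        and (severity is None or e.severity == severity)
        # Case-insensitive substring match on message and source,
        # approximating a LIKE '%keyword%' search.
        and (needle is None or needle in e.message.lower() or needle in e.source.lower())
    ]
    page = matches[:page_size]
    next_token = page[-1].id if len(matches) > page_size else None
    return page, next_token
```

A token based on the last seen ID, rather than a page offset, keeps paging stable while new events continue to be written at the site.
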

**Acceptance Criteria**:
- Remote query to sites via Communication Layer (`[12.3-1]`, `[8-deploy-8]`)
- Filter by event type/category, time range, instance ID, severity (`CD-SEL-3`, `[12.3-2]`)
- Keyword search on message and source fields (`CD-SEL-2`, `[12.3-2]`)
- Paginated results with continuation token support (default 500/page) (`CD-SEL-1`, `[12.3-3]`)
- Display all event categories: script executions (start, complete, error), alarm events (activated, cleared, evaluation errors), deployment events (received, compiled, applied, failed), connection status changes, S&F activity (queued, delivered, retried, parked), instance lifecycle (enable, disable, delete)
- Deployment role required

**Complexity**: M
**Traces**: `[12.3-1]`–`[12.3-3]`, `[8-deploy-8]`, CD-SEL-1, CD-SEL-2, CD-SEL-3

---

### WP-8: Parked Message Management (M)

**Description**: UI for querying, viewing, retrying, and discarding parked messages at sites.

**Acceptance Criteria**:
- Query sites for parked messages remotely (`[5.4-1-ui]`, `[5.4-2-ui]`, `[8-deploy-7]`)
- View message details: target, payload, retry count, timestamps (`CD-SF-1`)
- All three message categories shown: external system calls, notifications, cached database writes (`[5.4-4-ui]`)
- Retry action moves message back to retry queue (`[5.4-3-ui]`)
- Discard action removes message permanently (`[5.4-3-ui]`)
- Deployment role required

**Complexity**: M
**Traces**: `[5.4-1-ui]`–`[5.4-4-ui]`, `[8-deploy-7]`, CD-SF-1

---

### WP-9: Audit Log Viewer (M)

**Description**: UI for querying the central audit log with filters.
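
Because each audit entry stores only the JSON after-state (`[10.3-ui]`), the viewer has to derive the before-state from the previous entry for the same entity (`[10.3-2-ui]`, CD-AUD-2). A minimal Python sketch of that pairing follows. The dictionary keys (`entity_type`, `entity_id`, `timestamp`, `state`) and the function name `with_before_state` are assumptions for illustration, not the actual audit schema.

```python
from typing import Any, Dict, List, Optional, Tuple

Entry = Dict[str, Any]


def with_before_state(entries: List[Entry]) -> List[Tuple[Entry, Optional[Any]]]:
    """Pair each entry with the after-state of the previous entry for the same entity.

    The first entry for an entity has no predecessor, so its before-state is None.
    """
    last_state: Dict[Tuple[str, Any], Any] = {}
    paired: List[Tuple[Entry, Optional[Any]]] = []
    for entry in sorted(entries, key=lambda e: e["timestamp"]):
        key = (entry["entity_type"], entry["entity_id"])
        paired.append((entry, last_state.get(key)))
        last_state[key] = entry["state"]
    return paired


log = [
    {"entity_type": "Template", "entity_id": 1, "timestamp": 1, "state": {"name": "Pump"}},
    {"entity_type": "Template", "entity_id": 1, "timestamp": 2, "state": {"name": "Pump v2"}},
]
for entry, before in with_before_state(log):
    print(before, "->", entry["state"])
```

Keying on entity type together with entity ID avoids pairing entries from different tables that happen to share an ID.
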

**Acceptance Criteria**:
- Query audit log from configuration database (`[10.1-ui]`)
- All system-modifying action categories visible (`[10.2-ui]`)
- Each entry displays: who (user), what (action, entity type, entity ID, entity name), when (timestamp), state (JSON after-state) (`[10.3-ui]`)
- Filter by user, entity type, action type, time range (`CD-AUD-1`)
- Before/after state comparison by viewing consecutive entries for the same entity (`[10.3-2-ui]`, `CD-AUD-2`)
- Admin role required

**Complexity**: M
**Traces**: `[10.1-ui]`–`[10.3-2-ui]`, CD-AUD-1, CD-AUD-2

---

## Test Strategy

### Unit Tests

- Staleness indicator rendering based on revision hash comparison
- Diff view component rendering for added/removed/changed members
- Deployment status SignalR update handling
- Debug view snapshot rendering and stream update handling
- Event log filter building and pagination logic
- Parked message action button state logic
- Audit log filter building and entry rendering

### Integration Tests

- Deploy workflow: view diff → validate → deploy → track status via SignalR → verify success
- Deploy with validation failure → verify deployment blocked
- System-wide artifact deployment → verify per-site status matrix → retry failed site
- Debug view: open → receive snapshot → receive stream updates → close → verify unsubscribe
- Event log viewer: query with filters → paginate → verify results match
- Parked message: query → retry → verify message moves back to queue; query → discard → verify removed
- Audit log: query with filters → verify entries displayed with correct detail

### Negative Tests

- Attempt deploy on instance with validation errors → verify blocked
- No rollback action exists in UI → verify absent
- Non-Deployment user attempts deploy → verify access denied
- Non-Admin user attempts audit log viewer → verify access denied
- Debug view during failover → verify stream lost, user must re-open
- Query event log on unreachable site → verify graceful error

---

## Verification Gate

Phase 6 is complete when:

1. All 9 work packages pass acceptance criteria
2. All unit and integration tests pass
3. All negative tests verify prohibited behaviors
4. A Deployment user can perform a full operational loop: view stale instances → view diff → deploy → track live status → open debug view → view event logs → manage parked messages
5. An Admin user can query the audit log with filters and view change details
6. Real-time features (deployment status, debug view) work via SignalR without polling
7. System-wide artifact deployment shows per-site status matrix with retry capability

---

## Open Questions

No new questions discovered during Phase 6 plan generation.

---

## Split-Section Verification

| Section | Phase 6 Bullets | Other Phase(s) | Other Phase Bullets |
|---------|----------------|-----------------|---------------------|
| 1.4 | `[1.4-1-ui]`, `[1.4-3-ui]`, `[1.4-4-ui]` (UI) | Phase 3C | `[1.4-1]`–`[1.4-4]` backend pipeline |
| 1.5 | `[1.5-1-ui]`–`[1.5-3-ui]` (UI) | Phase 3C | Backend artifact deployment |
| 3.9 | `[3.9-1-ui]`–`[3.9-5-ui]`, `[3.9-6]` in Phase 5 | Phase 3C | Backend pipeline, status persistence |
| 5.4 | `[5.4-1-ui]`–`[5.4-4-ui]` (UI) | Phase 3C | Backend parked message storage and management |
| 8 | `[8-deploy-1]`–`[8-deploy-8]` | Phase 4, 5 | Admin/operator, design workflows |
| 8.1 | `[8.1-1]`–`[8.1-7]` (all) | — | No split (Phase 6 owns entirely) |
| 10.1–10.3 | `[10.1-ui]`–`[10.3-2-ui]` (viewer UI) | Phase 1 | Backend storage, IAuditService, transactional guarantee |
| 12.3 | `[12.3-1]`–`[12.3-3]` (all) | — | No split (Phase 6 owns entirely) |

---

## Orphan Check Result

**Forward check**: Every Requirements Checklist item and Design Constraints Checklist item maps to at least one work package with acceptance criteria that would fail if the requirement were not implemented. PASS.

**Reverse check**: Every work package traces back to at least one requirement or design constraint. No untraceable work. PASS.

**Split-section check**: All split sections verified above. Phase 6 covers UI presentation for deployment/operations workflows. Backend functionality is in Phase 3C (deployment pipeline, S&F) and Phase 1 (audit service). No unassigned bullets found. PASS.

**Negative requirement check**: The following negative requirements have explicit acceptance criteria:
- `[3.9-5-ui]` No rollback — verified in WP-3 (no rollback action offered)
- `[1.5-1-ui]` Not automatically propagated — verified in WP-5 (separate action required)
- `[8.1-7]` No concurrency limits — verified in WP-6

PASS.

**Codex MCP verification**: Skipped — external tool verification deferred.
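
The forward and reverse checks in this section are set comparisons between requirement IDs and work package traces, so they can be re-run mechanically whenever the plan changes. The Python sketch below shows one way to do that. It is illustrative only: `orphan_check` is a hypothetical helper, and the inputs would have to be extracted from the checklists and **Traces** lines by hand or by a separate parser.

```python
from typing import Dict, Iterable, List, Set, Tuple


def orphan_check(
    requirements: Iterable[str],
    work_packages: Dict[str, List[str]],
) -> Tuple[List[str], List[str]]:
    """Return (unmapped requirements, untraceable work packages).

    Forward check: every requirement ID appears in some work package's traces.
    Reverse check: every work package traces to at least one known requirement.
    Both lists are empty when the plan passes.
    """
    required: Set[str] = set(requirements)
    traced: Set[str] = set()
    for traces in work_packages.values():
        traced.update(traces)
    unmapped = sorted(required - traced)
    untraceable = sorted(
        wp for wp, traces in work_packages.items() if not required & set(traces)
    )
    return unmapped, untraceable
```

This only verifies that IDs line up; whether an acceptance criterion would actually fail without the requirement still needs a human read.
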