CBDDC/docs/deployment.md
Joseph Doherty ce727eb30d
docs: align internal docs to enterprise standards
Add canonical operations/security/access/feature docs and fix path integrity to improve onboarding and incident readiness.
2026-02-20 13:23:55 -05:00

Deployment

This document defines the canonical CBDDC deployment workflow, validation gates, and rollback procedures.

Environments

  • Local development: single node for functional verification.
  • Staging: multi-node mesh or hosted mode with production-like configuration.
  • Production: controlled rollout with health checks, monitoring, and rollback readiness.

Promotion Workflow

  1. Build and test in CI (dotnet build, dotnet test).
  2. Validate configuration and secrets for the target environment.
  3. Deploy to staging.
  4. Run smoke checks:
    • Node starts and joins expected peers.
    • /health reports healthy.
    • Sync operations replicate across target collections.
  5. Promote to production within an approved change window.
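
The health smoke check in step 4 can be automated with a small polling loop. The sketch below is a minimal illustration, not part of CBDDC: it assumes /health answers HTTP 200 with a plain-text body of "Healthy" when the node is ready, and the names `is_healthy`, `wait_for_healthy`, and `http_probe` are hypothetical. Adjust the contract to whatever the deployed health endpoint actually returns.

```python
import time
import urllib.error
import urllib.request
from typing import Callable, Tuple

# Hypothetical probe signature: returns (HTTP status code, response body).
Probe = Callable[[], Tuple[int, str]]


def is_healthy(status: int, body: str) -> bool:
    """Assumed contract: HTTP 200 with a body that reads 'Healthy'."""
    return status == 200 and body.strip().lower() == "healthy"


def http_probe(url: str, timeout_s: float = 3.0) -> Probe:
    """Build a probe that issues a GET against the given health URL."""
    def probe() -> Tuple[int, str]:
        try:
            with urllib.request.urlopen(url, timeout=timeout_s) as resp:
                return resp.status, resp.read().decode("utf-8", "replace")
        except urllib.error.HTTPError as err:
            # Non-2xx responses (for example 503) arrive as HTTPError.
            return err.code, err.read().decode("utf-8", "replace")
    return probe


def wait_for_healthy(probe: Probe, attempts: int = 5, delay_s: float = 2.0,
                     sleep: Callable[[float], None] = time.sleep) -> bool:
    """Poll the probe until it reports healthy or the attempts run out."""
    for attempt in range(1, attempts + 1):
        try:
            status, body = probe()
        except OSError:
            # Connection refused or timeouts are expected while a node starts.
            status, body = 0, ""
        if is_healthy(status, body):
            return True
        if attempt < attempts:
            sleep(delay_s)
    return False
```

A staging pipeline could call `wait_for_healthy(http_probe("http://<node>/health"))` for each node and fail the promotion if any call returns `False`. The probe is passed in as a parameter so the retry logic can be exercised without a running node.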

Validation Gates

  • No failed test suites.
  • No unresolved critical incidents.
  • Health check status is healthy, or is an approved degraded state with a documented mitigation.
  • Backup/restore path confirmed for the active persistence provider.
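
The gates above amount to a conjunction of four conditions, which makes them easy to encode as a single check that reports which gates failed. The following is a sketch under stated assumptions: the `GateInputs` fields and the gate labels are illustrative names, and how each value is collected (CI results, incident tracker, health endpoint, backup drill) is left to the pipeline.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class GateInputs:
    # All field names are illustrative, not part of CBDDC.
    failed_test_suites: int
    open_critical_incidents: int
    health_status: str             # "healthy", "degraded", or "unhealthy"
    degraded_approved: bool        # degraded state approved with mitigation
    backup_restore_verified: bool  # for the active persistence provider


def failed_gates(g: GateInputs) -> List[str]:
    """Return the labels of gates that block promotion (empty means pass)."""
    failures: List[str] = []
    if g.failed_test_suites > 0:
        failures.append("tests")
    if g.open_critical_incidents > 0:
        failures.append("incidents")
    health_ok = g.health_status == "healthy" or (
        g.health_status == "degraded" and g.degraded_approved
    )
    if not health_ok:
        failures.append("health")
    if not g.backup_restore_verified:
        failures.append("backup")
    return failures
```

Returning the list of failed gates, rather than a single boolean, lets the pipeline log exactly why a promotion was blocked.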

Rollback Triggers

Rollback immediately when any of the following occurs:

  • Sync correctness regression (missed or duplicate replication events).
  • Persistent unhealthy status after remediation attempt.
  • Authentication or connectivity failure across required peers.
  • Data integrity concern reported by validation checks.
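
To make the first trigger concrete, missed and duplicate replication events can be detected by comparing the events emitted at the source with those applied on a replica. This sketch assumes every replication event carries a unique identifier and that both sides can export their identifiers for a given window. Neither assumption is confirmed by this document, and `diff_replication` is a hypothetical helper.

```python
from collections import Counter
from typing import Iterable, List, NamedTuple


class ReplicationDiff(NamedTuple):
    missed: List[str]      # emitted at the source, never applied on the replica
    duplicated: List[str]  # applied on the replica more than once


def diff_replication(source_ids: Iterable[str],
                     replica_ids: Iterable[str]) -> ReplicationDiff:
    """Compare source event IDs against the IDs applied on one replica."""
    source = set(source_ids)
    applied = Counter(replica_ids)
    missed = sorted(source - set(applied))
    duplicated = sorted(i for i, n in applied.items() if n > 1)
    return ReplicationDiff(missed, duplicated)
```

A non-empty `missed` or `duplicated` list for any required peer would count as a sync correctness regression under the trigger above. IDs that appear only on the replica are ignored here and would need a separate check.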

Rollback Procedure

  1. Stop traffic or disable rollout to newly deployed nodes.
  2. Revert to the last known-good build.
  3. Restore previous configuration and secret set.
  4. If needed, restore persistence snapshot/backup.
  5. Re-run health and replication smoke checks.
  6. Record incident details in the runbook timeline.
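
The procedure is an ordered sequence in which a failed step should halt the run and be recorded. A minimal runner along those lines is sketched below. The step callables are placeholders for whatever tooling performs each action, and stopping at the first failure is a design choice of this sketch, not a documented CBDDC behavior.

```python
from datetime import datetime, timezone
from typing import Callable, List, Tuple

# A step is a label plus a callable that returns True on success.
Step = Tuple[str, Callable[[], bool]]


def run_rollback(steps: List[Step]) -> Tuple[bool, List[str]]:
    """Run steps in order, stop at the first failure, and keep a timeline."""
    timeline: List[str] = []
    for name, action in steps:
        stamp = datetime.now(timezone.utc).isoformat(timespec="seconds")
        try:
            ok = action()
        except Exception as exc:
            timeline.append(f"{stamp} {name}: error ({exc})")
            return False, timeline
        timeline.append(f"{stamp} {name}: {'ok' if ok else 'failed'}")
        if not ok:
            return False, timeline
    return True, timeline
```

The caller builds the step list in the order given above and leaves out the optional snapshot restore when it is not needed. The returned timeline entries are UTC-stamped, so they can be copied into the runbook timeline in step 6.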

Emergency Changes

Use emergency deployment only for incident containment:

  1. Capture incident reference and approver.
  2. Apply a minimally scoped change.
  3. Run abbreviated smoke checks (startup, health, critical replication path).
  4. Follow up with standard post-incident review.