docs: align internal docs to enterprise standards

Add canonical operations/security/access/feature docs and fix path integrity to improve onboarding and incident readiness.
Joseph Doherty
2026-02-20 13:23:55 -05:00
parent e6d81f6350
commit ce727eb30d
18 changed files with 783 additions and 186 deletions

docs/deployment.md Normal file

@@ -0,0 +1,61 @@
# Deployment
This document defines the canonical CBDDC deployment workflow, validation gates, and rollback procedures.
## Environments
- Local development: single node for functional verification.
- Staging: multi-node mesh or hosted mode with production-like configuration.
- Production: controlled rollout with health checks, monitoring, and rollback readiness.
## Promotion Workflow
1. Build and test in CI (`dotnet build`, `dotnet test`).
2. Validate configuration and secrets for the target environment.
3. Deploy to staging.
4. Run smoke checks:
   - Node starts and joins expected peers.
   - `/health` reports healthy.
   - Sync operations replicate across target collections.
5. Promote to production during an approved change window.
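
The `/health` smoke check in step 4 could be automated along the following lines. This is a minimal sketch, not the project's tooling: it assumes that an HTTP 200 from `/health` means healthy, and the function names, retry count, and delay are hypothetical.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def http_status(url: str, timeout: float = 5.0) -> int:
    """Return the HTTP status code for a GET request, or 0 if unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as exc:
        # Non-2xx responses (for example 503) still carry a status code.
        return exc.code
    except (urllib.error.URLError, OSError):
        return 0


def wait_until_healthy(
    base_url: str,
    attempts: int = 10,
    delay_seconds: float = 3.0,
    fetch_status: Callable[[str], int] = http_status,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll <base_url>/health until it returns 200 or attempts run out.

    Assumption: HTTP 200 means healthy. Adjust if the endpoint reports
    state in the response body instead.
    """
    url = base_url.rstrip("/") + "/health"
    for attempt in range(1, attempts + 1):
        if fetch_status(url) == 200:
            return True
        if attempt < attempts:
            sleep(delay_seconds)
    return False
```

A staging job might call `wait_until_healthy("http://<node-address>")` for each newly deployed node and fail the promotion if it returns `False`. The `fetch_status` and `sleep` parameters exist only so the polling logic can be exercised without a live node.
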
## Validation Gates
- No failed test suites.
- No unresolved critical incidents.
- Health check status is healthy, or is in an approved degraded state with documented mitigation.
- Backup/restore path confirmed for the active persistence provider.
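
The gates above can be expressed as a single go/no-go check. The sketch below is illustrative only: the `GateInputs` fields and the `"healthy"`/`"degraded"`/`"unhealthy"` labels are assumptions, and how each value is collected is left to the release pipeline.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GateInputs:
    """Hypothetical snapshot of release state; field names are illustrative."""
    failed_test_suites: int
    unresolved_critical_incidents: int
    health_status: str  # assumed labels: "healthy", "degraded", "unhealthy"
    degraded_state_approved: bool  # degraded state approved with mitigation
    backup_restore_confirmed: bool


def failed_gates(inputs: GateInputs) -> list[str]:
    """Return the names of gates that block promotion (empty means go)."""
    failures = []
    if inputs.failed_test_suites > 0:
        failures.append("failed test suites")
    if inputs.unresolved_critical_incidents > 0:
        failures.append("unresolved critical incidents")
    healthy = inputs.health_status == "healthy"
    approved_degraded = (
        inputs.health_status == "degraded" and inputs.degraded_state_approved
    )
    if not (healthy or approved_degraded):
        failures.append("health check status")
    if not inputs.backup_restore_confirmed:
        failures.append("backup/restore path")
    return failures
```

Returning the list of failed gates, instead of a bare boolean, lets the pipeline report exactly which gate blocked the promotion.
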
## Rollback Triggers
Rollback immediately when any of the following occurs:
- Sync correctness regression (missed or duplicate replication events).
- Persistent unhealthy status after a remediation attempt.
- Authentication or connectivity failure across required peers.
- Data integrity concern reported by validation checks.
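
One way to detect the first trigger (missed or duplicate replication events) is to compare the event identifiers written on a source node with those observed on a peer. This is a sketch under stated assumptions: it presumes each replicated event carries a unique string identifier, which this document does not specify, and the function name is hypothetical.

```python
from collections import Counter
from typing import Iterable


def replication_anomalies(
    sent_ids: Iterable[str], received_ids: Iterable[str]
) -> tuple[list[str], list[str]]:
    """Return (missed, duplicated) event identifiers.

    missed: sent by the source but never observed on the peer.
    duplicated: observed on the peer more than once.
    """
    sent = set(sent_ids)
    received = Counter(received_ids)
    missed = sorted(sent - set(received))
    duplicated = sorted(
        event_id for event_id, count in received.items() if count > 1
    )
    return missed, duplicated
```

If either returned list is non-empty after replication has had time to settle, the result corresponds to a sync correctness regression as described above.
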
## Rollback Procedure
1. Stop traffic or disable rollout to newly deployed nodes.
2. Revert to the last known-good build.
3. Restore the previous configuration and secret set.
4. If needed, restore the persistence snapshot or backup.
5. Re-run health and replication smoke checks.
6. Record incident details in the runbook timeline.
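
The procedure above is ordered, so a small runner can execute the steps in sequence and keep a record for the runbook timeline. The sketch below is hypothetical: each step is a caller-supplied function returning `True` on success, and stopping at the first failed step is an assumption of this example, not a rule stated in this document.

```python
from typing import Callable

# A step is a (description, action) pair; the action returns True on success.
Step = tuple[str, Callable[[], bool]]


def run_rollback(steps: list[Step]) -> list[str]:
    """Run rollback steps in order and return timeline entries.

    Assumption: the run stops at the first failed step so an operator can
    intervene before later steps execute.
    """
    timeline = []
    for name, action in steps:
        succeeded = action()
        timeline.append(f"{name}: {'ok' if succeeded else 'FAILED'}")
        if not succeeded:
            break
    return timeline
```

The returned entries can be pasted into the incident record for step 6.
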
## Emergency Changes
Use emergency deployment only for incident containment:
1. Capture the incident reference and approver.
2. Apply a minimally scoped change.
3. Run abbreviated smoke checks (startup, health, critical replication path).
4. Follow up with the standard post-incident review.
## Related Documents
- [Deployment Modes](deployment-modes.md)
- [Deployment (LAN)](deployment-lan.md)
- [Production Hardening](production-hardening.md)
- [Runbook](runbook.md)