docs: align internal docs to enterprise standards
All checks were successful
CI / verify (push) Successful in 2m33s
Add canonical operations/security/access/feature docs and fix path integrity to improve onboarding and incident readiness.
61
docs/deployment.md
Normal file
@@ -0,0 +1,61 @@
# Deployment

This document defines the canonical CBDDC deployment workflow, validation gates, and rollback procedures.

## Environments

- Local development: single node for functional verification.
- Staging: multi-node mesh or hosted mode with production-like configuration.
- Production: controlled rollout with health checks, monitoring, and rollback readiness.

## Promotion Workflow

1. Build and test in CI (`dotnet build`, `dotnet test`).
2. Validate configuration and secrets for the target environment.
3. Deploy to staging.
4. Run smoke checks:
   - Node starts and joins expected peers.
   - `/health` reports healthy.
   - Sync operations replicate across target collections.
5. Promote to production during an approved change window.
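
The smoke checks in step 4 lend themselves to a small runner that executes every probe and allows promotion only when all of them pass. The following is a minimal sketch, not part of CBDDC: `SmokeResult`, `run_smoke_checks`, and `may_promote` are hypothetical names, and the lambdas stand in for real peer, `/health`, and replication probes.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class SmokeResult:
    name: str
    passed: bool
    detail: str = ""


def run_smoke_checks(checks: Dict[str, Callable[[], bool]]) -> List[SmokeResult]:
    """Run every check, even after a failure, so the report is complete."""
    results = []
    for name, check in checks.items():
        try:
            passed = bool(check())
            detail = "" if passed else "check returned false"
        except Exception as exc:  # a crashing probe counts as a failed check
            passed, detail = False, f"{type(exc).__name__}: {exc}"
        results.append(SmokeResult(name, passed, detail))
    return results


def may_promote(results: List[SmokeResult]) -> bool:
    """Promotion requires at least one check and no failures."""
    return bool(results) and all(r.passed for r in results)


if __name__ == "__main__":
    # Hypothetical probes: replace each lambda with a real check.
    checks = {
        "node joined expected peers": lambda: True,
        "/health reports healthy": lambda: True,
        "sync replicates across target collections": lambda: True,
    }
    results = run_smoke_checks(checks)
    for r in results:
        print(f"{'PASS' if r.passed else 'FAIL'}  {r.name} {r.detail}".rstrip())
    print("promote" if may_promote(results) else "hold")
```

Running all probes instead of stopping at the first failure gives the approver a complete picture before deciding whether to retry staging or hold the release.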

## Validation Gates

- No failed test suites.
- No unresolved critical incidents.
- Health check status is healthy, or is an approved degraded state with mitigation in place.
- Backup/restore path confirmed for the active persistence provider.
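
The four gates above can be expressed as one function that lists every blocking condition. This is an illustrative sketch under assumptions: the `ReleaseCandidate` fields and the status labels `"healthy"`, `"degraded"`, and `"unhealthy"` are hypothetical and would need to be mapped to whatever the CI system and `/health` endpoint actually report.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ReleaseCandidate:
    failed_test_suites: int
    unresolved_critical_incidents: int
    health_status: str  # assumed labels: "healthy", "degraded", "unhealthy"
    degraded_state_approved: bool  # approved, with a mitigation on record
    backup_restore_confirmed: bool  # for the active persistence provider


def failed_gates(rc: ReleaseCandidate) -> List[str]:
    """Return the gates that block promotion; an empty list means all pass."""
    failures = []
    if rc.failed_test_suites > 0:
        failures.append("failed test suites")
    if rc.unresolved_critical_incidents > 0:
        failures.append("unresolved critical incidents")
    healthy = rc.health_status == "healthy"
    approved_degraded = (
        rc.health_status == "degraded" and rc.degraded_state_approved
    )
    if not (healthy or approved_degraded):
        failures.append("health check status")
    if not rc.backup_restore_confirmed:
        failures.append("backup/restore path not confirmed")
    return failures


if __name__ == "__main__":
    rc = ReleaseCandidate(0, 0, "degraded", True, True)
    print(failed_gates(rc) or "all gates pass")
```

Returning the full list of failures, not a single boolean, makes the gate result usable directly in a change record.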

## Rollback Triggers

Roll back immediately when any of the following occurs:

- Sync correctness regression (missed or duplicate replication events).
- Persistent unhealthy status after a remediation attempt.
- Authentication or connectivity failure across required peers.
- Data integrity concern reported by validation checks.
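
The first trigger, missed or duplicate replication events, is the most mechanical to detect. The sketch below assumes each replication event carries a unique string ID that can be collected on both the sending node and a receiving peer. That is an assumption for illustration only, not a documented CBDDC feature, and `replication_diff` and `sync_regression` are hypothetical helpers.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def replication_diff(
    sent: Iterable[str], applied: Iterable[str]
) -> Tuple[List[str], List[str]]:
    """Compare event IDs sent by the source with those applied on a peer.

    Returns (missed, duplicated): IDs never applied, and IDs applied
    more than once.
    """
    applied_counts = Counter(applied)
    missed = sorted(set(sent) - set(applied_counts))
    duplicated = sorted(eid for eid, n in applied_counts.items() if n > 1)
    return missed, duplicated


def sync_regression(sent: Iterable[str], applied: Iterable[str]) -> bool:
    """True when either kind of discrepancy is present."""
    missed, duplicated = replication_diff(sent, applied)
    return bool(missed or duplicated)


if __name__ == "__main__":
    missed, duplicated = replication_diff(
        ["e1", "e2", "e3"], ["e1", "e3", "e3"]
    )
    print("missed:", missed)          # missed: ['e2']
    print("duplicated:", duplicated)  # duplicated: ['e3']
```

A check like this only compares a finished window of events. In a live mesh, events still in flight would need a grace period before being counted as missed.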

## Rollback Procedure

1. Stop traffic or disable rollout to newly deployed nodes.
2. Revert to the last known-good build.
3. Restore the previous configuration and secret set.
4. If needed, restore the persistence snapshot or backup.
5. Re-run health and replication smoke checks.
6. Record incident details in the runbook timeline.
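
Because the procedure is strictly ordered and ends with a timeline entry, it can be driven by a runner that timestamps each step and stops at the first failure. This is a sketch under assumptions: `run_rollback` is a hypothetical helper, the step actions are placeholders for real operations, and the timeline is a plain list standing in for the runbook.

```python
from datetime import datetime, timezone
from typing import Callable, List, Tuple

Step = Tuple[str, Callable[[], None]]


def run_rollback(steps: List[Step], timeline: List[str]) -> bool:
    """Run steps in order, logging each one; stop at the first failure."""
    for name, action in steps:
        stamp = datetime.now(timezone.utc).isoformat(timespec="seconds")
        try:
            action()
        except Exception as exc:
            timeline.append(f"{stamp} FAILED {name}: {exc}")
            return False
        timeline.append(f"{stamp} done {name}")
    return True


if __name__ == "__main__":
    noop = lambda: None  # placeholder for a real operation
    restore_snapshot = False  # step 4 applies only if needed
    steps = [
        ("stop traffic to new nodes", noop),
        ("revert to last known-good build", noop),
        ("restore previous configuration and secrets", noop),
    ]
    if restore_snapshot:
        steps.append(("restore persistence snapshot", noop))
    steps.append(("re-run health and replication smoke checks", noop))

    timeline: List[str] = []
    ok = run_rollback(steps, timeline)
    print("\n".join(timeline))
    print("rollback complete" if ok else "rollback halted, escalate")
```

Stopping at the first failed step keeps later steps, such as a snapshot restore, from running against a node that is in an unknown state.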

## Emergency Changes

Use emergency deployment only for incident containment:

1. Capture the incident reference and approver.
2. Apply a minimally scoped change.
3. Run abbreviated smoke checks (startup, health, critical replication path).
4. Follow up with the standard post-incident review.

## Related Documents

- [Deployment Modes](deployment-modes.md)
- [Deployment (LAN)](deployment-lan.md)
- [Production Hardening](production-hardening.md)
- [Runbook](runbook.md)