Joseph Doherty 52445078a1 Add enterprise docs structure and include pending core maintenance updates.

2026-02-20 13:28:29 -05:00

Runbook

Purpose

This runbook provides standard operations, incident triage, escalation, and recovery procedures for CBDD maintainers.

Signals And Entry Points

CI failures on main
Failing integration tests in consumer repositories
Regression issues labeled incident
Recovery or corruption reports from consumers

Alert Triage Procedure

Capture incident context: version, environment, failing operation, and first failure timestamp.
Classify severity:

SEV-1: data loss risk, persistent startup failure, or transaction correctness risk.
SEV-2: feature-level regression without confirmed data loss.
SEV-3: non-critical behavior or documentation defects.

Create or update the incident issue with owner and current mitigation status.
Reproduce with targeted tests in tests/CBDD.Tests (path relative to the repository root).
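The severity definitions above can be sketched as a small lookup, for example inside a triage helper script. This is only an illustration: the function name `classify_severity` and the impact keywords it accepts are hypothetical and not part of the CBDD repository.

```shell
#!/bin/sh
# Hypothetical helper: maps an impact keyword to the runbook's severity levels.
# The keywords are illustrative assumptions, not labels defined by CBDD.
classify_severity() {
  case "$1" in
    data-loss|startup-failure|transaction-correctness)
      echo "SEV-1" ;;
    feature-regression)
      echo "SEV-2" ;;
    non-critical|docs)
      echo "SEV-3" ;;
    *)
      # Refuse to guess a severity for an unrecognized impact keyword.
      echo "unknown impact: $1" >&2
      return 1 ;;
  esac
}

classify_severity data-loss
```

Rejecting unrecognized keywords, rather than defaulting them to SEV-3, keeps a mistyped impact from silently downgrading an incident.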

Diagnostics

Validate the build and run the full test suite.

dotnet test CBDD.slnx -c Release

Run the coverage threshold gate when behavior has changed in core paths.

bash scripts/coverage-check.sh

For storage and recovery incidents, prioritize:

StorageEngine.Recovery
WriteAheadLog
transaction protocol tests
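One way to narrow a run to the areas listed above is the `--filter` option of `dotnet test`, which accepts `FullyQualifiedName~<text>` terms joined with `|` (OR). The sketch below only builds and prints such a command. It assumes the relevant test classes or namespaces contain the keywords `Recovery`, `WriteAheadLog`, and `Transaction`, which is a guess about naming rather than something this runbook confirms; `build_filter` is a hypothetical helper.

```shell
#!/bin/sh
# Hypothetical helper: joins keywords into a dotnet test filter expression
# that matches any test whose fully qualified name contains one of them.
build_filter() {
  filter=""
  for keyword in "$@"; do
    if [ -n "$filter" ]; then
      filter="$filter|"
    fi
    filter="${filter}FullyQualifiedName~$keyword"
  done
  printf '%s\n' "$filter"
}

# Keywords are assumptions about test naming; adjust to the actual suite.
expr=$(build_filter Recovery WriteAheadLog Transaction)

# Print the command for review instead of executing it.
echo "dotnet test CBDD.slnx -c Release --filter \"$expr\""
```

Printing the command first makes it easy to paste into the incident issue so others can reproduce the same targeted run.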

Escalation Path

Initial owner: the maintainer assigned to the incident issue.
Escalate to the release maintainer when severity is SEV-1 or a rollback is required.
Communicate status updates at each milestone: triage complete, mitigation active, fix merged, validation complete.

Recovery Actions

Contain impact by pinning consumers to the last known-good package version.
Apply the rollback steps from deployment.md.
Validate the repaired build with targeted and full regression suites.
Publish fixed package and confirm consumer recovery.
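For the containment step, a consumer project can be pinned with `dotnet add <project> package <id> --version <version>`. The sketch below prints that command for a given project. The project path, the package id `CBDD`, and the version `1.2.3` are placeholders chosen for illustration; substitute the real package id and the last known-good version recorded in the incident issue.

```shell
#!/bin/sh
# Hypothetical helper: prints the command that pins a consumer project
# to a specific package version. It does not modify any project file.
pin_cmd() {
  project="$1"
  package="$2"
  version="$3"
  printf 'dotnet add %s package %s --version %s\n' "$project" "$package" "$version"
}

# Placeholder values, not real CBDD project paths or versions.
pin_cmd src/Consumer/Consumer.csproj CBDD 1.2.3
```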

Post-Incident Expectations

Document root cause, blast radius, and timeline.
Add regression tests to prevent recurrence.
Record follow-up actions in the issue tracker with owners and due dates.