# Plan: Peer-Confirmed Oplog Retention and Peer Sync Health

## Objective

Move from time-only oplog pruning to peer-confirmed pruning so entries are not removed until peers have confirmed them, while also adding peer lifecycle management and health visibility.

## Requested Outcomes

- Oplogs are not cleared until confirmed by peers.
- Each node tracks the latest oplog confirmation per peer.
- New peers are automatically registered when they join.
- Deprecated peers can be explicitly removed from tracking.
- Hosting health checks report peer sync status, not only store availability.

## Current Baseline (Codebase)

- Pruning is retention-time based in `SyncOrchestrator` + `IOplogStore.PruneOplogAsync(...)`.
- Push ACK is binary success/fail (`AckResponse`) and does not persist peer confirmation state.
- Peer discovery exists (`IDiscoveryService`, `UdpDiscoveryService`, `CompositeDiscoveryService`).
- Persistent remote peer config exists (`IPeerConfigurationStore`, `PeerManagementService`).
- Hosting health check only validates oplog access (`CBDDCHealthCheck` in the Hosting project).

## Design Decisions

- Track confirmation as a persisted watermark per `(peerNodeId, sourceNodeId)` using a hybrid logical clock (HLC) timestamp and entry hash.
- Use the existing vector-clock exchange and successful push results to advance confirmations (no mandatory wire-protocol break required).
- Treat tracked peers as pruning blockers until they are explicitly removed.
- Keep peer registration idempotent and safe for repeated discovery events.

## Data Model and Persistence Plan

### 1. Add peer confirmation tracking model

Create a new persisted model (example name: `PeerOplogConfirmation`) with fields:

- `PeerNodeId`
- `SourceNodeId`
- `ConfirmedWall`
- `ConfirmedLogic`
- `ConfirmedHash`
- `LastConfirmedUtc`
- `IsActive`

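A minimal C# sketch of this model follows. The wall/logic split of the HLC timestamp, the `string` node IDs, and the inline comments are assumptions to be aligned with the existing HLC and identifier types in the codebase.

```csharp
using System;

// Sketch only: persisted confirmation watermark per (peer, source) pair.
// Field types are assumptions; align them with the existing HLC and ID types.
public sealed class PeerOplogConfirmation
{
    public string PeerNodeId { get; set; } = string.Empty;    // peer that confirmed
    public string SourceNodeId { get; set; } = string.Empty;  // origin of the confirmed oplog entries
    public long ConfirmedWall { get; set; }                    // HLC wall component of the watermark
    public long ConfirmedLogic { get; set; }                   // HLC logical counter of the watermark
    public string? ConfirmedHash { get; set; }                 // hash of the last confirmed entry
    public DateTime LastConfirmedUtc { get; set; }             // when the watermark last advanced
    public bool IsActive { get; set; } = true;                 // false once the peer is deprecated
}
```
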
### 2. Add store abstraction

Add `IPeerOplogConfirmationStore` with operations:

- `EnsurePeerRegisteredAsync(peerNodeId, address, type)`
- `UpdateConfirmationAsync(peerNodeId, sourceNodeId, timestamp, hash)`
- `GetConfirmationsAsync()` and `GetConfirmationsForPeerAsync(peerNodeId)`
- `RemovePeerTrackingAsync(peerNodeId)`
- `GetActiveTrackedPeersAsync()`

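A possible shape for this contract, assuming `string` node IDs, the wall/logic timestamp split from the model above, and `CancellationToken` parameters; the actual signatures should follow the conventions of the existing store interfaces.

```csharp
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Sketch only: the watermark store consumed by the sync orchestrator and prune path.
public interface IPeerOplogConfirmationStore
{
    // Idempotent: repeated discovery events for the same peer must not create duplicates.
    Task EnsurePeerRegisteredAsync(string peerNodeId, string address, string type,
        CancellationToken ct = default);

    // Advance the watermark for (peerNodeId, sourceNodeId); implementations should
    // never move an existing watermark backwards.
    Task UpdateConfirmationAsync(string peerNodeId, string sourceNodeId,
        long confirmedWall, long confirmedLogic, string confirmedHash,
        CancellationToken ct = default);

    Task<IReadOnlyList<PeerOplogConfirmation>> GetConfirmationsAsync(
        CancellationToken ct = default);

    Task<IReadOnlyList<PeerOplogConfirmation>> GetConfirmationsForPeerAsync(
        string peerNodeId, CancellationToken ct = default);

    // Unblocks pruning for a decommissioned peer.
    Task RemovePeerTrackingAsync(string peerNodeId, CancellationToken ct = default);

    Task<IReadOnlyList<string>> GetActiveTrackedPeersAsync(CancellationToken ct = default);
}
```
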
### 3. BLite implementation

- Add entity, mapper, and indexed collection to `CBDDCDocumentDbContext`.
- Index strategy:
  - unique `(PeerNodeId, SourceNodeId)`
  - index `IsActive`
  - index `(SourceNodeId, ConfirmedWall, ConfirmedLogic)` for cutoff scans

### 4. Snapshot support

Include peer-confirmation state in snapshot export/import/merge so pruning safety state survives backup/restore.

## Sync and Pruning Behavior Plan

### 5. Auto-register peers when discovered

On each orchestrator loop, before sync attempts:

- collect the merged peer list (discovered + known peers)
- call `EnsurePeerRegisteredAsync(...)` for each peer
- skip the local node

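A sketch of this step, using a hypothetical `PeerInfo` shape for the merged peer list; the real orchestrator would use whatever peer descriptor the existing discovery and configuration services already produce.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Stand-in for the merged discovery/config peer descriptor.
public sealed record PeerInfo(string NodeId, string Address, string Type);

public static class PeerAutoRegistration
{
    // Called at the top of each orchestrator loop, before any sync attempt.
    public static async Task RegisterDiscoveredPeersAsync(
        IEnumerable<PeerInfo> discoveredPeers,
        IEnumerable<PeerInfo> knownPeers,
        string localNodeId,
        IPeerOplogConfirmationStore confirmations,
        CancellationToken ct)
    {
        // Merge discovered and statically configured peers, de-duplicated by node ID.
        var merged = discoveredPeers
            .Concat(knownPeers)
            .GroupBy(p => p.NodeId)
            .Select(g => g.First());

        foreach (var peer in merged)
        {
            if (peer.NodeId == localNodeId)
                continue; // the local node must never block its own pruning

            // Idempotent: safe to repeat on every loop iteration and discovery event.
            await confirmations.EnsurePeerRegisteredAsync(peer.NodeId, peer.Address, peer.Type, ct);
        }
    }
}
```
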
### 6. Advance confirmation watermarks

During a sync session with a peer:

- after the vector-clock exchange, advance the watermark for source nodes where the remote is already ahead of or equal to the local state
- after a successful push batch, advance the watermark to the maximum pushed timestamp/hash per source node
- persist updates atomically per peer when possible

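A sketch of the post-push update, using a hypothetical `OplogEntry` shape; the store is expected to ignore updates that would move a watermark backwards, so replays are harmless.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Stand-in for the real oplog entry type and its HLC/hash members.
public sealed record OplogEntry(string SourceNodeId, long Wall, long Logic, string Hash);

public static class ConfirmationWatermarks
{
    // Called after a peer has acknowledged a pushed batch.
    public static async Task AdvanceAfterPushAsync(
        string peerNodeId,
        IReadOnlyList<OplogEntry> acknowledgedBatch,
        IPeerOplogConfirmationStore confirmations,
        CancellationToken ct)
    {
        // The peer confirmed the whole batch, so per source node the watermark can move
        // to the newest entry pushed for that source.
        foreach (var group in acknowledgedBatch.GroupBy(e => e.SourceNodeId))
        {
            var newest = group
                .OrderByDescending(e => e.Wall)
                .ThenByDescending(e => e.Logic)
                .First();

            await confirmations.UpdateConfirmationAsync(
                peerNodeId, newest.SourceNodeId, newest.Wall, newest.Logic, newest.Hash, ct);
        }
    }
}
```
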
### 7. Gate oplog pruning by peer confirmation

Replace the retention-only prune trigger with a safe cutoff computation:

- compute the retention cutoff from `OplogRetentionHours` (existing behavior)
- compute the confirmation cutoff as the minimum confirmed point across active tracked peers
- effective cutoff = minimum(retention cutoff, confirmation cutoff)
- prune only up to the effective cutoff

If any active tracked peer has no confirmation for the relevant source nodes, do not prune those ranges.

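A simplified sketch of the cutoff calculation. It assumes `ConfirmedWall` stores UTC ticks and that a zero watermark means "never confirmed", and it computes a single global cutoff; the real implementation should apply the same rule per source node as described above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PruneCutoffCalculator
{
    // Returns the instant up to which pruning is safe, or null when pruning must be
    // skipped entirely because an active tracked peer has confirmed nothing yet.
    public static DateTime? ComputeEffectiveCutoff(
        DateTime utcNow,
        TimeSpan oplogRetention,
        IReadOnlyList<PeerOplogConfirmation> confirmations)
    {
        // Existing behavior: the retention window derived from OplogRetentionHours.
        var retentionCutoff = utcNow - oplogRetention;

        var active = confirmations.Where(c => c.IsActive).ToList();
        if (active.Count == 0)
            return retentionCutoff; // no tracked peers: fall back to time-based pruning

        // An active peer without any confirmation blocks pruning.
        if (active.Any(c => c.ConfirmedWall == 0))
            return null;

        // Confirmation cutoff: the slowest active peer bounds what may be discarded.
        var confirmationCutoff = active.Min(c => new DateTime(c.ConfirmedWall, DateTimeKind.Utc));

        // The effective cutoff is the more conservative (earlier) of the two.
        return confirmationCutoff < retentionCutoff ? confirmationCutoff : retentionCutoff;
    }
}
```
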
### 8. Deprecated peer removal path

Provide an explicit management operation to unblock pruning for decommissioned peers:

- add a method in the management service (example: `RemovePeerTrackingAsync(nodeId, removeRemoteConfig = true)`)
- remove from the confirmation tracking store
- optionally remove the static peer configuration
- document the operator workflow for node deprecation

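A sketch of this operation; the static-config removal is injected as a delegate because the exact `IPeerConfigurationStore` call is not specified in this plan.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public sealed class PeerDeprecationWorkflow
{
    private readonly IPeerOplogConfirmationStore _confirmations;
    private readonly Func<string, CancellationToken, Task> _removeStaticPeerConfig;

    public PeerDeprecationWorkflow(
        IPeerOplogConfirmationStore confirmations,
        Func<string, CancellationToken, Task> removeStaticPeerConfig)
    {
        _confirmations = confirmations;
        _removeStaticPeerConfig = removeStaticPeerConfig; // stand-in for the real config-store call
    }

    public async Task RemovePeerTrackingAsync(
        string nodeId, bool removeRemoteConfig = true, CancellationToken ct = default)
    {
        // Drop the peer from confirmation tracking so it no longer caps the prune cutoff.
        await _confirmations.RemovePeerTrackingAsync(nodeId, ct);

        // Optionally remove the static peer configuration so the node is not
        // re-registered on the next discovery loop.
        if (removeRemoteConfig)
            await _removeStaticPeerConfig(nodeId, ct);
    }
}
```
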
## Hosting Health Check Plan

### 9. Extend hosting health check payload

Update the Hosting `CBDDCHealthCheck` to include peer sync status data:

- tracked peer count
- peers with no confirmation
- max lag (ms) between local head and peer confirmation
- lagging peer list (node IDs)
- last successful confirmation update per peer

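A hypothetical record for these additional data entries; the names are placeholders and should follow the existing health-check payload conventions.

```csharp
using System;
using System.Collections.Generic;

// Sketch only: extra data the Hosting health check would attach to its result.
public sealed record PeerSyncHealthData(
    int TrackedPeerCount,
    IReadOnlyList<string> PeersWithoutConfirmation,
    double MaxConfirmationLagMs,                                   // local head vs. slowest peer
    IReadOnlyList<string> LaggingPeerNodeIds,
    IReadOnlyDictionary<string, DateTime> LastConfirmationUtcByPeer);
```
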
### 10. Health status policy

- `Healthy`: persistence OK and all active tracked peers within lag threshold
- `Degraded`: persistence OK but one or more peers lagging/unconfirmed
- `Unhealthy`: persistence unavailable, or critical lag breach (configurable)

Add configurable thresholds in hosting options/cluster options.

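A sketch of the classification rule, using the standard `HealthStatus` enum and assuming two configurable lag thresholds (degraded and critical) exposed through the hosting/cluster options.

```csharp
using System;
using Microsoft.Extensions.Diagnostics.HealthChecks;

public static class PeerSyncHealthPolicy
{
    public static HealthStatus Classify(
        bool persistenceOk,
        int peersWithoutConfirmation,
        TimeSpan maxPeerLag,
        TimeSpan degradedLagThreshold,
        TimeSpan criticalLagThreshold)
    {
        // Unhealthy: persistence unavailable, or a critical lag breach.
        if (!persistenceOk || maxPeerLag >= criticalLagThreshold)
            return HealthStatus.Unhealthy;

        // Degraded: persistence OK but at least one peer is lagging or unconfirmed.
        if (peersWithoutConfirmation > 0 || maxPeerLag >= degradedLagThreshold)
            return HealthStatus.Degraded;

        return HealthStatus.Healthy;
    }
}
```
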
## Implementation Phases

### Phase 1: Persistence and contracts

- Add model + store interface + BLite implementation + DI wiring.
- Add tests for CRUD, idempotent register, and explicit remove.

### Phase 2: Sync integration

- Register peers from discovery.
- Update confirmations from vector clock + push success.
- Add sync tests validating watermark advancement.

### Phase 3: Safe pruning

- Implement cutoff calculator service.
- Integrate with orchestrator maintenance path.
- Add two-node tests proving no prune before peer confirmation.

### Phase 4: Management and health

- Expose remove-tracking operation in peer management API.
- Extend hosting health check output and status policy.
- Add hosting health check tests for Healthy/Degraded/Unhealthy.

### Phase 5: Docs and rollout

- Update docs for peer lifecycle and pruning semantics.
- Add upgrade notes and operational runbook for peer deprecation.

## Safe Subagent Usage

- Use subagents only for isolated, low-coupling tasks with clear file ownership boundaries.
- Assign each subagent a narrow scope (one component or one test suite at a time).
- Require explicit task contracts for each subagent, including input files/components, expected output, and forbidden operations.
- Prohibit destructive repository actions by subagents (`reset --hard`, force-push, history rewrite, broad file deletion).
- Require subagents to report what changed, why, and which tests were run.
- Do not accept subagent-authored changes directly into final output without primary-agent review.

## Mandatory Verification After Subagent Work

- Enforce a verification gate after every subagent-delivered change before integration.
- Verification gate checklist:
  - independently review the produced diff
  - run targeted unit/integration tests for touched behavior
  - validate impacted acceptance criteria
  - confirm no regressions in pruning safety, peer lifecycle handling, or health check output
- Reject or rework any subagent output that fails verification.
- Only merge/integrate subagent output after verification evidence is documented.

## Test Plan (Minimum)

- Unit:
  - confirmation store upsert/get/remove behavior
  - auto-register is idempotent
  - safe cutoff computation with mixed peer states
  - removing a peer from tracking immediately changes cutoff eligibility
  - health check classification rules
- Integration (two-node focus):
  - Node B offline: Node A does not prune the range that still requires Node B's confirmation
  - Node B catches up: Node A prunes once the confirmation is recorded
  - a newly joined node is auto-registered without a manual call
  - removing a deprecated node unblocks pruning

## Risks and Mitigations

- Risk: unbounded oplog growth if a peer never confirms.
  - Mitigation: explicit removal workflow and degraded health visibility.
- Risk: confirmation drift after restore/restart.
  - Mitigation: snapshot persistence of confirmation records.
- Risk: mixed-version cluster behavior.
  - Mitigation: rely on the existing vector-clock exchange first; keep protocol additions backward compatible if introduced later.

## Acceptance Criteria

- Oplog entries are not pruned while any active tracked peer has not confirmed the required ranges.
- Newly discovered peers are automatically present in tracking without operator action.
- Operators can explicitly remove a deprecated peer from tracking, and pruning resumes accordingly.
- The hosting health endpoint exposes peer sync lag/confirmation status and returns degraded/unhealthy when appropriate.