Plan: Peer-Confirmed Oplog Retention and Peer Sync Health

Objective

Move from time-only oplog pruning to peer-confirmed pruning, so oplog entries are removed only after peers have confirmed them, and add peer lifecycle management and peer sync health visibility.

Requested Outcomes

  • Oplogs are not cleared until confirmed by peers.
  • Each node tracks latest oplog confirmation per peer.
  • New peers are automatically registered when they join.
  • Deprecated peers can be explicitly removed from tracking.
  • Hosting health checks report peer sync status, not only store availability.

Current Baseline (Codebase)

  • Pruning is retention-time based, implemented in SyncOrchestrator and IOplogStore.PruneOplogAsync(...).
  • Push ACK is binary success/fail (AckResponse) and does not persist peer confirmation state.
  • Peer discovery exists (IDiscoveryService, UdpDiscoveryService, CompositeDiscoveryService).
  • Persistent remote peer config exists (IPeerConfigurationStore, PeerManagementService).
  • Hosting health check only validates oplog access (CBDDCHealthCheck in Hosting project).

Design Decisions

  • Track confirmation as a persisted watermark per (peerNodeId, sourceNodeId) pair, expressed as an HLC timestamp plus entry hash.
  • Use existing vector-clock exchange and successful push results to advance confirmations (no mandatory wire protocol break required).
  • Treat tracked peers as pruning blockers until explicitly removed.
  • Keep peer registration idempotent and safe for repeated discovery events.

Data Model and Persistence Plan

1. Add peer confirmation tracking model

Create a new persisted model (example name: PeerOplogConfirmation) with the following fields (see the sketch after this list):

  • PeerNodeId
  • SourceNodeId
  • ConfirmedWall
  • ConfirmedLogic
  • ConfirmedHash
  • LastConfirmedUtc
  • IsActive
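
A minimal sketch of this model, assuming the HLC timestamp is persisted as separate wall/logic fields; exact types should follow the existing oplog entities:

```csharp
// Sketch only: field types and defaults are assumptions, not the final entity.
public sealed class PeerOplogConfirmation
{
    public string PeerNodeId { get; set; } = "";    // peer that confirmed the entries
    public string SourceNodeId { get; set; } = "";  // node that originated the entries
    public long ConfirmedWall { get; set; }         // HLC wall-clock component of the watermark
    public long ConfirmedLogic { get; set; }        // HLC logical counter of the watermark
    public string ConfirmedHash { get; set; } = ""; // oplog entry hash at the watermark
    public DateTime LastConfirmedUtc { get; set; }  // when the watermark last advanced
    public bool IsActive { get; set; } = true;      // false once the peer is removed from tracking
}
```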

2. Add store abstraction

Add IPeerOplogConfirmationStore with the following operations (see the sketch after this list):

  • EnsurePeerRegisteredAsync(peerNodeId, address, type)
  • UpdateConfirmationAsync(peerNodeId, sourceNodeId, timestamp, hash)
  • GetConfirmationsAsync() and GetConfirmationsForPeerAsync(peerNodeId)
  • RemovePeerTrackingAsync(peerNodeId)
  • GetActiveTrackedPeersAsync()
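
A possible shape for this abstraction; the operation names follow the list above, while parameter and return types are assumptions to reconcile with the existing store contracts:

```csharp
// Lightweight HLC pair used throughout these sketches; substitute the
// codebase's real HLC type if one already exists.
public readonly record struct HlcTimestamp(long Wall, long Logic);

public interface IPeerOplogConfirmationStore
{
    // Idempotent: safe to call on every discovery event for the same peer.
    Task EnsurePeerRegisteredAsync(string peerNodeId, string address, string type, CancellationToken ct = default);

    // Advances the (peerNodeId, sourceNodeId) watermark; should never move it backward.
    Task UpdateConfirmationAsync(string peerNodeId, string sourceNodeId, HlcTimestamp timestamp, string hash, CancellationToken ct = default);

    Task<IReadOnlyList<PeerOplogConfirmation>> GetConfirmationsAsync(CancellationToken ct = default);
    Task<IReadOnlyList<PeerOplogConfirmation>> GetConfirmationsForPeerAsync(string peerNodeId, CancellationToken ct = default);

    // Removes (or marks inactive) a peer so it no longer blocks pruning.
    Task RemovePeerTrackingAsync(string peerNodeId, CancellationToken ct = default);

    Task<IReadOnlyList<string>> GetActiveTrackedPeersAsync(CancellationToken ct = default);
}
```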

3. BLite implementation

  • Add entity, mapper, and indexed collection to CBDDCDocumentDbContext.
  • Index strategy:
      • unique index on (PeerNodeId, SourceNodeId)
      • index on IsActive
      • index on (SourceNodeId, ConfirmedWall, ConfirmedLogic) for cutoff scans

4. Snapshot support

Include peer-confirmation state in snapshot export/import/merge so pruning safety state survives backup/restore.

Sync and Pruning Behavior Plan

5. Auto-register peers when discovered

On each orchestrator loop, before any sync attempt (see the sketch after this list):

  • collect merged peer list (discovered + known peers)
  • call EnsurePeerRegisteredAsync(...) for each peer
  • skip local node
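
A sketch of that pass; the GetPeersAsync helpers and the PeerInfo shape (NodeId/Address/Type) are placeholders for whatever the discovery and configuration services actually expose:

```csharp
// Runs at the top of each orchestrator loop, before any sync attempt.
// _discoveryService, _peerConfigurationStore, _confirmationStore, and
// _localNodeId are assumed injected fields; the service calls are placeholders.
private async Task RegisterKnownPeersAsync(CancellationToken ct)
{
    var discovered = await _discoveryService.GetPeersAsync(ct);        // assumed helper
    var configured = await _peerConfigurationStore.GetPeersAsync(ct);  // assumed helper

    foreach (var peer in discovered.Concat(configured).DistinctBy(p => p.NodeId))
    {
        if (peer.NodeId == _localNodeId)
            continue; // never track the local node

        // Idempotent, so repeated discovery events for the same peer are harmless.
        await _confirmationStore.EnsurePeerRegisteredAsync(peer.NodeId, peer.Address, peer.Type, ct);
    }
}
```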

6. Advance confirmation watermarks

During a sync session with a peer (see the sketch after this list):

  • after the vector-clock exchange, advance the watermark for source nodes where the remote clock is already equal to or ahead of the local one
  • after each successful push batch, advance the watermark to the highest pushed timestamp (and its hash) per source node
  • persist updates atomically per peer when possible
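
For the push path, a minimal sketch, assuming pushed entries expose SourceNodeId, Wall, Logic, and Hash, and that peerNodeId, pushedBatch, and ct come from the surrounding sync-session scope:

```csharp
// After the peer acknowledges a push batch, advance its watermark to the
// newest entry pushed per source node.
foreach (var group in pushedBatch.GroupBy(e => e.SourceNodeId))
{
    var newest = group.OrderBy(e => e.Wall).ThenBy(e => e.Logic).Last();

    await _confirmationStore.UpdateConfirmationAsync(
        peerNodeId,
        group.Key,
        new HlcTimestamp(newest.Wall, newest.Logic),
        newest.Hash,
        ct);
}
```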

7. Gate oplog pruning by peer confirmation

Replace the retention-only prune trigger with a safe cutoff computation (see the sketch after this list):

  • compute retention cutoff from OplogRetentionHours (existing behavior)
  • compute confirmation cutoff as the minimum confirmed point across active tracked peers
  • effective cutoff = minimum(retention cutoff, confirmation cutoff)
  • prune only to effective cutoff

If any active tracked peer has no confirmation for relevant source nodes, do not prune those ranges.
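
A sketch of the cutoff calculator, reusing the PeerOplogConfirmation and HlcTimestamp shapes from the earlier sketches; a null result means the source's range must not be pruned at all:

```csharp
public static class SafeCutoffCalculator
{
    // Effective cutoff = min(retention cutoff, lowest confirmation across
    // active tracked peers). Null: an active peer has not confirmed this source.
    public static HlcTimestamp? ComputeEffectiveCutoff(
        HlcTimestamp retentionCutoff,
        string sourceNodeId,
        IReadOnlyCollection<string> activePeerIds,
        IReadOnlyList<PeerOplogConfirmation> confirmations)
    {
        HlcTimestamp? confirmationCutoff = null;

        foreach (var peerId in activePeerIds)
        {
            var row = confirmations.FirstOrDefault(c =>
                c.IsActive && c.PeerNodeId == peerId && c.SourceNodeId == sourceNodeId);

            if (row is null)
                return null; // unconfirmed peer blocks pruning for this source

            var confirmed = new HlcTimestamp(row.ConfirmedWall, row.ConfirmedLogic);
            if (confirmationCutoff is null || Compare(confirmed, confirmationCutoff.Value) < 0)
                confirmationCutoff = confirmed;
        }

        // No active tracked peers: fall back to retention-only behavior.
        if (confirmationCutoff is null)
            return retentionCutoff;

        return Compare(confirmationCutoff.Value, retentionCutoff) < 0
            ? confirmationCutoff
            : retentionCutoff;
    }

    // HLC ordering: wall clock first, logical counter as tie-breaker.
    private static int Compare(HlcTimestamp a, HlcTimestamp b) =>
        a.Wall != b.Wall ? a.Wall.CompareTo(b.Wall) : a.Logic.CompareTo(b.Logic);
}
```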

8. Deprecated peer removal path

Provide an explicit management operation to unblock pruning for decommissioned peers (see the sketch after this list):

  • add a method to the management service (example: RemovePeerTrackingAsync(nodeId, removeRemoteConfig = true))
  • remove from confirmation tracking store
  • optionally remove static peer configuration
  • document operator workflow for node deprecation
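
A possible management-service method; the RemovePeerAsync call on the configuration store is an assumed name:

```csharp
// Explicit operator action: stop tracking a decommissioned peer so it no
// longer blocks pruning, optionally dropping its static configuration too.
public async Task RemovePeerTrackingAsync(string nodeId, bool removeRemoteConfig = true, CancellationToken ct = default)
{
    await _confirmationStore.RemovePeerTrackingAsync(nodeId, ct);

    if (removeRemoteConfig)
        await _peerConfigurationStore.RemovePeerAsync(nodeId, ct); // assumed method name
}
```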

Hosting Health Check Plan

9. Extend hosting health check payload

Update the Hosting CBDDCHealthCheck to include peer sync status data (see the payload sketch after this list):

  • tracked peer count
  • peers with no confirmation
  • max lag (ms) between local head and peer confirmation
  • lagging peer list (node IDs)
  • last successful confirmation update per peer
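
A sketch of the payload, assuming the standard Microsoft.Extensions.Diagnostics.HealthChecks data dictionary; the key names and the precomputed variables (trackedPeers, unconfirmed, maxLagMs, lagging, lastConfirmedByPeer) are placeholders:

```csharp
// Extra data attached to the health check result; values are assumed to be
// computed from the confirmation store earlier in the check.
var data = new Dictionary<string, object>
{
    ["trackedPeerCount"] = trackedPeers.Count,
    ["unconfirmedPeers"] = unconfirmed,           // peers with no confirmation rows
    ["maxPeerLagMs"] = maxLagMs,                  // local head vs. slowest peer confirmation
    ["laggingPeers"] = lagging,                   // node IDs over the lag threshold
    ["lastConfirmationUtcByPeer"] = lastConfirmedByPeer,
};
```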

10. Health status policy

  • Healthy: persistence OK and all active tracked peers within lag threshold
  • Degraded: persistence OK but one or more peers lagging/unconfirmed
  • Unhealthy: persistence unavailable, or critical lag breach (configurable)

Add configurable thresholds in hosting options/cluster options; a classification sketch follows.
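
In the sketch below, DegradedLagMs and CriticalLagMs are assumed option names, and persistenceOk, maxLagMs, unconfirmed, and data come from the surrounding check:

```csharp
if (!persistenceOk)
    return HealthCheckResult.Unhealthy("oplog store unavailable", data: data);

if (maxLagMs >= options.CriticalLagMs)
    return HealthCheckResult.Unhealthy($"peer lag {maxLagMs} ms breaches critical threshold", data: data);

if (unconfirmed.Count > 0 || maxLagMs >= options.DegradedLagMs)
    return HealthCheckResult.Degraded("one or more peers lagging or unconfirmed", data: data);

return HealthCheckResult.Healthy("all active tracked peers within lag threshold", data: data);
```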

Implementation Phases

Phase 1: Persistence and contracts

  • Add model + store interface + BLite implementation + DI wiring.
  • Add tests for CRUD, idempotent register, and explicit remove.

Phase 2: Sync integration

  • Register peers from discovery.
  • Update confirmations from vector clock + push success.
  • Add sync tests validating watermark advancement.

Phase 3: Safe pruning

  • Implement cutoff calculator service.
  • Integrate with orchestrator maintenance path.
  • Add two-node tests proving no prune before peer confirmation.

Phase 4: Management and health

  • Expose remove-tracking operation in peer management API.
  • Extend hosting healthcheck output and status policy.
  • Add hosting healthcheck tests for Healthy/Degraded/Unhealthy.

Phase 5: Docs and rollout

  • Update docs for peer lifecycle and pruning semantics.
  • Add upgrade notes and operational runbook for peer deprecation.

Safe Subagent Usage

  • Use subagents only for isolated, low-coupling tasks with clear file ownership boundaries.
  • Assign each subagent a narrow scope (one component or one test suite at a time).
  • Require explicit task contracts for each subagent including input files/components, expected output, and forbidden operations.
  • Prohibit destructive repository actions by subagents (reset --hard, force-push, history rewrite, broad file deletion).
  • Require subagents to report what changed, why, and which tests were run.
  • Do not accept subagent-authored changes directly into final output without primary-agent review.

Mandatory Verification After Subagent Work

  • Enforce a verification gate after every subagent-delivered change before integration.
  • Verification gate checklist:
      • independently review the produced diff
      • run targeted unit/integration tests for touched behavior
      • validate impacted acceptance criteria
      • confirm no regressions in pruning safety, peer lifecycle handling, or healthcheck output
  • Reject or rework any subagent output that fails verification.
  • Only merge/integrate subagent output after verification evidence is documented.

Test Plan (Minimum)

  • Unit:
      • confirmation store upsert/get/remove behavior
      • auto-register is idempotent
      • safe cutoff computation with mixed peer states (see the example test after this list)
      • removing a peer from tracking immediately changes cutoff eligibility
      • healthcheck classification rules
  • Integration (two-node focus):
      • Node B offline: Node A does not prune the confirmation-required range
      • Node B catches up: Node A prunes once confirmation advances
      • new node join auto-registers without a manual call
      • deprecated node removal unblocks pruning
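
As an illustration of the mixed-peer-state cutoff test, a minimal xUnit sketch against the SafeCutoffCalculator sketched in step 7:

```csharp
[Fact]
public void Cutoff_is_blocked_while_any_active_peer_is_unconfirmed()
{
    var retention = new HlcTimestamp(Wall: 1_000, Logic: 0);
    var peers = new[] { "peer-b", "peer-c" };

    // peer-b confirmed up to wall 500; peer-c has no confirmation row at all.
    var confirmations = new List<PeerOplogConfirmation>
    {
        new() { PeerNodeId = "peer-b", SourceNodeId = "node-a",
                ConfirmedWall = 500, ConfirmedLogic = 0, IsActive = true },
    };

    var cutoff = SafeCutoffCalculator.ComputeEffectiveCutoff(
        retention, "node-a", peers, confirmations);

    Assert.Null(cutoff); // unconfirmed peer-c blocks pruning entirely
}
```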

Risks and Mitigations

  • Risk: indefinite oplog growth if a peer never confirms.
    Mitigation: explicit removal workflow and degraded health visibility.
  • Risk: confirmation drift after restore/restart.
    Mitigation: snapshot persistence of confirmation records.
  • Risk: mixed-version cluster behavior.
    Mitigation: rely on existing vector-clock exchange first; keep protocol additions backward compatible if introduced later.

Acceptance Criteria

  • Oplog entries are not pruned while any active tracked peer has not confirmed required ranges.
  • Newly discovered peers are automatically present in tracking without operator action.
  • Operators can explicitly remove a deprecated peer from tracking and pruning resumes accordingly.
  • Hosting health endpoint exposes peer sync lag/confirmation status and returns degraded/unhealthy when appropriate.