Feature: Peer-to-Peer Gossip Sync
Purpose and Business Outcome
Propagate updates across mesh nodes without a central coordinator so local-first applications remain resilient to intermittent connectivity.
Scope and Non-Goals
Scope:
- Peer discovery and sync orchestration.
- Push/pull propagation of oplog changes.
Non-goals:
- Strong global consistency guarantees.
- Public internet exposure without additional network controls.
User and System Workflows
- Node starts and discovers peers.
- Node exchanges sync metadata with connected peers.
- Missing operations are requested and applied.
- Mesh converges over repeated gossip rounds.
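The workflow above can be sketched as a single push/pull exchange between two nodes. This is a minimal in-memory illustration, not the actual implementation: the `Node` class, the `(origin, sequence)` oplog key, and `gossip_round` are hypothetical names, and the sketch assumes each origin's operations arrive without gaps.

```python
from dataclasses import dataclass, field


# Hypothetical in-memory model of a mesh node; all names are illustrative.
@dataclass
class Node:
    node_id: str
    # Maps (origin node id, sequence number) -> operation payload.
    oplog: dict = field(default_factory=dict)

    def clock(self) -> dict:
        """Summarize the oplog as {origin: highest sequence number seen}."""
        summary = {}
        for origin, seq in self.oplog:
            summary[origin] = max(summary.get(origin, 0), seq)
        return summary

    def write(self, payload: str) -> None:
        """Record a local operation under this node's next sequence number."""
        seq = self.clock().get(self.node_id, 0) + 1
        self.oplog[(self.node_id, seq)] = payload

    def ops_missing_from(self, remote_clock: dict) -> dict:
        """Return the operations that the remote clock shows as not yet seen."""
        return {
            (origin, seq): payload
            for (origin, seq), payload in self.oplog.items()
            if seq > remote_clock.get(origin, 0)
        }


def gossip_round(a: Node, b: Node) -> None:
    """One push/pull exchange: each side receives what its clock lacks."""
    to_b = a.ops_missing_from(b.clock())
    to_a = b.ops_missing_from(a.clock())
    b.oplog.update(to_b)
    a.oplog.update(to_a)


a, b = Node("a"), Node("b")
a.write("x")
a.write("y")
b.write("z")
gossip_round(a, b)
```

After the exchange both oplogs hold the same three operations. In a larger mesh, repeating such exchanges across peer pairs is what drives convergence.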
Interfaces, APIs, and Events Involved
- Sync orchestrator scheduling
- TCP peer sync channels
- Vector clock exchange and reconciliation
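To illustrate vector clock reconciliation, the sketch below classifies two clocks and merges them by taking the per-node maximum, which is the standard vector clock merge rule. The function names and the dictionary representation are assumptions for illustration; the feature's actual wire format is not described here.

```python
def compare_clocks(local: dict, remote: dict) -> str:
    """Classify local relative to remote: 'equal', 'behind', 'ahead', or 'concurrent'."""
    nodes = set(local) | set(remote)
    local_behind = any(local.get(n, 0) < remote.get(n, 0) for n in nodes)
    remote_behind = any(remote.get(n, 0) < local.get(n, 0) for n in nodes)
    if local_behind and remote_behind:
        return "concurrent"  # each side has operations the other lacks
    if local_behind:
        return "behind"      # local should pull from remote
    if remote_behind:
        return "ahead"       # local should push to remote
    return "equal"


def merge_clocks(local: dict, remote: dict) -> dict:
    """Merge two vector clocks by keeping the highest counter per node."""
    return {
        n: max(local.get(n, 0), remote.get(n, 0))
        for n in set(local) | set(remote)
    }
```

A result of "concurrent" means both a push and a pull are needed before the two peers agree.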
Permissions and Data Handling
- Peers that present a valid authentication token can exchange replicated collection data.
- Cluster membership should be restricted to trusted nodes.
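As one possible way to enforce these two rules together, the sketch below admits a peer only if it is on a trusted list and presents the expected token. The `admit_peer` function, the allow-list, and the single shared token are assumptions for illustration. `hmac.compare_digest` is the Python standard library's constant-time comparison, used here so the check does not leak token contents through timing.

```python
import hmac


def token_matches(presented: str, expected: str) -> bool:
    """Compare tokens in constant time to avoid timing side channels."""
    return hmac.compare_digest(presented.encode("utf-8"), expected.encode("utf-8"))


def admit_peer(peer_id: str, presented_token: str,
               expected_token: str, trusted_peers: set) -> bool:
    """Hypothetical admission check: trusted membership AND a matching token."""
    return peer_id in trusted_peers and token_matches(presented_token, expected_token)
```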
Dependencies and Failure Modes
Dependencies:
- Network reachability
- Shared authentication material
- Healthy persistence layer
Failure modes:
- Peer isolation due to network outage
- Token mismatch blocking synchronization
- Sustained lag under high write pressure
Monitoring, Alerts, and Troubleshooting Pointers
- Monitor laggingPeers, maxLagMs, and active peer counts.
- Follow Runbook playbooks for lagging or disconnected peers.
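The sketch below shows one way such metrics could be derived from per-peer lag readings and turned into an alert condition. How the system actually computes laggingPeers and maxLagMs is not specified here, so the input shape, the `activePeers` key, and the 5000 ms threshold are all assumptions.

```python
def summarize_lag(peer_lag_ms: dict, lag_threshold_ms: int = 5000) -> dict:
    """Summarize per-peer lag (milliseconds) into counters suitable for alerting.

    peer_lag_ms maps peer id -> current lag in ms (hypothetical input shape).
    """
    lagging = sorted(p for p, lag in peer_lag_ms.items() if lag > lag_threshold_ms)
    return {
        "activePeers": len(peer_lag_ms),
        "laggingPeers": len(lagging),
        "maxLagMs": max(peer_lag_ms.values(), default=0),
        "laggingPeerIds": lagging,
    }


summary = summarize_lag({"n1": 120, "n2": 9000, "n3": 40})
should_alert = summary["laggingPeers"] > 0
```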
Rollout and Change Considerations
- Roll out protocol-affecting changes in a controlled window.
- Confirm backward and forward compatibility in a staging mesh before production rollout.
Validation and Testability Guidance
- Run multi-node integration tests with controlled partitions.
- Validate eventual convergence after reconnect.
- Verify that no data is lost under repeated reconnect scenarios.
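A partition test of this kind can be prototyped as a small simulation before it runs against real nodes. The sketch below is only a model: each node's state is a set of operation ids, `sync` stands in for one push/pull exchange, and the node names and link sets are hypothetical.

```python
def sync(a: set, b: set) -> None:
    """Stand-in for one push/pull exchange: both sides end with the union."""
    merged = a | b
    a |= merged  # in-place update so the caller's sets change
    b |= merged


def run_rounds(nodes: dict, links: set, rounds: int) -> None:
    """Run gossip rounds over the currently reachable links only."""
    for _ in range(rounds):
        for x, y in sorted(links):
            sync(nodes[x], nodes[y])


def converged(nodes: dict) -> bool:
    """True when every node holds the same set of operations."""
    states = list(nodes.values())
    return all(state == states[0] for state in states)


def partition_scenario():
    nodes = {"n1": set(), "n2": set(), "n3": set()}
    full_mesh = {("n1", "n2"), ("n1", "n3"), ("n2", "n3")}
    partitioned = {("n1", "n2")}  # n3 is isolated

    # Writes happen on both sides of the partition.
    nodes["n1"].add(("n1", 1))
    nodes["n3"].add(("n3", 1))

    run_rounds(nodes, partitioned, rounds=2)
    converged_during_partition = converged(nodes)

    # Heal the partition and let gossip catch up.
    run_rounds(nodes, full_mesh, rounds=2)
    return converged_during_partition, converged(nodes), nodes


during, after, final_nodes = partition_scenario()
```

The assertions worth carrying over to a real integration test are that the mesh does not converge while partitioned, that it converges after reconnect, and that every node ends up with the writes from both sides.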