# Multi-Dataset Sync

## Summary

CBDDC can run multiple sync pipelines inside one process by assigning each pipeline a `datasetId` (for example `primary`, `logs`, `timeseries`).
Each dataset pipeline has independent oplog state, vector-clock reads, peer confirmation watermarks, and maintenance scheduling.
## Why Use It

- Keep primary business data sync latency stable during high telemetry volume.
- Isolate append-only streams (`logs`, `timeseries`) from CRUD-heavy collections.
- Roll out incrementally using runtime flags and per-dataset enablement.
## Configuration

Register dataset options and enable the runtime coordinator:

```csharp
services.AddCBDDCSurrealEmbedded<SampleDocumentStore>(sp => options)
    .AddCBDDCSurrealEmbeddedDataset("primary", o =>
    {
        o.InterestingCollections = ["Users", "TodoLists"];
    })
    .AddCBDDCSurrealEmbeddedDataset("logs", o =>
    {
        o.InterestingCollections = ["Logs"];
        o.SyncLoopDelay = TimeSpan.FromMilliseconds(500);
    })
    .AddCBDDCSurrealEmbeddedDataset("timeseries", o =>
    {
        o.InterestingCollections = ["Timeseries"];
        o.SyncLoopDelay = TimeSpan.FromMilliseconds(500);
    })
    .AddCBDDCNetwork<StaticPeerNodeConfigurationProvider>();

services.AddCBDDCMultiDataset(options =>
{
    options.EnableMultiDatasetSync = true;
    options.EnableDatasetPrimary = true;
    options.EnableDatasetLogs = true;
    options.EnableDatasetTimeseries = true;
});
```
## Wire and Storage Compatibility

- Protocol messages include optional `dataset_id` fields.
- Missing `dataset_id` is treated as `primary`.
- Surreal persistence records include `datasetId`; legacy rows without `datasetId` are read as `primary`.
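
The defaulting rule above can be sketched as a single normalization helper applied wherever a dataset id is read from the wire or from storage. `DatasetIds` and `Normalize` are hypothetical names used only for illustration, not CBDDC APIs, and treating empty or whitespace-only values as missing is an assumption beyond what this document states.

```csharp
#nullable enable
using System;

// Hypothetical helper, not part of the CBDDC API: illustrates the
// "missing dataset id means primary" compatibility rule.
public static class DatasetIds
{
    public const string Primary = "primary";

    // Maps a missing id (null, empty, or whitespace; the latter two are an
    // assumption) to "primary". Any other value is returned unchanged.
    public static string Normalize(string? datasetId) =>
        string.IsNullOrWhiteSpace(datasetId) ? Primary : datasetId;
}
```

Routing every read through one helper like this would keep legacy peers and legacy rows on the `primary` pipeline without a data migration.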
## Operational Notes

- Each dataset runs its own `SyncOrchestrator` instance.
- Maintenance pruning is dataset-scoped (`datasetId` + cutoff).
- Snapshot APIs support dataset-scoped operations (`CreateSnapshotAsync(stream, datasetId)`).
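
To illustrate what "dataset-scoped (`datasetId` + cutoff)" pruning means, here is a minimal in-memory sketch. `OplogEntry` and `DatasetPruning` are hypothetical types invented for this example; the real pruning runs against Surreal persistence and may also consult peer confirmation watermarks, which this sketch ignores.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical oplog entry shape, for illustration only.
public sealed record OplogEntry(string DatasetId, DateTimeOffset Timestamp, string Key);

public static class DatasetPruning
{
    // Returns the entries that survive pruning. Entries from other datasets
    // are always kept; entries in the target dataset are kept only when
    // their timestamp is at or after the cutoff.
    public static List<OplogEntry> Prune(
        IEnumerable<OplogEntry> entries, string datasetId, DateTimeOffset cutoff) =>
        entries
            .Where(e => e.DatasetId != datasetId || e.Timestamp >= cutoff)
            .ToList();
}
```

The point of the filter is that an aggressive cutoff for `logs` never removes `primary` entries, because the dataset id is part of the pruning predicate.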
## Migration

1. Deploy with `EnableMultiDatasetSync = false`.
2. Enable multi-dataset mode with only `primary` enabled.
3. Enable `logs`, then verify the primary sync SLO.
4. Enable `timeseries`, then verify the primary sync SLO again.
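
The four steps above can be expressed as a table of flag values, which makes each rollout stage easy to review. This is a sketch under stated assumptions: `RolloutFlags` and `MigrationSteps` are hypothetical types that mirror the four flags shown in the Configuration section, and the per-dataset values for step 1 are assumed, since that step only specifies `EnableMultiDatasetSync = false`.

```csharp
using System;

// Hypothetical mirror of the four flags used in this document; the real
// options type passed to AddCBDDCMultiDataset may have other members.
public sealed record RolloutFlags(
    bool EnableMultiDatasetSync,
    bool EnableDatasetPrimary,
    bool EnableDatasetLogs,
    bool EnableDatasetTimeseries);

public static class MigrationSteps
{
    // Maps migration step 1-4 to the flag set that step calls for.
    public static RolloutFlags ForStep(int step) => step switch
    {
        // Step 1: multi-dataset mode off (per-dataset values are assumed).
        1 => new RolloutFlags(false, true, false, false),
        // Step 2: multi-dataset mode on, only primary enabled.
        2 => new RolloutFlags(true, true, false, false),
        // Step 3: add logs.
        3 => new RolloutFlags(true, true, true, false),
        // Step 4: add timeseries.
        4 => new RolloutFlags(true, true, true, true),
        _ => throw new ArgumentOutOfRangeException(
            nameof(step), step, "Expected a step from 1 to 4."),
    };
}
```

The values returned for a step could then be copied onto the options object inside the `AddCBDDCMultiDataset` callback, so that advancing a stage changes one number rather than four flags.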
## Rollback

- Set `EnableDatasetLogs = false` and `EnableDatasetTimeseries = false` first.
- If needed, set `EnableMultiDatasetSync = false` to return to the single `primary` sync path.