# Vector Search
## Purpose And Business Outcome
Enable similarity search for embedding-driven workloads directly in embedded storage.
## Scope And Non-Goals
Scope:
- Vector index configuration
- Approximate nearest-neighbor query execution
Non-goals:
- External model training
- Cross-database vector federation
## User And System Workflows
1. The consumer registers a vector index for the embedding field.
2. Documents persist their embeddings in collection payloads.
3. A query issues a vector search request for the `k` nearest neighbors.
4. The engine returns the matches ranked by similarity.
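
The workflow above can be illustrated with a minimal, language-neutral sketch in Python. It is not the CBDD API: `ToyVectorIndex`, `add`, and `search` are hypothetical names, cosine similarity is an assumed metric, and the search is exact (brute force) rather than approximate. It assumes all vectors share the same dimensionality.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


class ToyVectorIndex:
    """Hypothetical in-memory stand-in for a vector index (not the CBDD API)."""

    def __init__(self):
        self._items = {}  # document id -> embedding

    def add(self, doc_id, embedding):
        # Step 2: a document persists its embedding.
        self._items[doc_id] = list(embedding)

    def search(self, query, k):
        # Steps 3 and 4: score every stored vector, return the top k, best first.
        scored = [
            (cosine_similarity(query, vector), doc_id)
            for doc_id, vector in self._items.items()
        ]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [(doc_id, score) for score, doc_id in scored[:k]]


index = ToyVectorIndex()               # Step 1: register the index
index.add("a", [1.0, 0.0, 0.0])
index.add("b", [0.0, 1.0, 0.0])
index.add("c", [0.9, 0.1, 0.0])
results = index.search([1.0, 0.0, 0.0], k=2)
print([doc_id for doc_id, _ in results])
```

An approximate index trades some of this exactness for speed, which is why the ranking it returns can differ from the brute-force ranking.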
## Interfaces And APIs
- Vector index configuration via model builder
- Query extensions under `VectorSearchExtensions`
- Index implementation in `VectorSearchIndex`
## Permissions And Data Handling
- Embeddings may contain sensitive semantic information.
- Apply host-level access restrictions and retention controls.
## Dependencies And Failure Modes
Dependencies:
- Correct embedding dimensionality
- Index parameter tuning for workload
Failure modes:
- Dimension mismatch between data and query vectors
- Poor recall due to incorrect index configuration
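
A dimension mismatch is easiest to diagnose when it is rejected early, both when a document is written and when a query is issued. The sketch below shows one way such a guard might look; `DimensionMismatchError`, `check_dimensions`, and `INDEX_DIMENSIONS` are hypothetical names for illustration and do not describe how `VectorSearchIndex` actually behaves.

```python
class DimensionMismatchError(ValueError):
    """Hypothetical error type used only in this sketch."""


def check_dimensions(expected, vector, role):
    """Return the vector unchanged, or raise if its length differs from the index."""
    actual = len(vector)
    if actual != expected:
        raise DimensionMismatchError(
            f"{role} vector has {actual} dimensions; index expects {expected}"
        )
    return vector


INDEX_DIMENSIONS = 4  # assumed dimensionality of the configured index

# A document embedding with the right length passes through.
check_dimensions(INDEX_DIMENSIONS, [0.1, 0.2, 0.3, 0.4], "document")

# A query produced by a different embedding model is rejected with a clear message.
try:
    check_dimensions(INDEX_DIMENSIONS, [0.1, 0.2, 0.3], "query")
except DimensionMismatchError as error:
    message = str(error)
    print(message)
```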
## Monitoring, Alerts, And Troubleshooting
- Validate vector query quality during release smoke checks.
- Use [`../runbook.md`](../runbook.md) for incident handling.
- Follow [`../security.md`](../security.md) for embedding-data handling controls.
- Use [`../troubleshooting.md`](../troubleshooting.md#query-and-index-issues) for vector query remediation.
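
One common way to quantify vector query quality in a smoke check is recall@k: the fraction of the exact top-`k` neighbors that the approximate index also returns. The following is a minimal sketch under that assumption; `recall_at_k` and the `MIN_RECALL` threshold are hypothetical and should be replaced by whatever the release process actually defines.

```python
def recall_at_k(approximate_ids, exact_ids, k):
    """Fraction of the exact top-k ids that also appear in the approximate top-k."""
    if k <= 0:
        raise ValueError("k must be positive")
    expected = set(exact_ids[:k])
    if not expected:
        return 1.0  # nothing to find, so nothing was missed
    found = expected.intersection(approximate_ids[:k])
    return len(found) / len(expected)


# Ids from an exact (brute-force) search and from the approximate index, best first.
exact = ["d1", "d2", "d3", "d4"]
approximate = ["d1", "d3", "d9", "d2"]

MIN_RECALL = 0.7  # hypothetical smoke-check threshold
score = recall_at_k(approximate, exact, k=4)
print(f"recall@4 = {score:.2f}")
assert score >= MIN_RECALL, "vector query quality regressed"
```

Here three of the four exact neighbors are found, so the check passes at 0.75.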
## Rollout And Change Considerations
- Treat vector index parameter changes as performance-sensitive releases.
- Document compatibility impact for existing persisted indexes.
## Validation Guidance
- Run vector search tests in `tests/CBDD.Tests/VectorSearchTests.cs`.
- Add benchmark runs for large-vector workloads before release.
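
A benchmark run can be as simple as timing a fixed set of queries against a fixed set of vectors. The sketch below is a language-neutral illustration in Python, not the project's benchmark harness: the helper names, vector counts, and dimensionality are assumptions, and it times an exact brute-force search, which is useful mainly as a baseline to compare an approximate index against.

```python
import random
import time


def make_vectors(count, dimensions, seed=0):
    """Generate reproducible pseudo-random vectors for the benchmark."""
    rng = random.Random(seed)
    return [[rng.random() for _ in range(dimensions)] for _ in range(count)]


def brute_force_top_k(vectors, query, k):
    """Return positions of the k closest vectors by squared Euclidean distance."""
    def distance(i):
        return sum((x - y) ** 2 for x, y in zip(vectors[i], query))
    return sorted(range(len(vectors)), key=distance)[:k]


def mean_query_seconds(vectors, queries, k):
    """Average wall-clock time per query over the whole query set."""
    start = time.perf_counter()
    for query in queries:
        brute_force_top_k(vectors, query, k)
    return (time.perf_counter() - start) / len(queries)


vectors = make_vectors(500, 64)
queries = make_vectors(5, 64, seed=1)
mean_seconds = mean_query_seconds(vectors, queries, k=10)
print(f"mean latency per query: {mean_seconds * 1000:.2f} ms")
```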