feat(sessions): add bounded replay ring buffer to SessionEventDistributor

Joseph Doherty
2026-06-15 12:42:15 -04:00
parent 7773bdebbd
commit e962737d2c
6 changed files with 375 additions and 3 deletions
+10 -1
@@ -41,7 +41,9 @@ paths, timeouts, queue sizes, enum values, or protocol values are invalid.
},
"Events": {
"QueueCapacity": 10000,
"BackpressurePolicy": "FailFast"
"BackpressurePolicy": "FailFast",
"ReplayBufferCapacity": 1024,
"ReplayRetentionSeconds": 300
},
"Dashboard": {
"Enabled": true,
@@ -135,12 +137,19 @@ ordering and avoids competing consumers.
|--------|---------|-------------|
| `MxGateway:Events:QueueCapacity` | `10000` | Capacity for bounded per-session event queues used by the gateway worker event channel and the public gRPC event stream queue. |
| `MxGateway:Events:BackpressurePolicy` | `FailFast` | Event backpressure behavior. `FailFast` faults the session on public stream queue overflow. `DisconnectSubscriber` disconnects only the slow stream. |
| `MxGateway:Events:ReplayBufferCapacity` | `1024` | Maximum number of events retained per session in the replay ring buffer, used to re-deliver events a returning subscriber missed (reconnect/reattach). The oldest retained event is evicted once this count is exceeded. `0` disables replay retention. |
| `MxGateway:Events:ReplayRetentionSeconds` | `300` | Maximum age, in seconds, of an event retained in the replay ring buffer. Entries older than this are evicted regardless of capacity. `0` disables age-based eviction. |
`QueueCapacity` must be greater than zero. With `FailFast`, queue overflow
faults the affected worker or session instead of silently dropping MXAccess
events. With `DisconnectSubscriber`, public gRPC stream overflow terminates only
the affected stream while the MXAccess session remains active.
`ReplayBufferCapacity` and `ReplayRetentionSeconds` must each be greater than or
equal to zero (either dimension can be disabled with `0`). A returning subscriber
that asks for events older than the oldest still-retained event is told it missed
events (a "gap") and must re-snapshot; whatever is still retained is replayed.
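
The retention and gap rules described above can be illustrated with a minimal sketch. It is not the gateway's implementation: the `ReplayBuffer` and `RetainedEvent` names, the per-session sequence numbers, and the caller-supplied `now` timestamps are all assumptions made for illustration. It only models count-based eviction, age-based eviction, the `0` = disabled convention, and gap detection for a returning subscriber.

```python
from collections import deque
from dataclasses import dataclass
from typing import Any, Deque, List, Tuple


@dataclass(frozen=True)
class RetainedEvent:
    sequence: int     # hypothetical per-session, monotonically increasing number
    timestamp: float  # seconds, supplied by the caller
    payload: Any


class ReplayBuffer:
    """Illustrative bounded replay buffer (hypothetical, not the real class)."""

    def __init__(self, capacity: int = 1024, retention_seconds: float = 300):
        # Mirrors the documented rule: both values must be >= 0.
        if capacity < 0 or retention_seconds < 0:
            raise ValueError("capacity and retention_seconds must be >= 0")
        self.capacity = capacity
        self.retention_seconds = retention_seconds
        self._events: Deque[RetainedEvent] = deque()
        self._next_sequence = 1

    def append(self, payload: Any, now: float) -> int:
        sequence = self._next_sequence
        self._next_sequence += 1
        if self.capacity == 0:
            return sequence  # capacity 0: replay retention is disabled
        self._events.append(RetainedEvent(sequence, now, payload))
        # Count-based eviction: drop the oldest once capacity is exceeded.
        while len(self._events) > self.capacity:
            self._events.popleft()
        self._evict_expired(now)
        return sequence

    def _evict_expired(self, now: float) -> None:
        if self.retention_seconds == 0:
            return  # retention 0: age-based eviction is disabled
        cutoff = now - self.retention_seconds
        while self._events and self._events[0].timestamp < cutoff:
            self._events.popleft()

    def replay_after(self, last_seen: int, now: float) -> Tuple[bool, List[RetainedEvent]]:
        """Return (gap, events) for a subscriber that last saw `last_seen`."""
        self._evict_expired(now)
        oldest = self._events[0].sequence if self._events else self._next_sequence
        # A gap exists when the first event the subscriber needs was evicted.
        gap = last_seen + 1 < oldest
        events = [e for e in self._events if e.sequence > last_seen]
        return gap, events
```

In this sketch, a subscriber that receives `gap == True` would re-snapshot and then apply the returned events, which are whatever is still retained after its last seen sequence number.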
## Dashboard Options
| Option | Default | Description |