301 lines
17 KiB
Markdown
301 lines
17 KiB
Markdown
# SubList Allocation Reduction Implementation Plan
|
|
|
|
> **For Codex:** REQUIRED SUB-SKILLS: Use `using-git-worktrees` to create an isolated worktree before making changes, `executeplan` to implement this plan task-by-task, and `finishing-a-development-branch` to merge the verified work back to `main` when implementation is complete.
|
|
|
|
**Goal:** Reduce publish-path allocation and lookup overhead in `SubList` by removing composite string keys, minimizing `token.ToString()` churn, and tightening `Match()` result building without changing subscription semantics.
|
|
|
|
**Architecture:** First lock the current trie, cache, and remote-interest behavior with targeted tests. Then replace routed-sub bookkeeping with a dedicated value key, remove string split/rebuild work from remote cleanup, and finally optimize trie traversal and match result construction with span-friendly helpers and pooled builders.
|
|
|
|
**Tech Stack:** .NET 10, C#, `ReaderWriterLockSlim`, span-based token parsing, xUnit, existing clustering/gateway parity suites, benchmark test harness.
|
|
|
|
---
|
|
|
|
## Scope Anchors
|
|
- Primary source: `src/NATS.Server/Subscriptions/SubList.cs`
|
|
- Supporting source: `src/NATS.Server/Subscriptions/SubjectMatch.cs`
|
|
- Existing core tests: `tests/NATS.Server.Core.Tests/Subscriptions/SubListGoParityTests.cs`
|
|
- Existing ctor/notification tests: `tests/NATS.Server.Core.Tests/Subscriptions/SubListCtorAndNotificationParityTests.cs`
|
|
- Route cleanup tests: `tests/NATS.Server.Clustering.Tests/Routes/RouteRemoteSubCleanupParityBatch2Tests.cs`
|
|
- Gateway/route interest tests:
|
|
- `tests/NATS.Server.Gateways.Tests/Gateways/GatewayInterestModeTests.cs`
|
|
- `tests/NATS.Server.Clustering.Tests/Routes/RouteInterestIdempotencyTests.cs`
|
|
- `tests/NATS.Server.Clustering.Tests/Routes/RouteSubscriptionTests.cs`
|
|
- Documentation to update: `Documentation/Subscriptions/SubList.md`
|
|
- Benchmark workflow reference: `tests/NATS.Server.Benchmark.Tests/README.md`
|
|
- Benchmark comparison document: `benchmarks_comparison.md`
|
|
|
|
## Task 0: Create an isolated git worktree for the SubList optimization work
|
|
|
|
**Files:**
|
|
- Verify: `.worktrees/`
|
|
- Modify if needed: `.gitignore`
|
|
|
|
**Step 1: Verify the preferred worktree directory is available and ignored**
|
|
- Check that `.worktrees/` exists and is ignored by git before creating a project-local worktree.
|
|
- If `.worktrees/` is not ignored, add it to `.gitignore`, then commit that repository hygiene fix before continuing.
|
|
|
|
**Step 2: Create the feature worktree on a `codex/` branch**
|
|
- Run:
|
|
```bash
|
|
git worktree add .worktrees/codex-sublist-allocation-reduction -b codex/sublist-allocation-reduction
|
|
cd .worktrees/codex-sublist-allocation-reduction
|
|
```
|
|
- Expected: a new isolated worktree exists at `.worktrees/codex-sublist-allocation-reduction` on branch `codex/sublist-allocation-reduction`.
|
|
|
|
**Step 3: Verify the worktree starts from a clean, passing baseline**
|
|
- Run: `dotnet test tests/NATS.Server.Core.Tests/NATS.Server.Core.Tests.csproj --filter "FullyQualifiedName~SubList" -c Release`
|
|
- Run: `dotnet test tests/NATS.Server.Clustering.Tests/NATS.Server.Clustering.Tests.csproj --filter "FullyQualifiedName~RouteRemoteSubCleanupParityBatch2Tests|FullyQualifiedName~RouteInterestIdempotencyTests|FullyQualifiedName~RouteSubscriptionTests" -c Release`
|
|
- Run: `dotnet test tests/NATS.Server.Gateways.Tests/NATS.Server.Gateways.Tests.csproj --filter "FullyQualifiedName~GatewayInterestModeTests|FullyQualifiedName~GatewayInterestIdempotencyTests|FullyQualifiedName~GatewayForwardingTests" -c Release`
|
|
- Expected: PASS. If this fails, stop and resolve or explicitly confirm whether to proceed from a failing baseline.
|
|
|
|
**Step 4: Commit any required worktree setup fix**
|
|
- Only if `.gitignore` changed, run:
|
|
```bash
|
|
git add .gitignore
|
|
git commit -m "chore: ignore local worktree directories"
|
|
```
|
|
|
|
## Task 1: Lock behavior around remote interest, cleanup, and matching
|
|
|
|
**Files:**
|
|
- Modify: `tests/NATS.Server.Core.Tests/Subscriptions/SubListGoParityTests.cs`
|
|
- Modify: `tests/NATS.Server.Clustering.Tests/Routes/RouteRemoteSubCleanupParityBatch2Tests.cs`
|
|
- Create: `tests/NATS.Server.Core.Tests/Subscriptions/SubListAllocationGuardTests.cs`
|
|
|
|
**Step 1: Add failing tests for routed-sub bookkeeping changes**
|
|
- Cover:
|
|
- applying the same remote subscription twice
|
|
- removing remote subscriptions by route and by route/account
|
|
- queue-weight updates
|
|
- exact and wildcard remote-interest queries
|
|
|
|
**Step 2: Add tests for match result stability**
|
|
- Ensure `Match()` still returns correct plain and queue subscription sets for exact, `*`, and `>` subjects.
|
|
- Add a test that specifically locks cache behavior across generation bumps.
|
|
|
|
**Step 3: Run the focused tests to prove the new coverage fails first**
|
|
- Run: `dotnet test tests/NATS.Server.Core.Tests/NATS.Server.Core.Tests.csproj --filter "FullyQualifiedName~SubList" -c Release`
|
|
- Run: `dotnet test tests/NATS.Server.Clustering.Tests/NATS.Server.Clustering.Tests.csproj --filter "FullyQualifiedName~RouteRemoteSubCleanupParityBatch2Tests|FullyQualifiedName~RouteInterestIdempotencyTests" -c Release`
|
|
- Expected: FAIL only in the newly added allocation-guard or key-behavior tests.
|
|
|
|
**Step 4: Commit the failing-test baseline**
|
|
- Run:
|
|
```bash
|
|
git add tests/NATS.Server.Core.Tests/Subscriptions/SubListGoParityTests.cs tests/NATS.Server.Core.Tests/Subscriptions/SubListAllocationGuardTests.cs tests/NATS.Server.Clustering.Tests/Routes/RouteRemoteSubCleanupParityBatch2Tests.cs
|
|
git commit -m "test: lock SubList remote-key and match behavior"
|
|
```
|
|
|
|
## Task 2: Replace composite routed-sub strings with a dedicated value key
|
|
|
|
**Files:**
|
|
- Create: `src/NATS.Server/Subscriptions/RoutedSubKey.cs`
|
|
- Modify: `src/NATS.Server/Subscriptions/SubList.cs`
|
|
- Modify: `tests/NATS.Server.Clustering.Tests/Routes/RouteRemoteSubCleanupParityBatch2Tests.cs`
|
|
|
|
**Step 1: Introduce a strongly typed routed-sub key**
|
|
- Add a small immutable value type for `(RouteId, Account, Subject, Queue)`.
|
|
- Use it as the dictionary key for `_remoteSubs` instead of the `"route|account|subject|queue"` composite string.
|
|
|
|
**Step 2: Remove string split/rebuild helpers from hot paths**
|
|
- Replace `BuildRoutedSubKey(...)`, `GetAccNameFromRoutedSubKey(...)`, and `GetRoutedSubKeyInfo(...)` usage in runtime paths with the new value key.
|
|
- Keep compatibility helper coverage only if other call sites still require string-facing helpers temporarily.
|
|
|
|
**Step 3: Run remote-interest tests**
|
|
- Run: `dotnet test tests/NATS.Server.Clustering.Tests/NATS.Server.Clustering.Tests.csproj --filter "FullyQualifiedName~RouteRemoteSubCleanupParityBatch2Tests|FullyQualifiedName~RouteSubscriptionTests" -c Release`
|
|
- Run: `dotnet test tests/NATS.Server.Gateways.Tests/NATS.Server.Gateways.Tests.csproj --filter "FullyQualifiedName~GatewayInterestModeTests|FullyQualifiedName~GatewayInterestIdempotencyTests" -c Release`
|
|
- Expected: PASS.
|
|
|
|
**Step 4: Commit the key-model refactor**
|
|
- Run:
|
|
```bash
|
|
git add src/NATS.Server/Subscriptions/RoutedSubKey.cs src/NATS.Server/Subscriptions/SubList.cs tests/NATS.Server.Clustering.Tests/Routes/RouteRemoteSubCleanupParityBatch2Tests.cs
|
|
git commit -m "perf: replace SubList routed-sub string keys"
|
|
```
|
|
|
|
## Task 3: Remove avoidable string churn from trie traversal
|
|
|
|
**Files:**
|
|
- Modify: `src/NATS.Server/Subscriptions/SubList.cs`
|
|
- Modify: `src/NATS.Server/Subscriptions/SubjectMatch.cs`
|
|
- Modify: `tests/NATS.Server.Core.Tests/Subscriptions/SubListGoParityTests.cs`
|
|
|
|
**Step 1: Rework token traversal helpers**
|
|
- Add a shared subject token walker that can expose tokens as spans and only allocate when a trie node insertion truly needs a durable string key.
|
|
- Remove repeated `token.ToString()` in traversal paths where lookups can operate on a transient token view first.
|
|
|
|
**Step 2: Keep exact-subject match paths allocation-lean**
|
|
- Prefer span/token comparison helpers for exact-match and wildcard traversal logic.
|
|
- Leave wildcard semantics unchanged.
|
|
|
|
**Step 3: Run core subscription tests**
|
|
- Run: `dotnet test tests/NATS.Server.Core.Tests/NATS.Server.Core.Tests.csproj --filter "FullyQualifiedName~SubListGoParityTests|FullyQualifiedName~SubListCtorAndNotificationParityTests|FullyQualifiedName~SubListParityBatch2Tests" -c Release`
|
|
- Expected: PASS.
|
|
|
|
**Step 4: Commit trie traversal cleanup**
|
|
- Run:
|
|
```bash
|
|
git add src/NATS.Server/Subscriptions/SubList.cs src/NATS.Server/Subscriptions/SubjectMatch.cs tests/NATS.Server.Core.Tests/Subscriptions/SubListGoParityTests.cs
|
|
git commit -m "perf: reduce SubList token string churn"
|
|
```
|
|
|
|
## Task 4: Pool `Match()` result building and remove cleanup copies
|
|
|
|
**Files:**
|
|
- Modify: `src/NATS.Server/Subscriptions/SubList.cs`
|
|
- Modify: `tests/NATS.Server.Core.Tests/Subscriptions/SubListAllocationGuardTests.cs`
|
|
- Modify: `tests/NATS.Server.Clustering.Tests/Routes/RouteSubscriptionTests.cs`
|
|
|
|
**Step 1: Replace temporary per-match collections**
|
|
- Rework `Match()` so temporary result building uses pooled builders or `ArrayBufferWriter<T>` instead of fresh nested `List<T>` allocations on every call.
|
|
- Preserve the current public `SubListResult` shape unless profiling proves a larger contract change is justified.
|
|
|
|
**Step 2: Remove `ToArray()` cleanup passes over `_remoteSubs`**
|
|
- Update `RemoveRemoteSubs(...)` and `RemoveRemoteSubsForAccount(...)` to avoid eager dictionary array copies.
|
|
- Ensure removal remains correct under the existing lock discipline.
|
|
|
|
**Step 3: Run cross-module regression**
|
|
- Run: `dotnet test tests/NATS.Server.Clustering.Tests/NATS.Server.Clustering.Tests.csproj --filter "FullyQualifiedName~RouteSubscriptionTests|FullyQualifiedName~RouteInterestIdempotencyTests" -c Release`
|
|
- Run: `dotnet test tests/NATS.Server.Gateways.Tests/NATS.Server.Gateways.Tests.csproj --filter "FullyQualifiedName~GatewayInterestModeTests|FullyQualifiedName~GatewayForwardingTests" -c Release`
|
|
- Expected: PASS.
|
|
|
|
**Step 4: Commit match-builder changes**
|
|
- Run:
|
|
```bash
|
|
git add src/NATS.Server/Subscriptions/SubList.cs tests/NATS.Server.Core.Tests/Subscriptions/SubListAllocationGuardTests.cs tests/NATS.Server.Clustering.Tests/Routes/RouteSubscriptionTests.cs
|
|
git commit -m "perf: pool SubList match builders and cleanup scans"
|
|
```
|
|
|
|
## Task 5: Add benchmark coverage, update docs, and run full verification
|
|
|
|
**Files:**
|
|
- Create: `tests/NATS.Server.Benchmark.Tests/CorePubSub/SubListMatchBenchmarks.cs`
|
|
- Modify: `Documentation/Subscriptions/SubList.md`
|
|
|
|
**Step 1: Add focused `SubList` benchmarks**
|
|
- Measure exact-match, wildcard-match, queue-sub, and remote-interest scenarios.
|
|
- Capture throughput and allocations before/after the refactor.
|
|
|
|
**Step 2: Update subscription documentation**
|
|
- Document the new routed-sub key model, the allocation strategy for trie matching, and any remaining intentional copies.
|
|
|
|
**Step 3: Run full verification**
|
|
- Run: `dotnet test tests/NATS.Server.Core.Tests/NATS.Server.Core.Tests.csproj -c Release`
|
|
- Run: `dotnet test tests/NATS.Server.Clustering.Tests/NATS.Server.Clustering.Tests.csproj -c Release`
|
|
- Run: `dotnet test tests/NATS.Server.Gateways.Tests/NATS.Server.Gateways.Tests.csproj -c Release`
|
|
- Run: `dotnet test tests/NATS.Server.Benchmark.Tests/NATS.Server.Benchmark.Tests.csproj --filter "FullyQualifiedName~SubList" -c Release`
|
|
- Expected: PASS; benchmark output shows reduced per-match allocations.
|
|
|
|
**Step 4: Commit the documentation and benchmark work**
|
|
- Run:
|
|
```bash
|
|
git add tests/NATS.Server.Benchmark.Tests/CorePubSub/SubListMatchBenchmarks.cs Documentation/Subscriptions/SubList.md
|
|
git commit -m "docs: record SubList allocation strategy"
|
|
```
|
|
|
|
## Task 6: Merge the verified SubList work back to `main` and clean up the worktree
|
|
|
|
**Files:**
|
|
- No source changes expected
|
|
|
|
**Step 1: Confirm the feature branch is fully verified before merge**
|
|
- Reuse the verification from Task 5. Do not merge if the core, clustering, gateway, or benchmark test commands are failing.
|
|
|
|
**Step 2: Merge `codex/sublist-allocation-reduction` back into `main`**
|
|
- Return to the primary repository worktree and run:
|
|
```bash
|
|
git checkout main
|
|
git pull
|
|
git merge codex/sublist-allocation-reduction
|
|
```
|
|
- Expected: the SubList optimization commits merge cleanly into `main`.
|
|
|
|
**Step 3: Re-run verification on the merged `main` branch**
|
|
- Run: `dotnet test tests/NATS.Server.Core.Tests/NATS.Server.Core.Tests.csproj --filter "FullyQualifiedName~SubList" -c Release`
|
|
- Run: `dotnet test tests/NATS.Server.Clustering.Tests/NATS.Server.Clustering.Tests.csproj --filter "FullyQualifiedName~RouteRemoteSubCleanupParityBatch2Tests|FullyQualifiedName~RouteInterestIdempotencyTests|FullyQualifiedName~RouteSubscriptionTests" -c Release`
|
|
- Run: `dotnet test tests/NATS.Server.Gateways.Tests/NATS.Server.Gateways.Tests.csproj --filter "FullyQualifiedName~GatewayInterestModeTests|FullyQualifiedName~GatewayInterestIdempotencyTests|FullyQualifiedName~GatewayForwardingTests" -c Release`
|
|
- Run: `dotnet test tests/NATS.Server.Benchmark.Tests/NATS.Server.Benchmark.Tests.csproj --filter "FullyQualifiedName~SubList" -c Release`
|
|
- Expected: PASS on merged `main`.
|
|
|
|
**Step 4: Delete the feature branch and remove the worktree**
|
|
- Run:
|
|
```bash
|
|
git branch -d codex/sublist-allocation-reduction
|
|
git worktree remove .worktrees/codex-sublist-allocation-reduction
|
|
```
|
|
- Expected: only `main` remains checked out in the primary workspace, and the temporary SubList worktree is removed.
|
|
|
|
## Task 7: Run the full benchmark suite per the benchmark README and update `benchmarks_comparison.md`
|
|
|
|
**Files:**
|
|
- Verify workflow against: `tests/NATS.Server.Benchmark.Tests/README.md`
|
|
- Modify: `benchmarks_comparison.md`
|
|
|
|
**Step 1: Run the benchmark test project using the repository-documented command**
|
|
- From the primary repository worktree on verified `main`, run:
|
|
```bash
|
|
dotnet test tests/NATS.Server.Benchmark.Tests \
|
|
--filter "Category=Benchmark" \
|
|
-v normal \
|
|
--logger "console;verbosity=detailed" 2>&1 | tee /tmp/bench-output.txt
|
|
```
|
|
- Expected: the full benchmark suite completes and writes detailed comparison output to `/tmp/bench-output.txt`.
|
|
|
|
**Step 2: Extract the benchmark comparison blocks from the captured output**
|
|
- Open `/tmp/bench-output.txt` and pull the side-by-side comparison blocks from the `Standard Output Messages` sections for:
|
|
- core pub-only
|
|
- core 1:1 pub/sub
|
|
- core fan-out
|
|
- core multi pub/sub
|
|
- request/reply
|
|
- JetStream publish
|
|
- JetStream consumption
|
|
|
|
**Step 3: Update `benchmarks_comparison.md`**
|
|
- Refresh:
|
|
- the benchmark run date on the first line
|
|
- the environment description if toolchain or machine details changed
|
|
- throughput, MB/s, ratio, and latency values in the benchmark tables
|
|
- summary assessments and key observations if the ratios materially changed
|
|
- Keep the narrative tied to measured results from `/tmp/bench-output.txt`; do not preserve stale claims that no longer match the numbers.
|
|
|
|
**Step 4: Commit the benchmark comparison refresh**
|
|
- Run:
|
|
```bash
|
|
git add benchmarks_comparison.md
|
|
git commit -m "docs: refresh benchmark comparison after SubList optimization"
|
|
```
|
|
- Expected: `main` contains the merged SubList optimization work plus the refreshed benchmark comparison document.
|
|
|
|
## Completion Checklist
|
|
- [ ] SubList optimization work was implemented in an isolated `.worktrees/codex-sublist-allocation-reduction` worktree on branch `codex/sublist-allocation-reduction`.
|
|
- [ ] `_remoteSubs` no longer uses composite string keys in runtime paths.
|
|
- [ ] Remote cleanup paths no longer depend on `Split('|')` and `_remoteSubs.ToArray()`.
|
|
- [ ] Trie traversal materially reduces `token.ToString()` churn.
|
|
- [ ] `Match()` uses pooled or allocation-lean temporary builders.
|
|
- [ ] Core, clustering, and gateway parity tests remain green.
|
|
- [ ] `Documentation/Subscriptions/SubList.md` explains the new key and match strategy.
|
|
- [ ] Verified SubList optimization commits were merged back into `main`.
|
|
- [ ] The full benchmark test project was run on merged `main` per `tests/NATS.Server.Benchmark.Tests/README.md`.
|
|
- [ ] `benchmarks_comparison.md` was updated to match the latest benchmark output.
|
|
|
|
## Concise Execution Checklist (Current Codebase)
|
|
- [ ] Start from the current repo root and leave unrelated MQTT worktree changes untouched.
|
|
- [ ] Create `.worktrees/codex-sublist-allocation-reduction` on `codex/sublist-allocation-reduction`.
|
|
- [ ] Verify the current SubList baseline with:
|
|
- `tests/NATS.Server.Core.Tests/Subscriptions/SubListGoParityTests.cs`
|
|
- `tests/NATS.Server.Core.Tests/Subscriptions/SubListCtorAndNotificationParityTests.cs`
|
|
- `tests/NATS.Server.Core.Tests/Subscriptions/SubListParityBatch2Tests.cs`
|
|
- `tests/NATS.Server.Clustering.Tests/Routes/RouteRemoteSubCleanupParityBatch2Tests.cs`
|
|
- `tests/NATS.Server.Clustering.Tests/Routes/RouteInterestIdempotencyTests.cs`
|
|
- `tests/NATS.Server.Clustering.Tests/Routes/RouteSubscriptionTests.cs`
|
|
- `tests/NATS.Server.Gateways.Tests/Gateways/GatewayInterestModeTests.cs`
|
|
- `tests/NATS.Server.Gateways.Tests/Gateways/GatewayForwardingTests.cs`
|
|
- [ ] Add new guard coverage in `tests/NATS.Server.Core.Tests/Subscriptions/SubListAllocationGuardTests.cs`.
|
|
- [ ] Refactor `src/NATS.Server/Subscriptions/SubList.cs` to use a new `src/NATS.Server/Subscriptions/RoutedSubKey.cs`.
|
|
- [ ] Reduce token string churn in `src/NATS.Server/Subscriptions/SubList.cs` and `src/NATS.Server/Subscriptions/SubjectMatch.cs`.
|
|
- [ ] Add a new benchmark file at `tests/NATS.Server.Benchmark.Tests/CorePubSub/SubListMatchBenchmarks.cs` beside the existing CorePubSub benchmarks.
|
|
- [ ] Update `Documentation/Subscriptions/SubList.md`.
|
|
- [ ] Run full verification for core, clustering, gateway, and focused SubList benchmark tests.
|
|
- [ ] Merge `codex/sublist-allocation-reduction` into `main` and re-run the merged verification.
|
|
- [ ] Run the full benchmark suite using `tests/NATS.Server.Benchmark.Tests/README.md` guidance and refresh `benchmarks_comparison.md`.
|