natsnet/docs/plans/2026-02-27-batch-18-server-core-design.md
Joseph Doherty dc3e162608 Add batch plans for batches 13-15, 18-22 (rounds 8-11)
Generated design docs and implementation plans via Codex for:
- Batch 13: FileStore Read/Query
- Batch 14: FileStore Write/Lifecycle
- Batch 15: MsgBlock + ConsumerFileStore
- Batch 18: Server Core
- Batch 19: Accounts Core
- Batch 20: Accounts Resolvers
- Batch 21: Events + MsgTrace
- Batch 22: Monitoring

All plans include mandatory verification protocol and anti-stub guardrails.
Updated batches.md with file paths and planned status.
2026-02-27 15:43:14 -05:00


# Batch 18 (Server Core) Design
**Date:** 2026-02-27
**Scope:** Design only for Batch 18 (`server/server.go`): 10 features and 8 tests.
## Context Snapshot
Batch metadata from PortTracker (`batch show 18`):
- Batch ID: `18`
- Name: `Server Core`
- Dependencies: batches `4`, `16`
- Go file focus: `golang/nats-server/server/server.go`
- Features: `10` (all currently `deferred`)
- Tests: `8` (all currently `deferred`)
Feature IDs in scope:
- `2982` `s2WriterOptions`
- `2987` `Server.logRejectedTLSConns`
- `3048` `Server.fetchAccount`
- `3066` `Server.getMonitoringTLSConfig`
- `3068` `Server.HTTPHandler`
- `3078` `tlsTimeout`
- `3088` `Server.numRemotes`
- `3112` `Server.readyForConnections`
- `3118` `Server.String`
- `3119` `setGoRoutineLabels`
Test IDs in scope:
- `2111` `TestMonitorHandler`
- `2167` `BenchmarkXMQTT`
- `2382` `TestNoRaceAccountAddServiceImportRace`
- `2467` `TestNoRaceRoutePerAccount`
- `2468` `TestNoRaceRoutePerAccountSubWithWildcard`
- `2481` `TestNoRaceWSNoCorruptionWithFrameSizeLimit`
- `2819` `TestRouteIPResolutionAndRouteToSelf`
- `2897` `TestInsecureSkipVerifyWarning`
Repository progress snapshot (`report summary`):
- Features: `1271 verified`, `2377 deferred`, `24 n_a`, `1 stub`
- Unit tests: `430 verified`, `2640 deferred`, `187 n_a`
- Overall: `1924/6942 (27.7%)`
## Problem Statement
Batch 18 is a narrow but high-leverage seam in `NatsServer`: readiness gates, monitoring handler exposure, TLS timeout/monitor TLS helper behavior, account fetch path naming/parity, and goroutine labeling parity with Go.
The highest risk is false progress from placeholder tests and name-mismatch implementations:
1. Several mapped methods are partially covered by equivalent but differently named C# methods, so the Roslyn audit can still leave their IDs `deferred`.
2. Existing `ImplBacklog` test files contain low-signal placeholders (string/constant assertions) that can pass without exercising server behavior.
3. Batch 18 tests include benchmark and norace wrappers that may need explicit `deferred` or `n_a` handling instead of fake ports.
## Approaches
### Approach A: Minimal audit-alignment wrappers only
Add exact method names/signatures expected by mappings and delegate to existing logic.
- Pros: fastest path to audit alignment for deferred features.
- Cons: can miss behavior gaps (for example monitor handler lifecycle and readiness semantics).
### Approach B (Recommended): Behavior-first parity by helper cluster
Implement exact mapped methods and close semantic gaps in two feature clusters, then port/triage the 8 tests with strict anti-stub evidence.
- Pros: balances audit alignment and real behavior parity; lower regression risk.
- Cons: slightly more effort than wrappers-only.
### Approach C: Test-first only, defer most features
Start with Batch 18 tests and defer remaining features when they do not immediately block tests.
- Pros: test porting starts immediately.
- Cons: poor feature closure and likely repeated defer/audit churn.
## Recommended Design
### 1. Architecture and File Boundaries
Keep all Batch 18 work in existing `NatsServer` partials and mapped backlog test classes:
- Feature implementation files:
  - `dotnet/src/ZB.MOM.NatsNet.Server/NatsServer.Init.cs`
  - `dotnet/src/ZB.MOM.NatsNet.Server/NatsServer.Accounts.cs`
  - `dotnet/src/ZB.MOM.NatsNet.Server/NatsServer.Listeners.cs`
  - `dotnet/src/ZB.MOM.NatsNet.Server/NatsServer.Lifecycle.cs`
  - `dotnet/src/ZB.MOM.NatsNet.Server/NatsServer.cs` (only if a shared type/field is required)
- Test files:
  - `dotnet/tests/ZB.MOM.NatsNet.Server.Tests/ImplBacklog/MonitoringHandlerTests.Impltests.cs`
  - `dotnet/tests/ZB.MOM.NatsNet.Server.Tests/ImplBacklog/MqttExternalTests.Impltests.cs`
  - `dotnet/tests/ZB.MOM.NatsNet.Server.Tests/ImplBacklog/ConcurrencyTests1.Impltests.cs`
  - `dotnet/tests/ZB.MOM.NatsNet.Server.Tests/ImplBacklog/ConcurrencyTests2.Impltests.cs`
  - `dotnet/tests/ZB.MOM.NatsNet.Server.Tests/ImplBacklog/RouteHandlerTests.Impltests.cs`
  - `dotnet/tests/ZB.MOM.NatsNet.Server.Tests/ImplBacklog/NatsServerTests.Impltests.cs`
### 2. Feature Grouping Strategy (<=20 per group)
- **Group A (6 features):** `2982,2987,3066,3068,3078,3119`
  - Compression writer options helper, TLS rejection log loop parity, monitoring TLS callback, HTTP handler getter lifecycle, TLS handshake timeout helper, goroutine labels helper.
- **Group B (4 features):** `3048,3088,3112,3118`
  - Account fetch parity helper, `numRemotes` internal parity, readiness check parity, `String()` parity.
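
The grouping can be sanity-checked mechanically: every in-scope feature ID should land in exactly one group. The following is a minimal sketch of such a check; `checkPartition` is a hypothetical helper written for illustration and is not part of PortTracker.

```go
package main

import (
	"fmt"
	"sort"
)

// checkPartition (hypothetical helper) reports scope IDs that appear in no
// group, IDs assigned to more than one group, and grouped IDs outside scope.
func checkPartition(scope []int, groups map[string][]int) (missing, duplicated, unknown []int) {
	inScope := make(map[int]bool, len(scope))
	for _, id := range scope {
		inScope[id] = true
	}
	// Count how many times each ID is assigned across all groups.
	seen := make(map[int]int)
	for _, ids := range groups {
		for _, id := range ids {
			seen[id]++
		}
	}
	for _, id := range scope {
		if seen[id] == 0 {
			missing = append(missing, id)
		}
	}
	for id, n := range seen {
		if n > 1 {
			duplicated = append(duplicated, id)
		}
		if !inScope[id] {
			unknown = append(unknown, id)
		}
	}
	// Map iteration order is random, so sort for stable output.
	sort.Ints(duplicated)
	sort.Ints(unknown)
	return missing, duplicated, unknown
}

func main() {
	// Feature IDs and groups copied from this document.
	scope := []int{2982, 2987, 3048, 3066, 3068, 3078, 3088, 3112, 3118, 3119}
	groups := map[string][]int{
		"A": {2982, 2987, 3066, 3068, 3078, 3119},
		"B": {3048, 3088, 3112, 3118},
	}
	missing, duplicated, unknown := checkPartition(scope, groups)
	fmt.Println("missing:", missing, "duplicated:", duplicated, "unknown:", unknown)
}
```

With the IDs listed above, all three result slices come back empty, which is the condition the grouping is meant to satisfy.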
### 3. Test Porting/Triage Strategy
- **Test Group T1 (behavioral, target for verify):** `2111,2819,2897`
  - Must execute real server behavior (monitor handler lifecycle, route self-resolution logging path, insecure TLS warning path).
- **Test Group T2 (non-unit or race wrappers):** `2167,2382,2467,2468,2481`
  - Evaluate for direct xUnit viability.
  - If not viable without benchmark tooling/race harness/live infra, keep `deferred` (or set `n_a` when clearly benchmark-only) with explicit reason and zero stubs.
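
The T2 rule reads as a small decision function. The sketch below is one possible encoding, not the project's tooling: the `TestTraits` type, the `triage` function, and the `"port"` outcome label are assumptions introduced for illustration, as is the choice to check the benchmark-only case first. Only `deferred` and `n_a` are status names taken from this document.

```go
package main

import "fmt"

// TestTraits (hypothetical) records what a Go test needs in order to run.
type TestTraits struct {
	BenchmarkOnly    bool // measures performance only, asserts no behavior
	NeedsRaceHarness bool // depends on Go race-detector style tooling
	NeedsLiveInfra   bool // depends on external services or live servers
}

// triage returns an outcome and a reason. A non-"port" outcome always
// carries a reason, matching the "explicit reason and zero stubs" rule.
func triage(t TestTraits) (outcome, reason string) {
	switch {
	case t.BenchmarkOnly:
		return "n_a", "benchmark-only: no behavioral assertion to port"
	case t.NeedsRaceHarness:
		return "deferred", "needs a race harness that xUnit does not provide"
	case t.NeedsLiveInfra:
		return "deferred", "needs live infrastructure"
	default:
		return "port", ""
	}
}

func main() {
	// Illustrative inputs only; the real traits of each T2 test
	// still have to be established by reading the Go source.
	samples := []TestTraits{
		{BenchmarkOnly: true},
		{NeedsRaceHarness: true},
		{},
	}
	for _, s := range samples {
		outcome, reason := triage(s)
		fmt.Printf("%-8s %s\n", outcome, reason)
	}
}
```

Returning the reason alongside the outcome keeps the classification auditable: a test can only leave the porting path with blocker text attached.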
### 4. Verification Design (adapted from Batch 0 protocol)
Execution must use hard gates after each feature/test group:
1. Per-feature loop: read Go function -> implement C# -> build -> run related tests.
2. Stub scans over touched feature and test files.
3. Build/test gates before any status promotion.
4. Status updates in evidence-backed chunks (`<=15` IDs).
5. Task checkpoints with full build + full unit test + commit before next task.
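
Step 2 (stub scans) can be approximated with a line-based pattern scan over the touched C# files. This is a minimal sketch under stated assumptions: the three patterns below are illustrative guesses at what a forbidden stub looks like, not the project's actual guardrail list, and `scanForStubs` is a hypothetical name.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// stubPatterns are assumed examples of stub markers in C# source.
// The real guardrail list lives in the implementation plan, not here.
var stubPatterns = []*regexp.Regexp{
	regexp.MustCompile(`throw\s+new\s+NotImplementedException`),
	regexp.MustCompile(`Assert\.True\(\s*true\s*\)`),
	regexp.MustCompile(`//\s*TODO`),
}

// Finding is one suspicious line, numbered from 1.
type Finding struct {
	Line int
	Text string
}

// scanForStubs returns every line of src that matches at least one pattern.
func scanForStubs(src string) []Finding {
	var out []Finding
	for i, line := range strings.Split(src, "\n") {
		for _, p := range stubPatterns {
			if p.MatchString(line) {
				out = append(out, Finding{Line: i + 1, Text: strings.TrimSpace(line)})
				break // report each line once, even if several patterns match
			}
		}
	}
	return out
}

func main() {
	src := "public bool ReadyForConnections()\n{\n    throw new NotImplementedException();\n}\n"
	for _, f := range scanForStubs(src) {
		fmt.Printf("line %d: %s\n", f.Line, f.Text)
	}
}
```

A scan like this only catches textual markers. Literal-only assertions and tests with no act step still need the review described under the anti-stub guardrails.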
### 5. Risks and Mitigations
1. **Mapped-name vs existing-name mismatch**
   Mitigation: add exact mapped methods (or wrappers) and keep behavior in one canonical implementation.
2. **Shallow placeholder tests**
   Mitigation: enforce anti-stub guardrails that reject literal-only assertions and no-act/no-behavior tests.
3. **Benchmark/race test portability limits**
   Mitigation: explicitly classify as `deferred`/`n_a` with concrete blocker text instead of fake-pass tests.
## Success Criteria
1. All 10 Batch 18 features are `verified` or intentionally `deferred`/`n_a` with evidence and reasons.
2. All 8 Batch 18 tests are `verified` or intentionally `deferred`/`n_a` with evidence and reasons.
3. No forbidden stub patterns are introduced in touched feature/test files.
4. Batch updates are auditable and bounded (<=15 IDs per update command).
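
The `<=15` bound in criterion 4 amounts to splitting an ID list into fixed-size chunks before issuing update commands. A minimal sketch follows; `chunkIDs` is a hypothetical helper and the ID values in `main` are made up to show a list longer than one chunk.

```go
package main

import "fmt"

// chunkIDs splits ids into consecutive slices of at most max elements.
// It returns nil when max is not positive.
func chunkIDs(ids []int, max int) [][]int {
	if max <= 0 {
		return nil
	}
	var chunks [][]int
	for start := 0; start < len(ids); start += max {
		end := start + max
		if end > len(ids) {
			end = len(ids)
		}
		chunks = append(chunks, ids[start:end])
	}
	return chunks
}

func main() {
	// 32 made-up IDs, purely to show the split.
	ids := make([]int, 32)
	for i := range ids {
		ids[i] = 5000 + i
	}
	for i, c := range chunkIDs(ids, 15) {
		fmt.Printf("update %d: %d IDs (%d..%d)\n", i+1, len(c), c[0], c[len(c)-1])
	}
}
```

Batch 18 itself has only 10 features and 8 tests, so each status fits in a single update if features and tests are updated separately.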
## Non-Goals
1. No implementation in this design document.
2. No broad refactoring outside Batch 18 server-core scope.
3. No fake benchmark/race replacements that produce green tests without behavioral coverage.