Joseph Doherty dc3e162608 Add batch plans for batches 13-15, 18-22 (rounds 8-11)

Generated design docs and implementation plans via Codex for:
- Batch 13: FileStore Read/Query
- Batch 14: FileStore Write/Lifecycle
- Batch 15: MsgBlock + ConsumerFileStore
- Batch 18: Server Core
- Batch 19: Accounts Core
- Batch 20: Accounts Resolvers
- Batch 21: Events + MsgTrace
- Batch 22: Monitoring

All plans include mandatory verification protocol and anti-stub guardrails.
Updated batches.md with file paths and planned status.

2026-02-27 15:43:14 -05:00


Batch 21 Events + MsgTrace Design

Date: 2026-02-27
Batch: 21 (Events + MsgTrace)
Scope: Design only. No implementation in this document.

Problem

Batch 21 contains high-fanout eventing and tracing behavior used by system account messaging, server stats, admin requests, and distributed message tracing.

Current tracker scope, as reported by batch show 21:

Features: 118 (all deferred)
Tests: 9 (all deferred)
Dependencies: batches 18, 19
Go sources: server/events.go, server/msgtrace.go

Context Findings

- Existing .NET code already has many event and trace DTOs/types in:
  - dotnet/src/ZB.MOM.NatsNet.Server/Events/EventTypes.cs
  - dotnet/src/ZB.MOM.NatsNet.Server/MessageTrace/MsgTraceTypes.cs
- Runtime behavior methods mapped in Batch 21 are largely missing from NatsServer and ClientConnection.
- Existing backlog tests for Events/MsgTrace contain placeholder-style assertions and must be replaced with behavior-valid tests for the mapped Batch 21 IDs.
- The test show mapping confirms the target class and methods for all 9 tests:
  - Events: 6 methods in EventsHandlerTests
  - MsgTrace: 2 methods in MessageTracerTests
  - Concurrency: 1 method in ConcurrencyTests1

Assumptions

Batch 21 work begins only after dependencies (batches 18 and 19) are complete enough to compile and run related tests.
We preserve existing project structure and naming patterns (NatsServer.*.cs, ClientConnection.*.cs, ImplBacklog/*.Impltests.cs).
No new integration infrastructure is introduced in this batch; infra-blocked items remain deferred with explicit reasons.

Approaches

Approach A: Monolithic implementation in existing large files

Implement all methods directly in NatsServer.cs, ClientConnection.cs, ClientTypes.cs, and NatsServerTypes.cs.

Trade-offs:

Pros: Minimal file creation.
Cons: Very high merge conflict risk, poor reviewability, difficult verification per feature group.

Approach B (Recommended): Partial-file segmentation by runtime domain

Add focused partial/runtime files for events and msgtrace behavior while leaving DTO/type files intact.

Trade-offs:

Pros: Enables clear feature-group boundaries, easier per-group build/test loops, lower risk of accidental regressions.
Cons: Requires a few new files and up-front structure decisions.
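
As a rough illustration of what Approach B means in C#, the sketch below splits one class across several partial declarations, one per runtime domain. It assumes NatsServer is (or can be made) a partial class; the members PublishEvent, HandleSystemRequest, and Journal are hypothetical stand-ins, not names from the Go source or the existing .NET code. In the real project each part would live in its own file.

```csharp
using System;
using System.Collections.Generic;

// Usage: all parts compile into a single NatsServer type.
var server = new NatsServer();
server.PublishEvent("demo.event");        // declared in the "Events" part
server.HandleSystemRequest("demo.req");   // declared in the "Events.System" part
Console.WriteLine(string.Join(" | ", server.Journal));

// --- NatsServer.cs (existing file; shared state stays here) ---
public sealed partial class NatsServer
{
    // Hypothetical shared state, visible to every partial declaration.
    private readonly List<string> _journal = new();
    public IReadOnlyList<string> Journal => _journal;
}

// --- NatsServer.Events.cs (new file; core eventing) ---
public sealed partial class NatsServer
{
    // Hypothetical stand-in for an event send path.
    public void PublishEvent(string subject) => _journal.Add($"event:{subject}");
}

// --- NatsServer.Events.System.cs (new file; system requests) ---
public sealed partial class NatsServer
{
    // Hypothetical stand-in for a system request handler.
    public void HandleSystemRequest(string subject) => _journal.Add($"request:{subject}");
}
```

Because private members are shared across the partial declarations, each feature group can be reviewed and built as its own file without widening the visibility of existing state.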

Approach C: Test-only first, then backfill features

Start by rewriting the 9 tests to force implementation behavior.

Trade-offs:

Pros: Fast feedback for mapped tests.
Cons: The batch has 118 features and only 9 mapped tests; test-first alone leaves a large behavior surface unvalidated.

Recommended Design

Use Approach B.

Code organization

Implement runtime behavior in small, targeted files:

- dotnet/src/ZB.MOM.NatsNet.Server/NatsServer.Events.cs (core eventing/send loops)
- dotnet/src/ZB.MOM.NatsNet.Server/NatsServer.Events.System.cs (system subscriptions/requests)
- dotnet/src/ZB.MOM.NatsNet.Server/NatsServer.Events.Admin.cs (reload/kick/ldm/debug/OCSP event paths)
- dotnet/src/ZB.MOM.NatsNet.Server/ClientConnection.MsgTrace.cs (client-side trace enable/init helpers)
- dotnet/src/ZB.MOM.NatsNet.Server/MessageTrace/MsgTraceRuntime.cs (runtime trace mutation/send behavior)

Plus targeted edits in:

- dotnet/src/ZB.MOM.NatsNet.Server/NatsServerTypes.cs (ServerInfo capability helpers)
- dotnet/src/ZB.MOM.NatsNet.Server/ClientTypes.cs (ClientInfo helper projections)
- dotnet/src/ZB.MOM.NatsNet.Server/Accounts/Account.cs (Account.statz)

Behavior domains

Domain 1: Capability and metadata helpers (ServerInfo, ClientInfo, hashing helpers).
Domain 2: Internal system message send/receive loops and event state lifecycle.
Domain 3: Remote server/account tracking and statsz/advisory publication.
Domain 4: System subscription wiring and request handlers (connsz/statsz/idz/nsubs/reload/kick/ldm).
Domain 5: OCSP advisory events and misc utility wrappers.
Domain 6: Message trace runtime (trace enablement, header extraction, event aggregation, publish path).

Test design

Replace mapped placeholder tests with behavior checks that assert:

System subscription registration and unsubscribe behavior.
Connection update timer/sweep behavior under local/remote account changes.
Remote latency update validation and bad payload handling.
MsgTrace connection-name normalization and trace-header parsing correctness.
No-race JetStream compact scenario behavior for mapped test ID 2412.
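
To make the difference between a placeholder assertion and a behavior check concrete, here is a minimal sketch for the trace-header parsing item. Everything in it is an assumption for illustration: TraceHeaders.TryGetTraceDest is a hypothetical helper, not the port's actual API, and the header block layout (a NATS/1.0 preamble followed by "Name: value" lines) and the Nats-Trace-Dest header name are assumed rather than taken from this document. Exact-case matching of the header name is also an assumption.

```csharp
using System;

// Placeholder-style check: passes no matter what the parser does.
Check(true, "placeholder always passes");

// Behavior checks: each one fails if the parser is wrong.
string block = "NATS/1.0\r\nNats-Trace-Dest: trace.inbox\r\nX-Other: 1\r\n\r\n";
Check(TraceHeaders.TryGetTraceDest(block, out var dest) && dest == "trace.inbox",
      "destination is extracted");
Check(!TraceHeaders.TryGetTraceDest("NATS/1.0\r\nX-Other: 1\r\n\r\n", out _),
      "missing trace header is rejected");
Check(!TraceHeaders.TryGetTraceDest("garbage", out _),
      "malformed block is rejected");
Console.WriteLine("all checks passed");

static void Check(bool ok, string what)
{
    if (!ok) throw new InvalidOperationException($"failed: {what}");
}

// Hypothetical parser used only to give the checks something real to exercise.
public static class TraceHeaders
{
    private const string Preamble = "NATS/1.0\r\n";     // assumed header preamble
    private const string TraceDest = "Nats-Trace-Dest"; // assumed header name

    public static bool TryGetTraceDest(string block, out string dest)
    {
        dest = string.Empty;
        if (!block.StartsWith(Preamble, StringComparison.Ordinal)) return false;

        foreach (var line in block.Substring(Preamble.Length).Split("\r\n"))
        {
            int colon = line.IndexOf(':');
            if (colon <= 0) continue;                                   // not a "Name: value" line
            if (line.Substring(0, colon).Trim() != TraceDest) continue; // different header
            dest = line.Substring(colon + 1).Trim();
            return dest.Length > 0;                                     // empty value counts as absent
        }
        return false;
    }
}
```

The first check would survive any regression, which is the failure mode the anti-stub gates are meant to catch; the remaining three each pin down one observable behavior.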

Execution model

Port features in 6 groups (<=20 IDs each), then tests in 2 waves, with strict per-feature verification and anti-stub gates.
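
The group count follows from the cap: 118 features at no more than 20 per group needs at least 6 groups. The sketch below only checks that arithmetic. It uses placeholder IDs 1..118 and sequential chunking, whereas the design groups features by behavior domain; FeatureGroups.Plan is a hypothetical helper, and Enumerable.Chunk requires .NET 6 or later.

```csharp
using System;
using System.Linq;

int[][] groups = FeatureGroups.Plan(featureCount: 118, maxGroupSize: 20);
string sizes = string.Join(", ", groups.Select(g => g.Length));
Console.WriteLine($"groups: {groups.Length}");  // groups: 6
Console.WriteLine($"sizes: {sizes}");           // sizes: 20, 20, 20, 20, 20, 18

public static class FeatureGroups
{
    // Splits placeholder IDs 1..featureCount into consecutive groups of at most maxGroupSize.
    // The real feature IDs and their domain grouping come from the tracker, not from this helper.
    public static int[][] Plan(int featureCount, int maxGroupSize) =>
        Enumerable.Range(1, featureCount).Chunk(maxGroupSize).ToArray();
}
```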

Risks and Mitigations

- Risk: Placeholder tests can pass while behavior is wrong.
  - Mitigation: Mandatory anti-stub checks and per-test evidence before status updates.
- Risk: The large eventing surface can regress unrelated server behavior.
  - Mitigation: Build gate after each feature group, plus a full unit-test checkpoint between tasks.
- Risk: Some tests require a runtime topology that is not available in the unit test harness.
  - Mitigation: Keep those tests deferred with an explicit blocker reason; do not stub.

Success Criteria

All 118 Batch 21 feature IDs move through stub -> complete -> verified, and only with build and test evidence for each transition.
All 9 Batch 21 test IDs are either verified with real assertions or deferred with an explicit blocker reason.
No new stubs (NotImplementedException, empty bodies, TODO placeholders) appear in touched feature/test files.
Batch 21 can be closed with batch complete 21 once all IDs satisfy the status requirements.
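
The no-new-stubs criterion lends itself to a mechanical check. The following is a minimal sketch of such a scan, assuming a simple regex pass over source text is acceptable; AntiStubScan and its three rules are hypothetical and are not the verification protocol referenced by the plans. The empty-body rule is a heuristic and will also flag intentionally empty methods or constructors, so hits need human review.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

string stubbed = "public void SendStatsz() { }\n" +
                 "public int Count() => throw new NotImplementedException(); // TODO port";
string ported = "public int Add(int a, int b) { return a + b; }";

Console.WriteLine(string.Join(", ", AntiStubScan.FindStubs(stubbed)));
// NotImplementedException, TODO placeholder, empty body
Console.WriteLine(AntiStubScan.FindStubs(ported).Count); // 0

public static class AntiStubScan
{
    // Hypothetical rules mirroring the three stub kinds named in the success criteria.
    private static readonly (string Name, Regex Pattern)[] Rules =
    {
        ("NotImplementedException", new Regex(@"\bNotImplementedException\b")),
        ("TODO placeholder", new Regex(@"//\s*TODO\b", RegexOptions.IgnoreCase)),
        ("empty body", new Regex(@"\)\s*\{\s*\}")),
    };

    // Returns the name of every rule that matches somewhere in the given source text.
    public static List<string> FindStubs(string source)
    {
        var hits = new List<string>();
        foreach (var (name, pattern) in Rules)
        {
            if (pattern.IsMatch(source)) hits.Add(name);
        }
        return hits;
    }
}
```

Run against only the files touched in a feature group, a scan like this gives a quick gate before the build and test evidence is collected.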
