Batch 22 Monitoring Design
Date: 2026-02-27
Batch: 22 (Monitoring)
Scope: Design only. No implementation in this document.
Problem
Batch 22 ports NATS server monitoring behavior from server/monitor.go into .NET. The batch is large and mixed:
- Features: 70 (all currently deferred)
- Tests: 29 (all currently deferred)
- Dependencies: batches 18, 19
- Go source: golang/nats-server/server/monitor.go
This batch includes both core data endpoints (/connz, /routez, /subsz, /varz) and broader operational surfaces (/gatewayz, /leafz, /accountz, /jsz, /healthz, /raftz, /debug/vars, profiling).
Context Findings
From tracker and codebase inspection:
- Batch metadata confirmed with:
  - dotnet run --project tools/NatsNet.PortTracker -- batch show 22 --db porting.db
  - dotnet run --project tools/NatsNet.PortTracker -- batch list --db porting.db
  - dotnet run --project tools/NatsNet.PortTracker -- report summary --db porting.db
- .NET currently has monitoring constants and partial DTOs (MonitorTypes.cs, MonitorSortOptions.cs, monitor paths in NatsServer.Listeners.cs), but most mapped runtime methods are not yet implemented.
- Existing mapped test files are mostly placeholder-style in ImplBacklog and need behavioral rewrites for the 29 mapped test IDs.
- Batch 22 tests span multiple classes (MonitoringHandlerTests, RouteHandlerTests, LeafNodeHandlerTests, AccountTests, EventsHandlerTests, JetStreamJwtTests, ConfigReloaderTests), so verification cannot be isolated to one test class.
Constraints and Success Criteria
- Must preserve Go behavior semantics while writing idiomatic .NET 10 C#.
- Must follow project standards (xUnit 3, Shouldly, NSubstitute; no FluentAssertions/Moq).
- Must avoid stubs and fake-pass tests.
- Feature status can move to verified only after the related test gate is green.
- Group work in chunks no larger than ~20 features.
Success looks like:
- 70 features implemented and verified (or explicitly deferred with reason where truly blocked).
- 29 mapped tests verified with real Arrange/Act/Assert behavior, not placeholders.
- Batch can be closed with batch complete 22 once statuses satisfy PortTracker rules.
Approaches
Approach A: Single-file monitoring implementation
Implement all monitoring behavior in one or two large files (for example, NatsServer.Monitoring.cs + one test file wave).
Trade-offs:
- Pros: fewer new files.
- Cons: poor reviewability, high merge risk, difficult to verify incrementally, very high chance of hidden stubs in large diff.
Approach B (Recommended): Domain-segmented partials and DTO blocks
Split monitoring into focused runtime domains with dedicated partial files and matching test waves.
Trade-offs:
- Pros: matches the natural endpoint domains in monitor.go, enables strong per-group build/test gating, and eases status evidence collection.
- Cons: adds several files and requires a deliberate file map upfront.
Approach C: Test-first across all 29 tests before feature work
Rewrite all 29 tests first, then implement features until all pass.
Trade-offs:
- Pros: very fast signal on regressions.
- Cons: the test set under-represents some large feature surfaces (healthz, raftz, gateway/account internals), so feature quality still needs per-feature validation loops.
Recommended Design
Use Approach B.
Architecture
Implement monitoring in six domain slices, each with a bounded feature group:
- Connz core + query decoders + connz handler
- Routez/Subsz/Stacksz/IPQueuesz
- Varz/root/runtime/config helpers
- Gatewayz + Leafz + AccountStatz + response helpers + closed-state rendering
- Accountz + JSz account/detail + Jsz endpoint
- Healthz + expvarz/profilez + raftz
Each slice follows the same loop: port features -> build -> run related tests -> stub scan -> status updates.
Proposed File Map
Primary production files to create/modify:
- Create/Modify: dotnet/src/ZB.MOM.NatsNet.Server/NatsServer.Monitoring.Connz.cs
- Create/Modify: dotnet/src/ZB.MOM.NatsNet.Server/NatsServer.Monitoring.RouteSub.cs
- Create/Modify: dotnet/src/ZB.MOM.NatsNet.Server/NatsServer.Monitoring.Varz.cs
- Create/Modify: dotnet/src/ZB.MOM.NatsNet.Server/NatsServer.Monitoring.GatewayLeaf.cs
- Create/Modify: dotnet/src/ZB.MOM.NatsNet.Server/NatsServer.Monitoring.AccountJsz.cs
- Create/Modify: dotnet/src/ZB.MOM.NatsNet.Server/NatsServer.Monitoring.HealthRaft.cs
- Modify: dotnet/src/ZB.MOM.NatsNet.Server/Monitor/MonitorTypes.cs
- Modify: dotnet/src/ZB.MOM.NatsNet.Server/Monitor/MonitorSortOptions.cs
- Modify: dotnet/src/ZB.MOM.NatsNet.Server/ClientTypes.cs (for ClosedState.String parity)
- Modify (if needed for DTO placement):
  - dotnet/src/ZB.MOM.NatsNet.Server/Routes/RouteTypes.cs
  - dotnet/src/ZB.MOM.NatsNet.Server/Gateway/GatewayTypes.cs
  - dotnet/src/ZB.MOM.NatsNet.Server/LeafNode/LeafNodeTypes.cs
Primary mapped test files:
- dotnet/tests/ZB.MOM.NatsNet.Server.Tests/ImplBacklog/MonitoringHandlerTests.Impltests.cs
- dotnet/tests/ZB.MOM.NatsNet.Server.Tests/ImplBacklog/RouteHandlerTests.Impltests.cs
- dotnet/tests/ZB.MOM.NatsNet.Server.Tests/ImplBacklog/LeafNodeHandlerTests.Impltests.cs
- dotnet/tests/ZB.MOM.NatsNet.Server.Tests/ImplBacklog/AccountTests.Impltests.cs
- dotnet/tests/ZB.MOM.NatsNet.Server.Tests/ImplBacklog/EventsHandlerTests.Impltests.cs
- dotnet/tests/ZB.MOM.NatsNet.Server.Tests/ImplBacklog/JetStreamJwtTests.Impltests.cs
- dotnet/tests/ZB.MOM.NatsNet.Server.Tests/ImplBacklog/ConfigReloaderTests.Impltests.cs
Data Flow
- HTTP monitor request -> query decode/validation -> domain endpoint function (Connz, Routez, Varz, etc.) -> DTO response projection -> unified response writer.
- Endpoint functions read server/account/client state under the required locks and project immutable response objects.
- Sort and pagination apply after candidate gathering, matching Go behavior by endpoint.
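The decode -> gather -> sort -> paginate -> project flow above could be sketched as follows. All names here (ConnzOptions, ConnInfo, ConnzPipeline) are illustrative assumptions, not the project's actual API:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical query options and connection DTO for a connz-style endpoint.
public sealed record ConnzOptions(int Offset, int Limit, string Sort);
public sealed record ConnInfo(ulong Cid, long PendingBytes);

public static class ConnzPipeline
{
    // Decode and validate query parameters before touching server state.
    public static ConnzOptions Decode(IDictionary<string, string> query)
    {
        int offset = query.TryGetValue("offset", out var o) ? int.Parse(o) : 0;
        int limit = query.TryGetValue("limit", out var l) ? int.Parse(l) : 1024;
        string sort = query.TryGetValue("sort", out var s) ? s : "cid";
        if (offset < 0 || limit <= 0)
            throw new ArgumentException("invalid offset/limit");
        return new ConnzOptions(offset, limit, sort);
    }

    // Sort and paginate only after gathering all candidates, mirroring the
    // per-endpoint flow described above.
    public static IReadOnlyList<ConnInfo> Project(
        IEnumerable<ConnInfo> candidates, ConnzOptions opts)
    {
        IEnumerable<ConnInfo> sorted = opts.Sort switch
        {
            "pending" => candidates.OrderByDescending(c => c.PendingBytes),
            _ => candidates.OrderBy(c => c.Cid),
        };
        return sorted.Skip(opts.Offset).Take(opts.Limit).ToList();
    }
}
```

The real implementation must match monitor.go's sort keys and defaults per endpoint; this sketch only shows the ordering of the stages.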
Error Handling Strategy
- Invalid query params return bad request through shared response helper.
- Unsupported combinations (for example, sort options that are invalid for the requested connection state) return explicit errors.
- Tests blocked on unavailable infrastructure stay deferred with an explicit reason rather than receiving placeholder implementations.
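A minimal sketch of the shared bad-request helper, assuming a hypothetical MonitorErrors type and a plain (status, body) tuple in place of the project's actual response writer:

```csharp
using System;
using System.Text.Json;

// Illustrative shape for a shared monitor error helper; the real writer and
// error body format should follow monitor.go. All names are assumptions.
public static class MonitorErrors
{
    public static (int Status, string Body) BadRequest(string message)
    {
        // Invalid query parameters surface as HTTP 400 with a JSON error body.
        string body = JsonSerializer.Serialize(new { error = message });
        return (400, body);
    }
}
```

Routing every validation failure through one helper keeps the error shape identical across all six domain slices.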
Testing Strategy
- Rewrite only the 29 mapped test IDs as behavior-valid tests, class by class.
- Each feature group uses targeted test filters tied to that domain.
- Keep full unit test checkpoint between tasks to catch regressions outside monitor-specific tests.
Risks and Mitigations
- Risk: fake tests pass while behavior is unimplemented.
- Mitigation: explicit anti-stub scans for placeholder signatures and literal-only assertions.
- Risk: large healthz/raftz surfaces with sparse mapped tests.
- Mitigation: per-feature read/port/build loop plus grouped sanity tests and status evidence requirements.
- Risk: lock-sensitive endpoint logic causes race regressions.
- Mitigation: keep route/leaf/account race tests in the required per-group gate.
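The anti-stub scan mitigation could look roughly like this; the patterns and the StubScan name are assumptions for illustration, not the real guardrail tooling:

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical scan that flags placeholder test bodies: literal-only
// assertions and leftover TODO markers.
public static class StubScan
{
    private static readonly Regex[] PlaceholderPatterns =
    {
        new Regex(@"Assert\.True\(true\)"),     // literal-only xUnit assertion
        new Regex(@"true\.ShouldBeTrue\(\)"),   // literal-only Shouldly assertion
        new Regex(@"//\s*TODO", RegexOptions.IgnoreCase),
    };

    public static bool LooksLikeStub(string testSource)
    {
        foreach (var pattern in PlaceholderPatterns)
            if (pattern.IsMatch(testSource)) return true;
        return false;
    }
}
```

Running such a scan over the rewritten ImplBacklog files before marking statuses gives cheap evidence that a passing test actually exercises behavior.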
Design Approval Basis
This design is based on the explicit user-provided constraints (planning-only, mandatory guardrails, group size limits, and required tracker commands) and is ready for implementation planning with writeplan.