Batch 25 Gateways Design
Date: 2026-02-27
Batch: 25 (Gateways)
Scope: Design only. No implementation in this document.
Problem
Batch 25 ports gateway connection handling from golang/nats-server/server/gateway.go into the .NET server.
- Features: 86 (all currently deferred)
- Tests: 59 (all currently deferred)
- Dependencies: batches 19 and 23
- Go source: server/gateway.go
- Batch status: pending
This batch is one of the highest fan-in networking areas: it touches server startup, inbound/outbound connection lifecycle, route gossip, account interest propagation, and reply-subject mapping.
Context Findings
Collected with:
- /usr/local/share/dotnet/dotnet run --project tools/NatsNet.PortTracker -- batch show 25 --db porting.db
- /usr/local/share/dotnet/dotnet run --project tools/NatsNet.PortTracker -- batch list --db porting.db
- /usr/local/share/dotnet/dotnet run --project tools/NatsNet.PortTracker -- report summary --db porting.db
- /usr/local/share/dotnet/dotnet run --project tools/NatsNet.PortTracker -- batch ready --db porting.db
Repository observations:
- dotnet/src/ZB.MOM.NatsNet.Server/Gateway/GatewayTypes.cs already contains gateway data types, but method behavior is mostly unimplemented and includes at least one explicit TODO (GwReplyMapping.Get).
- NatsServer.Init.cs still comments gateway initialization as deferred (s.NewGateway(opts)).
- NatsServer.Lifecycle.cs includes gateway placeholders (GetAllGatewayConnections, RemoveRemoteGatewayConnection, GatewayUpdateSubInterest) that need full parity behavior.
- ClientConnection.cs does not yet contain gateway protocol/message handlers (ProcessGatewayConnect, ProcessGatewayInfo, ProcessGatewayRSub, SendMsgToGateways, etc.).
- GatewayHandlerTests.Impltests.cs currently has many placeholder-style tests (for example assertions against literal "Should" strings), so Batch 25 test work must include anti-stub cleanup.
- batch ready currently does not list Batch 25 as ready, so implementation must not start until dependencies are complete.
Constraints and Success Criteria
Constraints:
- Follow docs/standards/dotnet-standards.md (.NET 10, nullable enabled, xUnit 3 + Shouldly + NSubstitute).
- Keep behavior equivalent to Go intent in gateway.go; do not line-by-line transliterate goroutine patterns.
- No stubs/fake-pass tests/status promotion without evidence.
- Execute in feature groups of at most ~20 features.
- Dependency order is strict: Batch 25 execution only starts when batches 19 and 23 are complete.
Success criteria:
- All 86 features are either implemented and verified, or explicitly deferred with concrete blocker reasons.
- All 59 mapped tests are real behavioral tests and pass.
- Gateway regression paths remain green across server/account/route/leaf/monitoring touchpoints.
- Batch 25 can be completed in PortTracker without audit override due to placeholder code.
Approaches
Approach A: Single-file gateway implementation
Place nearly all gateway methods into one large server file and keep client changes in ClientConnection.cs.
Trade-offs:
- Pros: fewer files.
- Cons: very high review risk, poor parallelism, hard evidence tracking, difficult rollback.
Approach B (Recommended): Domain-segmented partials by gateway lifecycle
Split implementation into focused partials:
- Config/bootstrap/listener and outbound solicitation
- Handshake/INFO/gossip and connection registry
- Interest/subscription propagation
- Reply-map and inbound/outbound message processing
Trade-offs:
- Pros: maps directly to gateway.go sections, clean feature grouping, tighter verification loops, lower merge risk.
- Cons: more files and coordination.
Approach C: Test-first backlog replacement then backfill feature methods
Replace all gateway-related placeholder tests first, then implement production behavior until tests pass.
Trade-offs:
- Pros: immediate feedback.
- Cons: current mapped tests underrepresent some feature paths unless supplemented with targeted regression gates.
Recommended Design
Use Approach B with five feature groups (19/18/17/16/16) and four test waves (15/16/18/10) tied to gateway.go line ranges and mapped test IDs.
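As a quick sanity check on the numbers above, the group and wave sizes can be summed against the batch totals. The sketch below is illustrative only: checkPartition is a hypothetical helper, and the cap of 20 is an assumed reading of the "at most ~20 features" constraint.

```go
package main

import "fmt"

// checkPartition reports whether sizes add up to total and whether every
// size stays at or below maxSize. A maxSize of 0 disables the cap.
// This helper is hypothetical and not part of the PortTracker tool.
func checkPartition(sizes []int, total, maxSize int) bool {
	sum := 0
	for _, n := range sizes {
		if maxSize > 0 && n > maxSize {
			return false
		}
		sum += n
	}
	return sum == total
}

func main() {
	featureGroups := []int{19, 18, 17, 16, 16} // group sizes from this design
	testWaves := []int{15, 16, 18, 10}         // wave sizes from this design

	// 86 features with an assumed per-group cap of 20; 59 tests with no cap.
	fmt.Println("feature groups cover 86 features:", checkPartition(featureGroups, 86, 20))
	fmt.Println("test waves cover 59 tests:", checkPartition(testWaves, 59, 0))
}
```

Both checks hold for the sizes listed in this design, so no feature or mapped test falls outside a group or wave.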
Architecture
- NatsServer partials own gateway server state transitions: initialization, accept loop, solicitation/reconnect, remote registration, route gossip, account-level interest operations, and reply-map lifecycle.
- ClientConnection partials own gateway protocol operations: CONNECT/INFO handling, RSub/RUnsub/account commands, inbound gateway message pipeline, and outbound gateway send decisions.
- Gateway helper surface owns deterministic utilities: hash/prefix helpers, routed-reply parsing, string/proto builders, and gateway option validation.
- Existing GatewayTypes remains the source of gateway state objects but receives method implementations and lock-safe helpers.
Proposed File Map
Primary production files:
- Modify: dotnet/src/ZB.MOM.NatsNet.Server/Gateway/GatewayTypes.cs
- Create: dotnet/src/ZB.MOM.NatsNet.Server/Gateway/GatewayHandler.cs
- Create: dotnet/src/ZB.MOM.NatsNet.Server/NatsServer.Gateways.ConfigAndStartup.cs
- Create: dotnet/src/ZB.MOM.NatsNet.Server/NatsServer.Gateways.ConnectionsAndGossip.cs
- Create: dotnet/src/ZB.MOM.NatsNet.Server/NatsServer.Gateways.Interest.cs
- Create: dotnet/src/ZB.MOM.NatsNet.Server/NatsServer.Gateways.ReplyMap.cs
- Create: dotnet/src/ZB.MOM.NatsNet.Server/ClientConnection.Gateways.Protocol.cs
- Create: dotnet/src/ZB.MOM.NatsNet.Server/ClientConnection.Gateways.Messages.cs
- Modify: dotnet/src/ZB.MOM.NatsNet.Server/NatsServer.Init.cs
- Modify: dotnet/src/ZB.MOM.NatsNet.Server/NatsServer.Lifecycle.cs
- Modify (as needed): dotnet/src/ZB.MOM.NatsNet.Server/Protocol/ProtocolParser.cs
Mapped test files:
- Modify: dotnet/tests/ZB.MOM.NatsNet.Server.Tests/ImplBacklog/GatewayHandlerTests.Impltests.cs
- Modify: dotnet/tests/ZB.MOM.NatsNet.Server.Tests/ImplBacklog/LeafNodeHandlerTests.Impltests.cs
- Modify: dotnet/tests/ZB.MOM.NatsNet.Server.Tests/ImplBacklog/MonitoringHandlerTests.Impltests.cs
- Modify: dotnet/tests/ZB.MOM.NatsNet.Server.Tests/ImplBacklog/JetStreamEngineTests.Impltests.cs
- Create/Modify: dotnet/tests/ZB.MOM.NatsNet.Server.Tests/ImplBacklog/JetStreamSuperClusterTests.Impltests.cs
- Modify: dotnet/tests/ZB.MOM.NatsNet.Server.Tests/ImplBacklog/ConcurrencyTests1.Impltests.cs
- Modify: dotnet/tests/ZB.MOM.NatsNet.Server.Tests/ImplBacklog/ConcurrencyTests2.Impltests.cs
- Modify: dotnet/tests/ZB.MOM.NatsNet.Server.Tests/ImplBacklog/ConfigReloaderTests.Impltests.cs
- Modify: dotnet/tests/ZB.MOM.NatsNet.Server.Tests/ImplBacklog/NatsServerTests.Impltests.cs
Data and Control Flow
- Startup flow: validate options -> initialize gateway state (NewGateway) -> start accept loop -> solicit remotes after configured delay.
- Handshake flow: outbound connection receives INFO -> emits CONNECT -> registers outbound; inbound accepts CONNECT -> validates/authorizes -> registers inbound.
- Gossip flow: new gateway URLs and cluster metadata propagate to routes and inbound gateways.
- Interest flow: account/subscription changes update per-account gateway state and trigger RS+/RS-/A+/A- commands.
- Message flow: publish routing checks local interest, gateway interest mode, queue weights, and reply mapping before fanout.
- Reply flow: _GR_/legacy reply prefixes are tracked, mapped, and expired through periodic cleanup.
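The reply flow (track, map, expire) can be pictured with a small sketch. Everything here is an assumption made for illustration: replyMap, track, lookup, and sweep are hypothetical names, the prefix constant is written as "_GR_." with a trailing dot, the legacy prefix is left out, and the direction of the mapping is chosen arbitrarily. The real behavior must be taken from gateway.go.

```go
package main

import (
	"fmt"
	"strings"
	"sync"
	"time"
)

// replyPrefix is assumed to be "_GR_." here; confirm against gateway.go.
const replyPrefix = "_GR_."

// replyEntry holds one mapping and the time after which it is stale.
type replyEntry struct {
	value   string
	expires time.Time
}

// replyMap is a hypothetical stand-in for the gateway reply-mapping state.
type replyMap struct {
	mu      sync.Mutex
	entries map[string]replyEntry
}

func newReplyMap() *replyMap {
	return &replyMap{entries: make(map[string]replyEntry)}
}

// isRoutedReply reports whether subject carries the assumed gateway prefix.
func isRoutedReply(subject string) bool {
	return strings.HasPrefix(subject, replyPrefix)
}

// track stores key -> value with a deadline of now+ttl.
func (m *replyMap) track(key, value string, now time.Time, ttl time.Duration) {
	m.mu.Lock()
	defer m.mu.Unlock()
	m.entries[key] = replyEntry{value: value, expires: now.Add(ttl)}
}

// lookup returns the mapped value if it exists and has not expired at now.
func (m *replyMap) lookup(key string, now time.Time) (string, bool) {
	m.mu.Lock()
	defer m.mu.Unlock()
	e, ok := m.entries[key]
	if !ok || !now.Before(e.expires) {
		return "", false
	}
	return e.value, true
}

// sweep drops entries that have expired at now and returns how many it removed.
// A periodic timer would call this in a long-running server.
func (m *replyMap) sweep(now time.Time) int {
	m.mu.Lock()
	defer m.mu.Unlock()
	removed := 0
	for k, e := range m.entries {
		if !now.Before(e.expires) {
			delete(m.entries, k)
			removed++
		}
	}
	return removed
}

// demo tracks one mapping, reads it before expiry, then sweeps after expiry.
func demo() (string, bool, int) {
	m := newReplyMap()
	start := time.Unix(0, 0)
	m.track("_GR_.abc.reply", "_INBOX.1", start, 2*time.Second)
	value, ok := m.lookup("_GR_.abc.reply", start.Add(time.Second))
	return value, ok, m.sweep(start.Add(3 * time.Second))
}

func main() {
	value, ok, removed := demo()
	fmt.Println(isRoutedReply("_GR_.abc.reply"), value, ok, removed)
}
```

Passing the current time in as a parameter keeps expiry deterministic in tests, which matters given the no-stub requirement on mapped tests.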
Error Handling Strategy
- Preserve Go behavior for protocol errors (wrong gateway, malformed gateway commands, invalid account commands).
- Keep explicit guard clauses and structured log messages around malformed INFO/CONNECT and URL validation failures.
- On transient dial failures use reconnect/backoff paths; on unrecoverable config violations fail fast.
- For blocked items, defer with concrete reason instead of placeholder logic.
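The split between transient and unrecoverable failures can be sketched as follows. This is a minimal illustration, not the behavior of gateway.go: the sentinel errors, the solicit function, the 100 ms starting delay, and the doubling policy capped at one second are all assumptions introduced here.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// Hypothetical sentinel errors; the real port may classify failures differently.
var (
	errTransientDial = errors.New("transient dial failure")
	errBadConfig     = errors.New("unrecoverable gateway config")
)

// nextDelay doubles cur up to max. The doubling policy is an assumption.
func nextDelay(cur, max time.Duration) time.Duration {
	if cur*2 > max {
		return max
	}
	return cur * 2
}

// solicit calls dial until it succeeds, returns a non-transient error, or
// maxAttempts is used up. It returns the number of attempts made.
func solicit(dial func() error, maxAttempts int, sleep func(time.Duration)) (int, error) {
	if maxAttempts < 1 {
		maxAttempts = 1
	}
	delay := 100 * time.Millisecond
	var err error
	for attempt := 1; attempt <= maxAttempts; attempt++ {
		err = dial()
		if err == nil {
			return attempt, nil
		}
		if !errors.Is(err, errTransientDial) {
			return attempt, err // fail fast on anything that is not transient
		}
		if attempt < maxAttempts {
			sleep(delay)
			delay = nextDelay(delay, time.Second)
		}
	}
	return maxAttempts, err
}

// demo dials a fake remote that fails `failures` times before succeeding.
// The sleep function is a no-op so the example finishes instantly.
func demo(failures int, transient bool) (int, error) {
	calls := 0
	dial := func() error {
		calls++
		if calls > failures {
			return nil
		}
		if transient {
			return fmt.Errorf("dial remote: %w", errTransientDial)
		}
		return errBadConfig
	}
	return solicit(dial, 5, func(time.Duration) {})
}

func main() {
	fmt.Println(demo(2, true))  // retries past two transient failures
	fmt.Println(demo(1, false)) // stops at the first config error
}
```

Injecting the sleep function keeps the retry path testable without real delays, which fits the requirement that mapped tests exercise real behavior.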
Verification Strategy
- Use per-feature and per-test loops (read Go source, implement, build, run targeted tests).
- Enforce mandatory stub scans for both production and test files after each group.
- Require build gate after each feature group and full gateway-related test gate before verified.
- Enforce checkpoint protocol between tasks: full build + full unit test sweep + commit.
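A stub scan of the kind required above could look roughly like the sketch below. The pattern list is illustrative only: the actual forbidden patterns are defined by the guardrail plan, and scanSource is a hypothetical helper, not an existing tool in the repository.

```go
package main

import (
	"fmt"
	"strings"
)

// forbiddenPatterns is an assumed, illustrative list; the authoritative
// patterns come from the anti-stub guardrails, not from this sketch.
var forbiddenPatterns = []string{
	"NotImplementedException",
	"TODO",
	`("Should")`,
}

// finding records one forbidden pattern hit and its 1-based line number.
type finding struct {
	line    int
	pattern string
}

// scanSource returns every forbidden-pattern hit in src, in line order.
func scanSource(src string) []finding {
	var findings []finding
	for i, text := range strings.Split(src, "\n") {
		for _, p := range forbiddenPatterns {
			if strings.Contains(text, p) {
				findings = append(findings, finding{line: i + 1, pattern: p})
			}
		}
	}
	return findings
}

func main() {
	// A made-up C# fragment standing in for a production or test file.
	sample := "public void Get()\n{\n    // TODO: implement\n    throw new NotImplementedException();\n}\n"
	for _, f := range scanSource(sample) {
		fmt.Printf("line %d: forbidden pattern %q\n", f.line, f.pattern)
	}
}
```

A scan like this only gates on text patterns; it does not replace reading the Go source and confirming behavioral parity for each feature.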
Risks and Mitigations
- Risk: race regressions in gateway interest maps and reply map.
  - Mitigation: include no-race mapped tests (2376, 2490) in mandatory verification wave.
- Risk: mismatch between route gossip and gateway URL updates.
  - Mitigation: include monitoring and reload mapped tests (2127, 2131, 2747) in verification gate.
- Risk: placeholder test drift in GatewayHandlerTests.
  - Mitigation: anti-stub guardrails with explicit forbidden patterns and evidence-based status updates.
Design Approval Basis
This design is based on Batch 25 tracker metadata, current repository state, and the mandatory verification/anti-stub guardrail model from docs/plans/2026-02-27-batch-0-implementable-tests-plan.md, adapted for both features and tests.