natsnet/docs/plans/2026-02-27-batch-41-mqtt-client-io-design.md
Joseph Doherty 8a126c4932 Add batch plans for batches 37-41 (rounds 19-21)
Generated design docs and implementation plans via Codex for:
- Batch 37: Stream Messages
- Batch 38: Consumer Lifecycle
- Batch 39: Consumer Dispatch
- Batch 40: MQTT Server/JSA
- Batch 41: MQTT Client/IO

All plans include mandatory verification protocol and anti-stub guardrails.
Updated batches.md with file paths and planned status.
All 42 batches (0-41) now have design docs and implementation plans.
2026-02-27 17:27:51 -05:00


Batch 41 MQTT Client/IO Design

Date: 2026-02-27
Batch: 41 (MQTT Client/IO)
Scope: 74 features + 28 unit tests
Dependency: Batch 40 (MQTT Server/JSA)
Go source: golang/nats-server/server/mqtt.go (lines ~3503 through ~5882)

Problem

Batch 41 is the MQTT client protocol and I/O execution surface:

  • CONNECT/PUBLISH/SUBSCRIBE/UNSUBSCRIBE parse and response paths
  • QoS1/QoS2 flow control and PUBREL lifecycle
  • retain handling and retained-message permission checks
  • topic/filter <-> subject conversion logic
  • byte-level reader/writer utilities used across MQTT packet handling
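To make the topic/filter to subject conversion concrete, here is a minimal sketch of the commonly documented mapping between MQTT and NATS separators and wildcards ('/' to '.', '+' to '*', '#' to '>'). The helper name filterToSubject is hypothetical, and the sketch deliberately rejects cases that need special handling (empty levels, dots, spaces) instead of guessing how mqtt.go treats them; the Go source remains the authority for those rules.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// filterToSubject is a hypothetical, simplified helper (not the ported API).
// It maps an MQTT topic filter to a NATS subject using the basic rules only.
func filterToSubject(filter string) (string, error) {
	if filter == "" {
		return "", errors.New("empty topic filter")
	}
	levels := strings.Split(filter, "/")
	out := make([]string, 0, len(levels))
	for i, lvl := range levels {
		switch {
		case lvl == "":
			// Leading, trailing, or doubled '/' is out of scope for this sketch.
			return "", fmt.Errorf("empty level at position %d is not handled here", i)
		case lvl == "+":
			out = append(out, "*") // single-level wildcard
		case lvl == "#":
			if i != len(levels)-1 {
				return "", errors.New("'#' must be the last level")
			}
			out = append(out, ">") // multi-level wildcard
		case strings.ContainsAny(lvl, ". +#"):
			// Wildcards must occupy a whole level; '.' and ' ' need special mapping.
			return "", fmt.Errorf("level %q contains a character this sketch does not map", lvl)
		default:
			out = append(out, lvl)
		}
	}
	return strings.Join(out, "."), nil
}

func main() {
	for _, f := range []string{"sensors/+/temp", "sensors/#", "a/#/b"} {
		if s, err := filterToSubject(f); err != nil {
			fmt.Printf("%s: error: %v\n", f, err)
		} else {
			fmt.Printf("%s -> %s\n", f, s)
		}
	}
}
```

Rejecting unmapped input keeps the sketch honest: any parity test written against it fails loudly on edge cases rather than passing on a guessed mapping.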

This batch can easily appear "complete" via placeholders because many mapped tests are currently template-style and several mapped test IDs require non-MQTT infrastructure. The design must force evidence-based completion and explicit deferrals instead of stubs.

Context Findings

Required command outputs

Executed with explicit runtime path because dotnet is not on PATH in this shell:

  • /usr/local/share/dotnet/dotnet run --project tools/NatsNet.PortTracker -- batch show 41 --db porting.db
    • Status: pending
    • Features: 74 (all currently deferred)
    • Tests: 28 (all currently deferred)
    • Depends on: 40
    • Go file: server/mqtt.go
  • /usr/local/share/dotnet/dotnet run --project tools/NatsNet.PortTracker -- batch list --db porting.db
    • Confirms Batch 41 is the final batch and depends only on Batch 40.
  • /usr/local/share/dotnet/dotnet run --project tools/NatsNet.PortTracker -- report summary --db porting.db
    • Current snapshot: 1924/6942 (27.7%).

Current .NET baseline

  • MQTT types/constants exist in:
    • dotnet/src/ZB.MOM.NatsNet.Server/Mqtt/MqttConstants.cs
    • dotnet/src/ZB.MOM.NatsNet.Server/Mqtt/MqttTypes.cs
  • Batch 41 mapped methods are not implemented yet and many target methods/classes are currently absent.
  • MqttHandler.cs still contains broad NotImplemented stubs in server extension methods.
  • MqttHandlerTests.Impltests.cs and related backlog files still include placeholder assertion patterns.
  • Mapped test set includes cross-module classes (AuthCalloutTests, NatsConsumerTests, LeafNodeHandlerTests, WebSocketHandlerTests, JetStreamClusterTests3) in addition to MQTT-specific classes.

Approach Options

Approach A: Single-file incremental port in MqttHandler.cs

  • Pros: minimal file churn.
  • Cons: high merge conflict risk, poor reviewability, weak ownership boundaries for 74 methods.

Approach B: Tests-first for all 28 IDs before feature work

  • Pros: strict red/green discipline.
  • Cons: too many missing method surfaces create a high-noise red phase; many test IDs depend on deferred non-B41 features.

Approach C: Grouped feature slices with staged test waves and explicit deferrals

  • Pros: aligns with Go function clusters, keeps each group <=20 features, enables mandatory checkpoint evidence, and handles cross-module tests without fake pass pressure.
  • Cons: requires initial mapping-alignment step and disciplined status batching.

Decision: Approach C.

Proposed Design

1. File and component layout

Implement Batch 41 in focused MQTT slices (new files/partials as needed), not a single monolith:

  • ClientConnection MQTT packet handlers/parsers
    • MqttParseConnect, MqttParsePub, MqttParseSubsOrUnsubs, MqttProcessPublishReceived, enqueue acks
  • NatsServer MQTT publish/session bridge handlers
    • MqttProcessConnect, MqttProcessPub, MqttProcessPubRel, retained permissions audit
  • MqttSession QoS2 consumer/pubrel helpers
    • TrackAsPubRel, UntrackPubRel, DeleteConsumer, ProcessJSConsumer
  • MqttReader / MqttWriter byte I/O utilities
    • varint, length-prefixed strings/bytes, publish header encoding
  • Stateless conversion/trace/helper methods
    • topic/filter conversion, reserve-sub logic, trace formatters, Sparkplug-B helpers
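As an illustration of the byte I/O slice, the following sketch shows the MQTT variable-length integer (used for the "remaining length" field) and the two-byte length-prefixed string encoding as defined by the MQTT specification. The function names encodeVarInt, decodeVarInt, and appendString are hypothetical and do not reflect the mapped MqttReader / MqttWriter method names.

```go
package main

import (
	"errors"
	"fmt"
)

// maxVarInt is the largest value that fits in four varint bytes per the MQTT spec.
const maxVarInt = 268_435_455

// encodeVarInt appends v to dst using 7 data bits per byte plus a continuation bit.
func encodeVarInt(dst []byte, v int) ([]byte, error) {
	if v < 0 || v > maxVarInt {
		return dst, errors.New("value out of range for MQTT varint")
	}
	for {
		b := byte(v % 128)
		v /= 128
		if v > 0 {
			b |= 0x80 // more bytes follow
		}
		dst = append(dst, b)
		if v == 0 {
			return dst, nil
		}
	}
}

// decodeVarInt reads a varint from src and returns the value and bytes consumed.
func decodeVarInt(src []byte) (value, n int, err error) {
	multiplier := 1
	for i := 0; i < 4; i++ {
		if i >= len(src) {
			return 0, 0, errors.New("incomplete varint")
		}
		b := src[i]
		value += int(b&0x7F) * multiplier
		if b&0x80 == 0 {
			return value, i + 1, nil
		}
		multiplier *= 128
	}
	return 0, 0, errors.New("malformed varint: more than 4 bytes")
}

// appendString appends a big-endian uint16 length followed by the string bytes.
func appendString(dst []byte, s string) ([]byte, error) {
	if len(s) > 0xFFFF {
		return dst, errors.New("string too long for 2-byte length prefix")
	}
	dst = append(dst, byte(len(s)>>8), byte(len(s)))
	return append(dst, s...), nil
}

func main() {
	buf, _ := encodeVarInt(nil, 321)
	v, n, _ := decodeVarInt(buf)
	fmt.Printf("encoded=% X decoded=%d consumed=%d\n", buf, v, n)
}
```

Round-trip checks like the one in main are the kind of deterministic evidence the parser and I/O tests can rely on without any server infrastructure.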

2. Feature grouping model (max ~20 each)

  • Group A (19): session + connect + initial publish path (2331-2349)
  • Group B (19): retained/perms + QoS ack processing + subscribe callbacks (2350-2368)
  • Group C (17): reserved-sub/sparkplug + subscribe processing + unsubscribe/ping (2369-2385)
  • Group D (19): conversion + reader/writer I/O (2386-2404)
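The grouping above can be sanity-checked mechanically. The sketch below, using a hypothetical group type and checkGroups helper, verifies that the ID ranges are contiguous, that no group exceeds 20 features, and that the total matches the 74 features in scope.

```go
package main

import "fmt"

// group is a hypothetical record for one feature group and its inclusive ID range.
type group struct {
	name        string
	first, last int
}

func (g group) size() int { return g.last - g.first + 1 }

// checkGroups validates group size, contiguity with the previous group, and the total.
func checkGroups(groups []group, maxSize, wantTotal int) error {
	total := 0
	for i, g := range groups {
		if g.size() > maxSize {
			return fmt.Errorf("group %s has %d features, max is %d", g.name, g.size(), maxSize)
		}
		if i > 0 && g.first != groups[i-1].last+1 {
			return fmt.Errorf("group %s does not start right after group %s", g.name, groups[i-1].name)
		}
		total += g.size()
	}
	if total != wantTotal {
		return fmt.Errorf("groups cover %d features, want %d", total, wantTotal)
	}
	return nil
}

func main() {
	groups := []group{
		{"A", 2331, 2349},
		{"B", 2350, 2368},
		{"C", 2369, 2385},
		{"D", 2386, 2404},
	}
	if err := checkGroups(groups, 20, 74); err != nil {
		fmt.Println("invalid grouping:", err)
		return
	}
	fmt.Println("groups cover 74 contiguous feature IDs")
}
```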

3. Test-wave model (28 tests)

  • Wave T1 (10): deterministic MQTT parser/conversion tests (2170,2171,2190,2191,2194,2195,2196,2199,2200,2229)
  • Wave T2 (11): publish/retain/session behavior tests (2182,2204,2234,2235,2236,2237,2238,2246,2251,2253,2285)
  • Wave T3 (4): cross-module integration-touching tests (115,1258,1924,3095)
  • Wave T4 (3): cluster-dependent tests (1055,1056,1113)

4. Deferred strategy by design

Because some mapped tests depend on deferred non-B41 features, the completion criteria are:

  • implement and verify what is truly executable in local unit-test context,
  • mark blockers deferred with specific reason and evidence,
  • never substitute with placeholders.

5. Verification architecture

The implementation plan will enforce:

  • per-feature verification loop before promotion,
  • per-test verification loop with discovered/pass evidence,
  • stub detection checks on source and tests,
  • build/test gates before every status update,
  • status updates limited to <=15 IDs per batch-update,
  • mandatory checkpoints between task groups.
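The limit of 15 IDs per status update amounts to simple chunking. The sketch below uses a hypothetical chunkIDs helper and makes no assumption about the PortTracker batch-update command syntax; it only shows how a group's ID list could be split before each update.

```go
package main

import "fmt"

// chunkIDs splits ids into consecutive slices of at most maxPerUpdate elements.
// It returns nil when maxPerUpdate is not positive.
func chunkIDs(ids []int, maxPerUpdate int) [][]int {
	if maxPerUpdate <= 0 {
		return nil
	}
	var chunks [][]int
	for start := 0; start < len(ids); start += maxPerUpdate {
		end := start + maxPerUpdate
		if end > len(ids) {
			end = len(ids)
		}
		chunks = append(chunks, ids[start:end])
	}
	return chunks
}

func main() {
	// Group A feature IDs (2331-2349), taken from the grouping model.
	var groupA []int
	for id := 2331; id <= 2349; id++ {
		groupA = append(groupA, id)
	}
	for i, c := range chunkIDs(groupA, 15) {
		fmt.Printf("update %d: %d IDs (%d..%d)\n", i+1, len(c), c[0], c[len(c)-1])
	}
}
```

For the 19 features of Group A this yields two updates, one of 15 IDs and one of 4, each of which would be preceded by its own build/test gate.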

Risks and Mitigations

  • Risk: class/method mapping drift (mapped methods not yet present).
    • Mitigation: dedicated mapping-alignment preflight task before feature status changes.
  • Risk: placeholder tests pass without exercising MQTT logic.
    • Mitigation: anti-stub scans + assertion minimum checks + single-test evidence requirement.
  • Risk: cluster/integration tests block throughput.
    • Mitigation: explicit deferred-with-reason path and continue with unblocked items.
  • Risk: large parser/I/O surface causes hidden regressions.
    • Mitigation: incremental group checkpoints with full build + targeted/full test gates.
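An anti-stub scan can be as simple as a line-by-line search for known placeholder markers. The sketch below is one possible shape; the marker strings (NotImplementedException, Assert.True(true), // TODO) are assumed examples of stub and placeholder-assertion patterns, not the list the implementation plan mandates, and scanForStubs is a hypothetical name.

```go
package main

import (
	"fmt"
	"strings"
)

// stubMarkers lists assumed placeholder patterns; the real plan may use others.
var stubMarkers = []string{
	"NotImplementedException",
	"Assert.True(true)",
	"// TODO",
}

// finding records the 1-based line number and the marker that matched.
type finding struct {
	line   int
	marker string
}

// scanForStubs returns one finding per marker occurrence per line of src.
func scanForStubs(src string) []finding {
	var out []finding
	for i, line := range strings.Split(src, "\n") {
		for _, m := range stubMarkers {
			if strings.Contains(line, m) {
				out = append(out, finding{line: i + 1, marker: m})
			}
		}
	}
	return out
}

func main() {
	// Hypothetical C# snippet used only to exercise the scan.
	sample := "public void MqttParsePub()\n{\n    throw new NotImplementedException();\n}\n"
	for _, f := range scanForStubs(sample) {
		fmt.Printf("line %d: found %q\n", f.line, f.marker)
	}
}
```

A plain substring scan is intentionally crude; it trades precision for the guarantee that no listed marker slips through before a status update.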

Non-Goals

  • Executing Batch 41 during this planning session.
  • Marking statuses without verification evidence.
  • Re-scoping features outside Batch 41 except documented test dependency blockers.