# Batch 40 MQTT Server/JSA Design

**Date:** 2026-02-27
**Batch:** 40 (`MQTT Server/JSA`)
**Scope:** 78 features + 323 unit tests
**Dependencies:** Batches `19` (Accounts Core), `27` (JetStream Core)
**Go source:** `golang/nats-server/server/mqtt.go` (batch scope runs through roughly line 3500)

## Problem

Batch 40 is the core MQTT server-side control plane and JetStream adapter layer:

- MQTT listener/bootstrap and option validation
- MQTT parser/trace/header helpers
- Per-account session manager and retained-message handling
- MQTT JetStream request/reply adapter methods
- MQTT session persistence and packet identifier tracking

If this batch is advanced with placeholder implementations or placeholder tests, PortTracker can report healthy progress while MQTT behavior remains non-functional.

## Context Findings

### Required command outputs

Executed with an explicit `dotnet` path:

- `/usr/local/share/dotnet/dotnet run --project tools/NatsNet.PortTracker -- batch show 40 --db porting.db`
  - Status: `pending`
  - Features: `78` (all `deferred`)
  - Tests: `323` (all `deferred`)
  - Depends on: `19,27`
  - Go file: `server/mqtt.go`
- `/usr/local/share/dotnet/dotnet run --project tools/NatsNet.PortTracker -- batch list --db porting.db`
  - Confirms dependency chain includes `19 -> 40` and `27 -> 40`
- `/usr/local/share/dotnet/dotnet run --project tools/NatsNet.PortTracker -- report summary --db porting.db`
  - Overall snapshot: `1924/6942 (27.7%)`

Environment note: `dotnet` is not on `PATH` in this shell; use `/usr/local/share/dotnet/dotnet`.

### Current .NET baseline

- MQTT types/constants exist in:
  - `dotnet/src/ZB.MOM.NatsNet.Server/Mqtt/MqttConstants.cs`
  - `dotnet/src/ZB.MOM.NatsNet.Server/Mqtt/MqttTypes.cs`
- Core behavior is mostly stubs:
  - `dotnet/src/ZB.MOM.NatsNet.Server/Mqtt/MqttHandler.cs`
- Features mapped to Batch 40 target classes that are not yet fully represented in the codebase (notably `MqttJetStreamAdapter`).
- Tests mapped to Batch 40 are mostly placeholder-template tests in `ImplBacklog` files, with assertions that do not exercise behavior.

## Approach Options

### Approach A: Keep all MQTT behavior in one file (`MqttHandler.cs`)

- Pros: minimal file churn.
- Cons: difficult review boundaries, high merge risk, and weak checkpoint isolation for 78 features.

### Approach B: Port tests first, then features

- Pros: strict red/green pressure.
- Cons: the many missing APIs and placeholder methods produce a noisy, low-signal red phase.

### Approach C (Recommended): Domain-sliced feature groups + staged test waves

- Pros: aligns with Go method clusters, enables per-group verification and status updates, and supports strict anti-stub gates.
- Cons: requires planned decomposition across several files and explicit mapping alignment.

**Decision:** Approach C.

## Proposed Design

### 1. Code organization and ownership

Implement Batch 40 in MQTT-focused partial/domain files so each group has clear ownership:

- `NatsServer` MQTT entrypoints (start/config/session manager bridge)
- `ClientConnection` MQTT parser/client-id/trace helpers
- `MqttJetStreamAdapter` request/reply and stream/consumer/message operations
- `MqttAccountSessionManager` retained/session orchestration
- `MqttSession` lifecycle/persistence/pending publish tracking
- Stateless helpers for retained-message encoding/decoding and header parsing

This keeps behavioral slices reviewable while preserving .NET naming and namespace rules.

### 2. Feature grouping strategy (<= 20 per group)

- Group A (20): MQTT bootstrap/parser/auth and request-builder foundations
- Group B (20): JSA operations + JS API reply processing bridge
- Group C (20): session-manager retained/subscription core
- Group D (18): retained encode/decode + session lifecycle tracking

Each group is independently verifiable and status-updated in controlled ID chunks.
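The grouping constraint can be checked mechanically before work starts. The following is a minimal Go sketch, not part of PortTracker: the `featureGroup` type and the `validatePlan` and `batch40Plan` helpers are hypothetical names introduced for illustration. It only confirms that the four group sizes listed above respect the cap of 20 and together cover all 78 features.

```go
package main

import "fmt"

// featureGroup is a hypothetical record for one row of the grouping plan.
type featureGroup struct {
	name string
	size int
}

const (
	maxGroupSize  = 20 // per-group cap from the section heading
	totalFeatures = 78 // Batch 40 feature count
)

// validatePlan returns an error if any group is empty or exceeds maxSize,
// or if the group sizes do not add up to total.
func validatePlan(groups []featureGroup, maxSize, total int) error {
	sum := 0
	for _, g := range groups {
		if g.size <= 0 || g.size > maxSize {
			return fmt.Errorf("group %s has size %d, want 1..%d", g.name, g.size, maxSize)
		}
		sum += g.size
	}
	if sum != total {
		return fmt.Errorf("groups cover %d features, want %d", sum, total)
	}
	return nil
}

// batch40Plan returns the group sizes stated in this design.
func batch40Plan() []featureGroup {
	return []featureGroup{{"A", 20}, {"B", 20}, {"C", 20}, {"D", 18}}
}

func main() {
	if err := validatePlan(batch40Plan(), maxGroupSize, totalFeatures); err != nil {
		fmt.Println("plan invalid:", err)
		return
	}
	fmt.Println("plan ok")
}
```

A check like this is cheap to rerun if features move between groups during the mapping-alignment step.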

### 3. Test strategy for 323 mapped tests

Use wave-based execution by test class risk profile:

- First: MQTT-focused, deterministic tests (`MqttHandlerTests`, helper-centric classes)
- Then: deterministic consumer/engine slices
- Last: cluster/supercluster/benchmark/norace-heavy tests, with explicit deferral when runtime infrastructure is required

Rule: marking a test `deferred` with a concrete blocker reason is valid; a fake pass or a stub is not.
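One way to enforce the no-stub rule is a plain text scan over ported files before any status change. The sketch below is an assumption-laden illustration in Go: the marker list (`NotImplementedException`, `Assert.True(true)`, `TODO`) is a guess at typical placeholder patterns, not the project's actual gate list, and `scanForStubs` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"strings"
)

// stubMarkers are ASSUMED placeholder patterns; the real gate list
// would be defined by the batch execution plan.
var stubMarkers = []string{
	"NotImplementedException",
	"Assert.True(true)",
	"TODO",
}

// finding records one marker hit in the scanned text.
type finding struct {
	line   int // 1-based line number
	marker string
}

// scanForStubs reports every line of source that contains a stub marker.
func scanForStubs(source string) []finding {
	var out []finding
	for i, text := range strings.Split(source, "\n") {
		for _, m := range stubMarkers {
			if strings.Contains(text, m) {
				out = append(out, finding{line: i + 1, marker: m})
			}
		}
	}
	return out
}

func main() {
	// Hypothetical C# snippet standing in for a ported file.
	sample := "public void Start()\n{\n    throw new NotImplementedException();\n}\n"
	for _, f := range scanForStubs(sample) {
		fmt.Printf("line %d: %s\n", f.line, f.marker)
	}
}
```

Any finding would block promotion; the item either gets a real implementation or stays `deferred` with its blocker recorded.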

### 4. Mapping alignment

Because mapped class names include `MqttJetStreamAdapter` and mapped method names use Go-style labels, include an early mapping-alignment checkpoint:

- either implement mapped classes/methods directly,
- or add well-documented wrappers/adapters that preserve mapped method discoverability.

No status promotion happens before this alignment compiles and has test coverage.

### 5. Verification model (design-level)

Batch 40 execution must be evidence-driven and include:

- per-feature verification loop
- per-test verification loop
- stub detection scan before any promotion
- build gate and targeted/full test gates
- checkpoint protocol between tasks
- status updates with a hard limit of 15 IDs per command
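The 15-ID limit means a 20-feature group needs at least two status-update commands. A minimal Go sketch of the chunking follows; `chunkIDs` is a hypothetical helper and the IDs are made up for illustration, since real IDs come from the tracker database.

```go
package main

import "fmt"

// maxIDsPerCommand is the hard limit stated in the verification model.
const maxIDsPerCommand = 15

// chunkIDs splits ids into consecutive slices of at most size elements.
// It returns nil when size is not positive.
func chunkIDs(ids []int, size int) [][]int {
	if size <= 0 {
		return nil
	}
	var chunks [][]int
	for start := 0; start < len(ids); start += size {
		end := start + size
		if end > len(ids) {
			end = len(ids)
		}
		chunks = append(chunks, ids[start:end])
	}
	return chunks
}

func main() {
	// Hypothetical IDs for one 20-feature group.
	ids := make([]int, 20)
	for i := range ids {
		ids[i] = 1000 + i
	}
	for n, c := range chunkIDs(ids, maxIDsPerCommand) {
		fmt.Printf("update %d: %d IDs %v\n", n+1, len(c), c)
	}
}
```

Under this scheme a 20-feature group splits into chunks of 15 and 5, and all 78 features need six commands if updated in one pass.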

## Risks and Mitigations

- Risk: the large number of integration/cluster tests may encourage placeholder conversions.
  Mitigation: mandatory anti-stub patterns and forced deferred-with-reason fallback.
- Risk: mapping/class drift between tracker metadata and code structure.
  Mitigation: explicit mapping-alignment task before major implementation.
- Risk: broad behavioral surface across parser/session/JSA code paths.
  Mitigation: four feature groups with mandatory checkpoint gates after each task.

## Non-Goals

- Executing Batch 40 in this planning session.
- Re-scoping IDs outside Batch 40.
- Mass-promoting statuses without per-item evidence.