natsdotnet/Documentation/GettingStarted/Architecture.md
Joseph Doherty e553db6d40 docs: add Authentication, Clustering, JetStream, Monitoring overviews; update existing docs
New files:
- Documentation/Authentication/Overview.md — all 7 auth mechanisms with real source
  snippets (NKey/JWT/username-password/token/TLS mapping), nonce generation, account
  system, permissions, JWT permission templates
- Documentation/Clustering/Overview.md — route TCP handshake, in-process subscription
  propagation, gateway/leaf node stubs, honest gaps list
- Documentation/JetStream/Overview.md — API surface (4 handled subjects), streams,
  consumers, storage (MemStore/FileStore), in-process RAFT, mirror/source, gaps list
- Documentation/Monitoring/Overview.md — all 12 endpoints with real field tables,
  Go compatibility notes

Updated files:
- GettingStarted/Architecture.md — 14-subdirectory tree, real NatsClient/NatsServer
  field snippets, 9 new Go reference rows, Channel write queue design choice
- GettingStarted/Setup.md — xUnit 3, 100 test files grouped by area
- Operations/Overview.md — 99 test files, accurate Program.cs snippet, limitations
  section renamed to "Known Gaps vs Go Reference" with 7 real gaps
- Server/Overview.md — grouped fields, TLS/WS accept path, lame-duck mode, POSIX signals
- Configuration/Overview.md — 14 subsystem option tables, 24-row CLI table, LogOverrides
- Server/Client.md — Channel write queue, 4-task RunAsync, CommandMatrix, real fields

All docs verified against codebase 2026-02-23; 713 tests pass.
2026-02-23 10:14:18 -05:00


# Architecture
This document describes the overall architecture of the NATS .NET server — its layers, component responsibilities, message flow, and the mapping from the Go reference implementation.
## Project Overview
This project is a port of the [NATS server](https://github.com/nats-io/nats-server) (`golang/nats-server/`) to .NET 10 / C#. The Go source in `golang/nats-server/server/` is the authoritative reference.
Current scope:
- core pub/sub with wildcard subject matching and queue groups
- authentication: username/password, token, NKey, JWT, TLS client certificate mapping
- TLS and WebSocket transports
- config file parsing with hot reload
- clustering via routes (in-process subscription propagation and message routing)
- gateway and leaf node managers (bootstrapped, protocol stubs)
- JetStream: streams, consumers, file and memory storage, RAFT consensus
- HTTP monitoring endpoints (`/varz`, `/connz`, `/routez`, `/jsz`, etc.)
---
## Solution Layout
```
NatsDotNet.slnx
src/
  NATS.Server/            # Core server library — no executable entry point
    Auth/                 # Auth mechanisms: username/password, token, NKey, JWT, TLS mapping
    Configuration/        # Config file lexer/parser, ClusterOptions, JetStreamOptions, etc.
    Events/               # Internal event system (connect/disconnect advisory subjects)
    Gateways/             # GatewayManager, GatewayConnection (inter-cluster bridge)
    Imports/              # Account import/export maps, service latency tracking
    JetStream/            # Streams, consumers, storage, API routing, RAFT meta-group
    LeafNodes/            # LeafNodeManager, LeafConnection (hub-and-spoke topology)
    Monitoring/           # HTTP monitoring server: /varz, /connz, /jsz, /subsz
    Protocol/             # NatsParser state machine, NatsProtocol constants and wire helpers
    Raft/                 # RaftNode, RaftLog, RaftReplicator, snapshot support
    Routes/               # RouteManager, RouteConnection (full-mesh cluster routes)
    Subscriptions/        # SubList trie, SubjectMatch, Subscription, SubListResult
    Tls/                  # TLS handshake wrapper, OCSP stapling, TlsRateLimiter
    WebSocket/            # WsUpgrade, WsConnection, frame writer and compression
    NatsClient.cs         # Per-connection client: I/O pipeline, command dispatch, sub tracking
    NatsServer.cs         # Server orchestrator: accept loop, client registry, message routing
    NatsOptions.cs        # Top-level configuration model
  NATS.Server.Host/       # Console application — wires logging, parses CLI args, starts server
tests/
  NATS.Server.Tests/      # xUnit test project — 92 .cs test files covering all subsystems
```
`NATS.Server` depends only on `Microsoft.Extensions.Logging.Abstractions`. All Serilog wiring is in `NATS.Server.Host`. This keeps the core library testable without a console host.
---
## Layers
The implementation is organized bottom-up: each layer depends only on the layers below it.
```
NatsServer (orchestrator: accept loop, client registry, message routing)
└── NatsClient (per-connection: I/O pipeline, command dispatch, subscription tracking)
    └── NatsParser (protocol state machine: bytes → ParsedCommand)
        └── SubList (trie: subject → matching subscriptions)
            └── SubjectMatch (validation and wildcard matching primitives)
```
### SubjectMatch and SubList
`SubjectMatch` (`Subscriptions/SubjectMatch.cs`) provides the primitive operations used throughout the subscription system:
- `IsValidSubject(string)` — rejects empty tokens, whitespace, tokens after `>`
- `IsLiteral(string)` — returns `false` if the subject contains a bare `*` or `>` wildcard token
- `IsValidPublishSubject(string)` — combines both; publish subjects must be literal
- `MatchLiteral(string literal, string pattern)` — token-by-token matching for cache maintenance
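For intuition, the wildcard semantics these primitives implement can be sketched as a token-by-token comparison. This is a minimal stand-alone illustration, not the actual `SubjectMatch` code:
```csharp
using System;

// Toy matcher for NATS subject wildcards: '*' matches exactly one token,
// '>' matches one or more trailing tokens.
static bool Matches(string subject, string pattern)
{
    string[] s = subject.Split('.');
    string[] p = pattern.Split('.');
    for (int i = 0; i < p.Length; i++)
    {
        if (p[i] == ">") return i < s.Length;          // '>' must cover at least one token
        if (i >= s.Length) return false;               // subject ran out of tokens
        if (p[i] != "*" && p[i] != s[i]) return false; // literal token mismatch
    }
    return s.Length == p.Length;                       // no wildcard consumed the tail
}

Console.WriteLine(Matches("orders.new", "orders.*"));    // True
Console.WriteLine(Matches("orders.new.eu", "orders.>")); // True
Console.WriteLine(Matches("orders", "orders.*"));        // False
```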
`SubList` (`Subscriptions/SubList.cs`) is a trie-based subscription store. Each trie level (`TrieLevel`) holds a `Dictionary<string, TrieNode>` for literal tokens plus dedicated `Pwc` and `Fwc` pointers for `*` and `>` wildcards. Each `TrieNode` holds `PlainSubs` (a `HashSet<Subscription>`) and `QueueSubs` (a `Dictionary<string, HashSet<Subscription>>` keyed by queue group name).
`SubList.Match(string subject)` checks an in-memory cache first, then falls back to a recursive trie walk (`MatchLevel`) if there is a cache miss. The result is stored as a `SubListResult` — an immutable snapshot containing `PlainSubs` (array) and `QueueSubs` (jagged array, one sub-array per group).
The cache holds up to 1,024 entries. When that limit is exceeded, 256 entries are swept. Wildcard subscriptions invalidate all matching cache keys on insert and removal; literal subscriptions invalidate only their own key.
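The bounded-cache behavior can be sketched like this. The 1,024 / 256 values mirror the documented limits; the names and the key-selection policy of the sweep are simplifications, not the real `SubList` internals:
```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Illustrative bounded match cache: once the cap is exceeded, sweep a fixed
// batch of entries instead of clearing the whole cache, so hot subjects
// re-populate quickly.
const int CacheMax = 1024, SweepCount = 256;
var cache = new Dictionary<string, string[]>();

void CacheResult(string subject, string[] result)
{
    cache[subject] = result;
    if (cache.Count <= CacheMax) return;
    foreach (string key in cache.Keys.Take(SweepCount).ToArray())
        cache.Remove(key);
}

for (int i = 0; i <= CacheMax; i++)             // 1,025 inserts trigger one sweep
    CacheResult($"orders.{i}", Array.Empty<string>());
Console.WriteLine(cache.Count);                 // 1025 - 256 = 769
```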
### NatsParser
`NatsParser` (`Protocol/NatsParser.cs`) is an incremental parser that operates on `ReadOnlySequence<byte>` from `System.IO.Pipelines`, carrying only minimal state (such as `_awaitingPayload`) between invocations. The public entry point is:
```csharp
public bool TryParse(ref ReadOnlySequence<byte> buffer, out ParsedCommand command)
```
It reads one control line at a time (up to `NatsProtocol.MaxControlLineSize` = 4,096 bytes), identifies the command by its first two lowercased bytes, then dispatches to a per-command parse method. Commands with payloads (`PUB`, `HPUB`) set `_awaitingPayload = true`; the payload is consumed via `TryReadPayload` on a subsequent call once enough bytes have arrived. This handles TCP fragmentation without buffering the entire payload before parsing begins.
`ParsedCommand` is a `readonly struct` — zero heap allocation for control-only commands (`PING`, `PONG`, `SUB`, `UNSUB`, `CONNECT`). Payload commands allocate a `byte[]` for the payload body.
Command dispatch in `NatsClient.DispatchCommandAsync` covers: `Connect`, `Ping`/`Pong`, `Sub`, `Unsub`, `Pub`, `HPub`.
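The fragmentation handling can be illustrated with a toy two-state parser. Strings stand in for `ReadOnlySequence<byte>`, and the names are illustrative; this is not the real `NatsParser`:
```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Toy two-state parser: parse the control line, remember the declared
// payload size, and return false until the full payload (plus CRLF trailer)
// has arrived.
var buffered = new StringBuilder();
int awaitingPayload = -1;
string? subject = null;

bool TryParse(out (string Subject, string Payload) cmd)
{
    cmd = default;
    string data = buffered.ToString();
    if (awaitingPayload < 0)
    {
        int eol = data.IndexOf("\r\n", StringComparison.Ordinal);
        if (eol < 0) return false;                       // control line incomplete
        string[] parts = data[..eol].Split(' ');         // e.g. "PUB orders.new 11"
        subject = parts[1];
        awaitingPayload = int.Parse(parts[2]);
        buffered.Remove(0, eol + 2);
        data = buffered.ToString();
    }
    if (data.Length < awaitingPayload + 2) return false; // payload not fully here yet
    cmd = (subject!, data[..awaitingPayload]);
    buffered.Remove(0, awaitingPayload + 2);             // consume payload + CRLF
    awaitingPayload = -1;
    return true;
}

// Feed one PUB command in three TCP-like fragments:
var parsed = new List<(string Subject, string Payload)>();
foreach (string fragment in new[] { "PUB orders.new 11\r\nhello", " world", "\r\n" })
{
    buffered.Append(fragment);
    if (TryParse(out var cmd)) parsed.Add(cmd);
}
Console.WriteLine($"{parsed[0].Subject}: {parsed[0].Payload}"); // orders.new: hello world
```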
### NatsClient
`NatsClient` (`NatsClient.cs`) handles a single TCP connection. On `RunAsync`, it sends the initial `INFO` frame and then starts two concurrent tasks: `FillPipeAsync` (socket → `PipeWriter`) and `ProcessCommandsAsync` (`PipeReader` → parser → dispatch). The tasks share a `Pipe` from `System.IO.Pipelines`. Either task completing (EOF, cancellation, or error) causes `RunAsync` to return, which triggers cleanup via `Router.RemoveClient(this)`.
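The "either task completing ends the connection" lifetime can be reduced to a runnable toy, with plain delays standing in for the real I/O tasks (names are illustrative):
```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Toy version of the two-task lifetime: the connection lives until EITHER
// the fill side or the processing side finishes (EOF, cancellation, or
// error); whichever completes first triggers cleanup.
async Task<string> RunConnectionAsync(Task fill, Task process)
{
    Task first = await Task.WhenAny(fill, process);
    // Cleanup (the equivalent of Router.RemoveClient) would happen here.
    return first == fill ? "fill ended first" : "process ended first";
}

Task fillTask = Task.Delay(10);                   // "socket closed" after 10 ms
Task processTask = Task.Delay(Timeout.Infinite);  // still parsing commands
Console.WriteLine(await RunConnectionAsync(fillTask, processTask));
// fill ended first
```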
Key fields:
```csharp
public sealed class NatsClient : INatsClient, IDisposable
{
    private readonly Socket _socket;
    private readonly Stream _stream;                   // plain NetworkStream or TlsConnectionWrapper
    private readonly NatsParser _parser;
    private readonly Channel<ReadOnlyMemory<byte>> _outbound = Channel.CreateBounded<ReadOnlyMemory<byte>>(
        new BoundedChannelOptions(8192) { SingleReader = true, FullMode = BoundedChannelFullMode.Wait });
    private long _pendingBytes;                        // bytes queued but not yet written
    private readonly ClientFlagHolder _flags = new();  // ConnectReceived, TraceMode, etc.
    private readonly Dictionary<string, Subscription> _subs = new();

    public ulong Id { get; }
    public ClientKind Kind { get; }                    // CLIENT, ROUTER, LEAF, SYSTEM
    public Account? Account { get; private set; }
}
```
Write serialization uses a bounded `Channel<ReadOnlyMemory<byte>>(8192)` (`_outbound`). All outbound message deliveries enqueue a pre-encoded frame into this channel. A dedicated write loop drains the channel sequentially, preventing interleaved writes from concurrent message deliveries. A `_pendingBytes` counter tracks bytes queued but not yet written, enabling slow-consumer detection and back-pressure enforcement.
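A minimal sketch of that write-queue pattern, under the documented capacity and accounting (the names and the `writeToSocket` delegate are illustrative, not the real members):
```csharp
using System;
using System.Threading;
using System.Threading.Channels;
using System.Threading.Tasks;

// Producers enqueue pre-encoded frames; one drain task writes them
// sequentially, so frames from concurrent deliveries never interleave.
var outbound = Channel.CreateBounded<ReadOnlyMemory<byte>>(
    new BoundedChannelOptions(8192) { SingleReader = true, FullMode = BoundedChannelFullMode.Wait });
long pendingBytes = 0;

async Task WriteLoopAsync(Func<ReadOnlyMemory<byte>, Task> writeToSocket)
{
    await foreach (ReadOnlyMemory<byte> frame in outbound.Reader.ReadAllAsync())
    {
        await writeToSocket(frame);                       // the single writer
        Interlocked.Add(ref pendingBytes, -frame.Length); // dequeue accounting
    }
}

async ValueTask EnqueueAsync(ReadOnlyMemory<byte> frame)
{
    Interlocked.Add(ref pendingBytes, frame.Length);      // slow-consumer tracking
    await outbound.Writer.WriteAsync(frame);              // waits when full: back-pressure
}

// Demo: two producers enqueue; the loop drains in FIFO order.
Task loop = WriteLoopAsync(frame => { Console.Write(frame.Length + " "); return Task.CompletedTask; });
await EnqueueAsync(new byte[] { 1 });
await EnqueueAsync(new byte[] { 2, 3 });
outbound.Writer.Complete();
await loop;
Console.WriteLine(Interlocked.Read(ref pendingBytes)); // 0
```
`FullMode.Wait` is what turns the bounded capacity into back-pressure: a publisher delivering to a slow consumer parks on `WriteAsync` instead of growing an unbounded queue.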
Subscription state is a `Dictionary<string, Subscription>` keyed by SID. This dictionary is accessed only from the single processing task, so no locking is needed. `SUB` inserts into this dictionary and into `SubList`; `UNSUB` either sets `MaxMessages` for auto-unsubscribe or immediately removes from both.
`NatsClient` exposes two interfaces to `NatsServer`:
```csharp
public interface IMessageRouter
{
    void ProcessMessage(string subject, string? replyTo,
        ReadOnlyMemory<byte> headers, ReadOnlyMemory<byte> payload, NatsClient sender);
    void RemoveClient(NatsClient client);
}

public interface ISubListAccess
{
    SubList SubList { get; }
}
```
`NatsServer` implements both. `NatsClient.Router` is set to the server instance immediately after construction.
### NatsServer
`NatsServer` (`NatsServer.cs`) owns the TCP listener, the shared `SubList`, and the client registry. Each accepted connection gets a unique `clientId` (incremented via `Interlocked.Increment`), a scoped logger, and a `NatsClient` instance registered in `_clients`. `RunClientAsync` is fired as a detached task — the accept loop does not await it.
Key fields:
```csharp
public sealed class NatsServer : IMessageRouter, ISubListAccess, IDisposable
{
    // Client registry
    private readonly ConcurrentDictionary<ulong, NatsClient> _clients = new();
    private readonly ConcurrentQueue<ClosedClient> _closedClients = new();
    private ulong _nextClientId;
    private int _activeClientCount;

    // Account system
    private readonly ConcurrentDictionary<string, Account> _accounts = new(StringComparer.Ordinal);
    private readonly Account _globalAccount;
    private readonly Account _systemAccount;
    private AuthService _authService;

    // Subsystem managers
    private readonly RouteManager? _routeManager;
    private readonly GatewayManager? _gatewayManager;
    private readonly LeafNodeManager? _leafNodeManager;
    private readonly JetStreamService? _jetStreamService;
    private MonitorServer? _monitorServer;

    // TLS / transport
    private readonly SslServerAuthenticationOptions? _sslOptions;
    private readonly TlsRateLimiter? _tlsRateLimiter;
    private Socket? _listener;
    private Socket? _wsListener;

    // Shutdown coordination
    private readonly CancellationTokenSource _quitCts = new();
    private readonly TaskCompletionSource _shutdownComplete = new(TaskCreationOptions.RunContinuationsAsynchronously);
    private int _shutdown;
    private int _lameDuck;
}
```
Message delivery happens in `ProcessMessage`:
1. Call `_subList.Match(subject)` to get a `SubListResult`.
2. Iterate `result.PlainSubs` — deliver to each subscriber, skipping the sender if `echo` is false.
3. Iterate `result.QueueSubs` — for each group, use round-robin (modulo `Interlocked.Increment`) to pick one member to receive the message.
Delivery via `SendMessageAsync` is fire-and-forget (not awaited) to avoid blocking the publishing client's processing task while waiting for slow subscribers.
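The queue-group pick in step 3 can be sketched in isolation (illustrative names; the real counter lives per queue group):
```csharp
using System;
using System.Threading;

// An atomically incremented counter modulo the group size chooses exactly
// one member per message, giving round-robin distribution under concurrency.
int counter = -1;
string[] group = { "sub3", "sub4" };

string PickQueueMember()
{
    int n = Interlocked.Increment(ref counter);
    return group[(int)((uint)n % group.Length)]; // unsigned mod guards against wraparound
}

Console.WriteLine(PickQueueMember()); // sub3
Console.WriteLine(PickQueueMember()); // sub4
Console.WriteLine(PickQueueMember()); // sub3
```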
---
## Message Flow
```
Client sends: PUB orders.new 11\r\nhello world\r\n

1. FillPipeAsync reads bytes from socket → PipeWriter
2. ProcessCommandsAsync reads from PipeReader
3. NatsParser.TryParse parses control line "PUB orders.new 11\r\n"
   - sets _awaitingPayload = true
   - on next call: reads payload + \r\n trailer
   - returns ParsedCommand { Type=Pub, Subject="orders.new", Payload=[...] }
4. DispatchCommandAsync calls ProcessPub(cmd)
5. ProcessPub calls Router.ProcessMessage("orders.new", null, default, payload, this)
6. NatsServer.ProcessMessage
   → _subList.Match("orders.new")
     returns SubListResult { PlainSubs=[sub1, sub2], QueueSubs=[[sub3, sub4]] }
   → DeliverMessage(sub1, ...) → sub1.Client.SendMessageAsync(...)
   → DeliverMessage(sub2, ...) → sub2.Client.SendMessageAsync(...)
   → round-robin pick from [sub3, sub4], e.g. sub3
     → DeliverMessage(sub3, ...) → sub3.Client.SendMessageAsync(...)
7. SendMessageAsync enqueues the encoded MSG frame into the _outbound channel; the write loop drains it to the socket
```
---
## Go Reference Mapping
| Go source | .NET source |
|-----------|-------------|
| `server/sublist.go` | `src/NATS.Server/Subscriptions/SubList.cs` |
| `server/parser.go` | `src/NATS.Server/Protocol/NatsParser.cs` |
| `server/client.go` | `src/NATS.Server/NatsClient.cs` |
| `server/server.go` | `src/NATS.Server/NatsServer.cs` |
| `server/opts.go` | `src/NATS.Server/NatsOptions.cs` + `src/NATS.Server/Configuration/` |
| `server/auth.go` | `src/NATS.Server/Auth/AuthService.cs` |
| `server/route.go` | `src/NATS.Server/Routes/RouteManager.cs` |
| `server/gateway.go` | `src/NATS.Server/Gateways/GatewayManager.cs` |
| `server/leafnode.go` | `src/NATS.Server/LeafNodes/LeafNodeManager.cs` |
| `server/jetstream.go` | `src/NATS.Server/JetStream/JetStreamService.cs` |
| `server/stream.go` | `src/NATS.Server/JetStream/StreamManager.cs` (via `JetStreamService`) |
| `server/consumer.go` | `src/NATS.Server/JetStream/ConsumerManager.cs` |
| `server/raft.go` | `src/NATS.Server/Raft/RaftNode.cs` |
| `server/monitor.go` | `src/NATS.Server/Monitoring/MonitorServer.cs` |
The Go `sublist.go` uses atomic generation counters to invalidate a result cache. The .NET `SubList` uses a different strategy: it maintains the cache under `ReaderWriterLockSlim` and does targeted invalidation at insert/remove time, avoiding the need for generation counters.
The Go `client.go` uses goroutines for `readLoop` and `writeLoop`. The .NET equivalent uses `async Task` with `System.IO.Pipelines`: `FillPipeAsync` (writer side) and `ProcessCommandsAsync` (reader side).
---
## Key .NET Design Choices
| Concern | .NET choice | Reason |
|---------|-------------|--------|
| I/O buffering | `System.IO.Pipelines` (`Pipe`, `PipeReader`, `PipeWriter`) | Zero-copy buffer management; backpressure built in |
| SubList thread safety | `ReaderWriterLockSlim` | Multiple concurrent readers (match), exclusive writers (insert/remove) |
| Client registry | `ConcurrentDictionary<ulong, NatsClient>` | Lock-free concurrent access from accept loop and cleanup tasks |
| Write serialization | `Channel<ReadOnlyMemory<byte>>(8192)` bounded queue per client with `_pendingBytes` slow-consumer tracking | Sequential drain by a single writer task prevents interleaved MSG frames; bounded capacity enables back-pressure |
| Concurrency | `async/await` + `Task` | Maps Go goroutines to .NET task-based async; no dedicated threads per connection |
| Protocol constants | `NatsProtocol` static class | Pre-encoded byte arrays (`PongBytes`, `CrLf`, etc.) avoid per-call allocations |
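The pre-encoded constants pattern in the last row can be sketched as follows. The names and the frame-building helper are illustrative, not the real `NatsProtocol` members:
```csharp
using System;
using System.Text;

// Fixed protocol fragments are encoded once and reused, so building a frame
// is array copies rather than repeated string encoding of constant parts.
byte[] msgPrefix = Encoding.ASCII.GetBytes("MSG ");
byte[] crLf = { (byte)'\r', (byte)'\n' };

byte[] BuildMsgHeader(string subject, string sid, int size)
{
    // Only the variable parts are encoded per call.
    byte[] variable = Encoding.ASCII.GetBytes($"{subject} {sid} {size}");
    byte[] frame = new byte[msgPrefix.Length + variable.Length + crLf.Length];
    msgPrefix.CopyTo(frame, 0);
    variable.CopyTo(frame, msgPrefix.Length);
    crLf.CopyTo(frame, msgPrefix.Length + variable.Length);
    return frame;
}

Console.WriteLine(Encoding.ASCII.GetString(BuildMsgHeader("orders.new", "1", 11)));
// MSG orders.new 1 11
```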
---
## Related Documentation
- [Setup](./Setup.md)
- [Protocol Overview](../Protocol/Overview.md)
- [Subscriptions Overview](../Subscriptions/Overview.md)
- [Server Overview](../Server/Overview.md)
- [Configuration Overview](../Configuration/Overview.md)
<!-- Last verified against codebase: 2026-02-23 -->