docs: add Authentication, Clustering, JetStream, Monitoring overviews; update existing docs

New files:
- Documentation/Authentication/Overview.md — all 7 auth mechanisms with real source
  snippets (NKey/JWT/username-password/token/TLS mapping), nonce generation, account
  system, permissions, JWT permission templates
- Documentation/Clustering/Overview.md — route TCP handshake, in-process subscription
  propagation, gateway/leaf node stubs, honest gaps list
- Documentation/JetStream/Overview.md — API surface (4 handled subjects), streams,
  consumers, storage (MemStore/FileStore), in-process RAFT, mirror/source, gaps list
- Documentation/Monitoring/Overview.md — all 12 endpoints with real field tables,
  Go compatibility notes

Updated files:
- GettingStarted/Architecture.md — 14-subdirectory tree, real NatsClient/NatsServer
  field snippets, 9 new Go reference rows, Channel write queue design choice
- GettingStarted/Setup.md — xUnit 3, 100 test files grouped by area
- Operations/Overview.md — 99 test files, accurate Program.cs snippet, limitations
  section renamed to "Known Gaps vs Go Reference" with 7 real gaps
- Server/Overview.md — grouped fields, TLS/WS accept path, lame-duck mode, POSIX signals
- Configuration/Overview.md — 14 subsystem option tables, 24-row CLI table, LogOverrides
- Server/Client.md — Channel write queue, 4-task RunAsync, CommandMatrix, real fields

All docs verified against codebase 2026-02-23; 713 tests pass.
Joseph Doherty
2026-02-23 10:14:18 -05:00
parent 9efe787cab
commit e553db6d40
10 changed files with 2415 additions and 186 deletions


@@ -7,32 +7,40 @@
### Fields and properties
```csharp
public sealed class NatsClient : IDisposable
public sealed class NatsClient : INatsClient, IDisposable
{
private static readonly ClientCommandMatrix CommandMatrix = new();
private readonly Socket _socket;
private readonly NetworkStream _stream;
private readonly Stream _stream;
private readonly NatsOptions _options;
private readonly ServerInfo _serverInfo;
private readonly AuthService _authService;
private readonly NatsParser _parser;
private readonly SemaphoreSlim _writeLock = new(1, 1);
private readonly Channel<ReadOnlyMemory<byte>> _outbound = Channel.CreateBounded<ReadOnlyMemory<byte>>(
new BoundedChannelOptions(8192) { SingleReader = true, FullMode = BoundedChannelFullMode.Wait });
private long _pendingBytes;
private CancellationTokenSource? _clientCts;
private readonly Dictionary<string, Subscription> _subs = new();
private readonly ILogger _logger;
private ClientPermissions? _permissions;
private readonly ServerStats _serverStats;
public ulong Id { get; }
public ClientKind Kind { get; }
public ClientOptions? ClientOpts { get; private set; }
public IMessageRouter? Router { get; set; }
public bool ConnectReceived { get; private set; }
public long InMsgs;
public long OutMsgs;
public long InBytes;
public long OutBytes;
public IReadOnlyDictionary<string, Subscription> Subscriptions => _subs;
public Account? Account { get; private set; }
public DateTime StartTime { get; }
private readonly ClientFlagHolder _flags = new();
public bool ConnectReceived => _flags.HasFlag(ClientFlags.ConnectReceived);
public ClientClosedReason CloseReason { get; private set; }
}
```
`_writeLock` is a `SemaphoreSlim(1, 1)` — a binary semaphore that serializes all writes to `_stream`. Without it, concurrent `SendMessageAsync` calls from different publisher threads would interleave bytes on the wire. See [Write serialization](#write-serialization) below.
`_stream` is typed as `Stream` rather than `NetworkStream` because the server passes in a pre-wrapped stream: plain `NetworkStream` for unencrypted connections, `SslStream` for TLS, or a WebSocket framing adapter. `NatsClient` does not know or care which transport is underneath.
`_outbound` is a bounded `Channel<ReadOnlyMemory<byte>>(8192)` with `SingleReader = true` and `FullMode = BoundedChannelFullMode.Wait`. The channel is the sole path for all outbound frames. Slow consumer detection uses `_pendingBytes` — an `Interlocked`-maintained counter of bytes queued but not yet flushed — checked against `_options.MaxPending` in `QueueOutbound`. See [Write Serialization](#write-serialization) below.
`_flags` is a `ClientFlagHolder` (a thin wrapper around an `int` with atomic bit operations). Protocol-level boolean state — `ConnectReceived`, `CloseConnection`, `IsSlowConsumer`, `TraceMode`, and others — is stored as flag bits rather than separate fields, keeping the state machine manipulation thread-safe without separate locks.
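The flag-holder idea can be sketched roughly as follows. This is a minimal illustration, not the actual `ClientFlagHolder`: the type names `SketchFlags` and `SketchFlagHolder` and the flag values are hypothetical, and the compare-and-swap loop is just one way to get atomic bit operations on an `int`.

```csharp
using System;
using System.Threading;

// Hypothetical flag set; the real ClientFlags values are not shown in this document.
[Flags]
public enum SketchFlags
{
    None = 0,
    ConnectReceived = 1 << 0,
    CloseConnection = 1 << 1,
    IsSlowConsumer = 1 << 2,
}

// Illustrative stand-in for a flag holder: one int, updated with a CAS loop.
public sealed class SketchFlagHolder
{
    private int _bits;

    // Sets the flag; returns true only for the caller that actually flipped the bit.
    public bool SetFlag(SketchFlags flag)
    {
        int mask = (int)flag;
        while (true)
        {
            int current = Volatile.Read(ref _bits);
            if ((current & mask) == mask)
                return false; // already set by an earlier caller
            if (Interlocked.CompareExchange(ref _bits, current | mask, current) == current)
                return true;
        }
    }

    // Clears the flag, retrying if another thread changed the bits in between.
    public void ClearFlag(SketchFlags flag)
    {
        int mask = (int)flag;
        int current;
        do
        {
            current = Volatile.Read(ref _bits);
        } while (Interlocked.CompareExchange(ref _bits, current & ~mask, current) != current);
    }

    public bool HasFlag(SketchFlags flag) =>
        (Volatile.Read(ref _bits) & (int)flag) == (int)flag;
}
```

Packing the state into one word means a reader sees a consistent snapshot of all flags with a single volatile read, which is harder to guarantee with several independent `bool` fields.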
`_subs` maps subscription IDs (SIDs) to `Subscription` objects. SIDs are client-assigned strings; `Dictionary<string, Subscription>` gives O(1) lookup for UNSUB processing.
@@ -43,21 +51,30 @@ The four stat fields (`InMsgs`, `OutMsgs`, `InBytes`, `OutBytes`) are `long` fie
### Constructor
```csharp
public NatsClient(ulong id, Socket socket, NatsOptions options, ServerInfo serverInfo, ILogger logger)
public NatsClient(ulong id, Stream stream, Socket socket, NatsOptions options, ServerInfo serverInfo,
AuthService authService, byte[]? nonce, ILogger logger, ServerStats serverStats,
ClientKind kind = ClientKind.Client)
{
Id = id;
Kind = kind;
_socket = socket;
_stream = new NetworkStream(socket, ownsSocket: false);
_stream = stream;
_options = options;
_serverInfo = serverInfo;
_authService = authService;
_logger = logger;
_parser = new NatsParser(options.MaxPayload);
_serverStats = serverStats;
_parser = new NatsParser(options.MaxPayload, options.Trace ? logger : null);
StartTime = DateTime.UtcNow;
}
```
`NetworkStream` is created with `ownsSocket: false`. This keeps socket lifetime management in `NatsServer`, which disposes the socket explicitly in `Dispose`. If `ownsSocket` were `true`, disposing the `NetworkStream` would close the socket, potentially racing with the disposal path in `NatsServer`.
The `stream` parameter is passed in by `NatsServer` already wrapped for the appropriate transport. For a plain TCP connection it is a `NetworkStream`; after a TLS handshake it is an `SslStream`; for WebSocket connections it is a WebSocket framing adapter. `NatsClient` writes to `Stream` throughout and is unaware of which transport is underneath.
`NatsParser` is constructed with `MaxPayload` from options. The parser enforces this limit: a payload larger than `MaxPayload` causes a `ProtocolViolationException` and terminates the connection.
`authService` is the shared `AuthService` instance. `NatsClient` calls `authService.IsAuthRequired` and `authService.Authenticate(context)` during CONNECT processing rather than performing authentication checks inline. `serverStats` is a shared `ServerStats` object whose counters (message counts, slow consumer counts, stale connections) are updated via `Interlocked` operations on the hot path.
`byte[]? nonce` carries a pre-generated challenge value for NKey authentication. When non-null, it is embedded in the INFO payload sent to the client. After `ProcessConnectAsync` completes, the nonce is zeroed via `CryptographicOperations.ZeroMemory` as a defense-in-depth measure.
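As a rough sketch of the nonce lifecycle described above: the helper name `NonceSketch` and the 16-byte default are assumptions made for illustration, while `RandomNumberGenerator.Fill` and `CryptographicOperations.ZeroMemory` are standard .NET APIs.

```csharp
using System;
using System.Security.Cryptography;

// Hypothetical helper; the server's real nonce generation is not shown here.
public static class NonceSketch
{
    // Generates a random challenge. The default length is an assumption.
    public static byte[] Create(int length = 16)
    {
        var nonce = new byte[length];
        RandomNumberGenerator.Fill(nonce);
        return nonce;
    }

    // Overwrites the buffer with zeros; ZeroMemory is not elided by the optimizer.
    public static void Wipe(byte[]? nonce)
    {
        if (nonce is not null)
            CryptographicOperations.ZeroMemory(nonce);
    }
}
```

A plain `Array.Clear` would also zero the buffer, but `ZeroMemory` exists for the case where the compiler might otherwise treat the write as dead.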
`NatsParser` is constructed with `MaxPayload` from options. The parser enforces this limit: a payload larger than `MaxPayload` causes the connection to be closed with `ClientClosedReason.MaxPayloadExceeded`.
## Connection Lifecycle
@@ -68,23 +85,28 @@ public NatsClient(ulong id, Socket socket, NatsOptions options, ServerInfo serve
```csharp
public async Task RunAsync(CancellationToken ct)
{
_clientCts = CancellationTokenSource.CreateLinkedTokenSource(ct);
var pipe = new Pipe();
try
{
await SendInfoAsync(ct);
if (!InfoAlreadySent)
SendInfo();
var fillTask = FillPipeAsync(pipe.Writer, ct);
var processTask = ProcessCommandsAsync(pipe.Reader, ct);
var fillTask = FillPipeAsync(pipe.Writer, _clientCts.Token);
var processTask = ProcessCommandsAsync(pipe.Reader, _clientCts.Token);
var pingTask = RunPingTimerAsync(_clientCts.Token);
var writeTask = RunWriteLoopAsync(_clientCts.Token);
await Task.WhenAny(fillTask, processTask);
await Task.WhenAny(fillTask, processTask, pingTask, writeTask);
}
catch (OperationCanceledException) { }
catch (Exception ex)
catch (OperationCanceledException)
{
_logger.LogDebug(ex, "Client {ClientId} connection error", Id);
MarkClosed(ClientClosedReason.ServerShutdown);
}
finally
{
MarkClosed(ClientClosedReason.ClientClosed);
_outbound.Writer.TryComplete();
Router?.RemoveClient(this);
}
}
@@ -92,10 +114,17 @@ public async Task RunAsync(CancellationToken ct)
The method:
1. Sends `INFO {json}\r\n` immediately on connect — required by the NATS protocol before the client sends CONNECT.
2. Creates a `System.IO.Pipelines.Pipe` and starts two concurrent tasks: `FillPipeAsync` reads bytes from the socket into the pipe's write end; `ProcessCommandsAsync` reads from the pipe's read end and dispatches commands.
3. Awaits `Task.WhenAny`. Either task completing signals the connection is done — either the socket closed (fill task returns) or a protocol error caused the process task to throw.
4. In `finally`, calls `Router?.RemoveClient(this)` to clean up subscriptions and remove the client from the server's client dictionary.
1. Creates `_clientCts` as a `CancellationTokenSource.CreateLinkedTokenSource(ct)`. This gives the client its own cancellation scope linked to the server-wide token. `CloseWithReasonAsync` cancels `_clientCts` to tear down only this connection without affecting the rest of the server.
2. Calls `SendInfo()` unless `InfoAlreadySent` is set — TLS negotiation sends INFO before handing the `SslStream` to `RunAsync`, so the flag prevents a duplicate INFO on TLS connections.
3. Starts four concurrent tasks using `_clientCts.Token`:
- `FillPipeAsync` — reads bytes from `_stream` into the pipe's write end.
- `ProcessCommandsAsync` — reads from the pipe's read end and dispatches commands.
- `RunPingTimerAsync` — sends periodic PING frames and enforces stale-connection detection via `_options.MaxPingsOut`.
- `RunWriteLoopAsync` — drains `_outbound` channel frames and writes them to `_stream`.
4. Awaits `Task.WhenAny`. Any task completing signals the connection is ending — the socket closed, a protocol error was detected, or the server is shutting down.
5. In `finally`, calls `MarkClosed(ClientClosedReason.ClientClosed)` (first-write-wins; earlier calls from error paths set the actual reason), completes the outbound channel writer so `RunWriteLoopAsync` can drain and exit, then calls `Router?.RemoveClient(this)` to remove subscriptions and deregister the client.
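The per-client cancellation scope from step 1 can be illustrated in isolation. This is a small sketch with hypothetical names (`LinkedTokenSketch`), showing only the behavior of `CancellationTokenSource.CreateLinkedTokenSource`: cancelling the child leaves the parent untouched, while cancelling the parent reaches the child.

```csharp
using System;
using System.Threading;

public static class LinkedTokenSketch
{
    // Cancels only the client scope and reports both states afterwards.
    public static (bool Client, bool Server) CancelClientOnly()
    {
        using var serverCts = new CancellationTokenSource();
        using var clientCts = CancellationTokenSource.CreateLinkedTokenSource(serverCts.Token);
        clientCts.Cancel();
        return (clientCts.IsCancellationRequested, serverCts.IsCancellationRequested);
    }

    // Cancels the server-wide scope and reports whether the client scope saw it.
    public static bool ServerCancelReachesClient()
    {
        using var serverCts = new CancellationTokenSource();
        using var clientCts = CancellationTokenSource.CreateLinkedTokenSource(serverCts.Token);
        serverCts.Cancel();
        return clientCts.IsCancellationRequested;
    }
}
```

The linked source holds a registration on the parent token, which is why it needs to be disposed when the client goes away.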
`CloseWithReasonAsync(reason, errMessage)` is the coordinated close path used by protocol violations and slow consumer detection. It sets `CloseReason`, optionally queues a `-ERR` frame, marks the `CloseConnection` flag, completes the channel writer, waits 50 ms for the write loop to flush the error frame, then cancels `_clientCts`. `MarkClosed(reason)` is the lighter first-writer-wins setter used by the `RunAsync` catch blocks.
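A first-writer-wins setter of the kind `MarkClosed` is described as can be sketched with a single compare-and-swap. The names below (`SketchCloseReason`, `CloseReasonSketch`) are hypothetical, and storing the reason in an `int` field is an assumption; the real `CloseReason` property may be implemented differently.

```csharp
using System.Threading;

// Hypothetical reasons; 0 is reserved for "not closed yet".
public enum SketchCloseReason { None = 0, ClientClosed = 1, ServerShutdown = 2, SlowConsumer = 3 }

public sealed class CloseReasonSketch
{
    private int _reason; // 0 == SketchCloseReason.None

    public SketchCloseReason Reason => (SketchCloseReason)Volatile.Read(ref _reason);

    // Records the reason only if none was recorded yet; returns true for the winner.
    public bool MarkClosed(SketchCloseReason reason) =>
        Interlocked.CompareExchange(ref _reason, (int)reason, 0) == 0;
}
```

With this shape, an unconditional `MarkClosed(ClientClosed)` in a `finally` block is harmless: it only takes effect when no error path recorded a more specific reason first.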
`Router?.RemoveClient(this)` uses a null-conditional because `Router` could be null if the client is used in a test context without a server.
@@ -166,48 +195,53 @@ private async Task ProcessCommandsAsync(PipeReader reader, CancellationToken ct)
## Command Dispatch
`DispatchCommandAsync` switches on the `CommandType` returned by the parser:
`DispatchCommandAsync` first consults `CommandMatrix` to verify the command is permitted for this client's `Kind`, then dispatches by `CommandType`:
```csharp
private async ValueTask DispatchCommandAsync(ParsedCommand cmd, CancellationToken ct)
{
Interlocked.Exchange(ref _lastActivityTicks, DateTime.UtcNow.Ticks);
if (!CommandMatrix.IsAllowed(Kind, cmd.Operation))
{
await SendErrAndCloseAsync("Parser Error");
return;
}
switch (cmd.Type)
{
case CommandType.Connect:
ProcessConnect(cmd);
await ProcessConnectAsync(cmd);
break;
case CommandType.Ping:
await WriteAsync(NatsProtocol.PongBytes, ct);
WriteProtocol(NatsProtocol.PongBytes);
break;
case CommandType.Pong:
// Update RTT tracking (placeholder)
Interlocked.Exchange(ref _pingsOut, 0);
Interlocked.Exchange(ref _rtt, DateTime.UtcNow.Ticks - Interlocked.Read(ref _rttStartTicks));
_flags.SetFlag(ClientFlags.FirstPongSent);
break;
case CommandType.Sub:
ProcessSub(cmd);
break;
case CommandType.Unsub:
ProcessUnsub(cmd);
break;
case CommandType.Pub:
case CommandType.HPub:
ProcessPub(cmd);
break;
}
}
```
`ClientCommandMatrix` is a static lookup table keyed by `ClientKind`. Each `ClientKind` has an allowed set of `CommandType` values. `ClientKind.Client` accepts the standard client command set (CONNECT, PING, PONG, SUB, UNSUB, PUB, HPUB). Router-kind clients additionally accept the `RS+` and `RS-` subscription propagation messages used for cluster route subscription exchange. If a command is not allowed for the current kind, the connection is closed with `Parser Error`.
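A per-kind allow-list can be sketched as a dictionary of sets. Everything here is illustrative: `SketchKind`, `SketchCommand`, and `CommandMatrixSketch` are hypothetical names, and the real `ClientCommandMatrix` may use a different representation and a larger set of kinds.

```csharp
using System.Collections.Generic;

public enum SketchKind { Client, Router }
public enum SketchCommand { Connect, Ping, Pong, Sub, Unsub, Pub, HPub, RsPlus, RsMinus }

public sealed class CommandMatrixSketch
{
    private readonly Dictionary<SketchKind, HashSet<SketchCommand>> _allowed;

    public CommandMatrixSketch()
    {
        // Standard client command set.
        var clientSet = new HashSet<SketchCommand>
        {
            SketchCommand.Connect, SketchCommand.Ping, SketchCommand.Pong,
            SketchCommand.Sub, SketchCommand.Unsub, SketchCommand.Pub, SketchCommand.HPub,
        };

        // Routers get the client set plus the RS+/RS- propagation commands.
        var routerSet = new HashSet<SketchCommand>(clientSet)
        {
            SketchCommand.RsPlus, SketchCommand.RsMinus,
        };

        _allowed = new Dictionary<SketchKind, HashSet<SketchCommand>>
        {
            [SketchKind.Client] = clientSet,
            [SketchKind.Router] = routerSet,
        };
    }

    // Unknown kinds are rejected rather than allowed by default.
    public bool IsAllowed(SketchKind kind, SketchCommand command) =>
        _allowed.TryGetValue(kind, out var set) && set.Contains(command);
}
```

Checking the table before the `switch` keeps the per-command handlers free of kind checks.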
Every command dispatch updates `_lastActivityTicks` via `Interlocked.Exchange`. The ping timer in `RunPingTimerAsync` reads `_lastIn` (updated on every received byte batch) to decide whether the client was recently active; `_lastActivityTicks` is the higher-level timestamp exposed as `LastActivity` on the public interface.
### CONNECT
`ProcessConnect` deserializes the JSON payload into a `ClientOptions` record and sets `ConnectReceived = true`. `ClientOptions` carries the `echo` flag (default `true`), the client name, language, and version strings.
### PING / PONG
PING is responded to immediately with the pre-allocated `NatsProtocol.PongBytes` (`"PONG\r\n"`). The response goes through `WriteAsync`, which acquires the write lock. PONG handling is currently a placeholder for future RTT tracking.
PING is responded to immediately with the pre-allocated `NatsProtocol.PongBytes` (`"PONG\r\n"`) via `WriteProtocol`, which calls `QueueOutbound`. PONG resets `_pingsOut` to 0 (preventing stale-connection closure), records RTT by comparing the current tick count against `_rttStartTicks` set when the PING was sent, and sets the `ClientFlags.FirstPongSent` flag to unblock the initial ping timer delay.
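The outstanding-ping counter can be sketched on its own. This is a simplified model under stated assumptions: `PingTrackerSketch` is a hypothetical name, and treating the connection as stale once the counter exceeds the limit (rather than reaches it) is a guess at the comparison, not something this document confirms.

```csharp
using System.Threading;

// Simplified model of the outstanding-ping counter.
public sealed class PingTrackerSketch
{
    private readonly int _maxPingsOut;
    private int _pingsOut;

    public PingTrackerSketch(int maxPingsOut) => _maxPingsOut = maxPingsOut;

    // Called before each PING is sent; returns false once too many PINGs are unanswered.
    public bool TrySendPing() => Interlocked.Increment(ref _pingsOut) <= _maxPingsOut;

    // Called when a PONG arrives; any PONG clears the outstanding count.
    public void OnPong() => Interlocked.Exchange(ref _pingsOut, 0);
}
```

With a limit of 2, two unanswered PINGs are tolerated and the third attempt reports the connection as stale, unless a PONG arrives in between.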
### SUB
@@ -284,49 +318,74 @@ Stats are updated before routing. For HPUB, the combined payload from the parser
## Write Serialization
Multiple concurrent `SendMessageAsync` calls can arrive from different publisher connections at the same time. Without coordination, their writes would interleave on the socket and corrupt the message stream for the receiving client. `_writeLock` prevents this:
All outbound frames flow through a bounded `Channel<ReadOnlyMemory<byte>>` named `_outbound`. The channel has a capacity of 8192 entries, `SingleReader = true`, and `FullMode = BoundedChannelFullMode.Wait`. Every caller that wants to send bytes — protocol responses, MSG deliveries, PING frames, INFO, ERR — calls `QueueOutbound(data)`, which performs two checks before writing to the channel:
```csharp
public async Task SendMessageAsync(string subject, string sid, string? replyTo,
ReadOnlyMemory<byte> headers, ReadOnlyMemory<byte> payload, CancellationToken ct)
public bool QueueOutbound(ReadOnlyMemory<byte> data)
{
Interlocked.Increment(ref OutMsgs);
Interlocked.Add(ref OutBytes, payload.Length + headers.Length);
if (_flags.HasFlag(ClientFlags.CloseConnection))
return false;
byte[] line;
if (headers.Length > 0)
var pending = Interlocked.Add(ref _pendingBytes, data.Length);
if (pending > _options.MaxPending)
{
int totalSize = headers.Length + payload.Length;
line = Encoding.ASCII.GetBytes(
$"HMSG {subject} {sid} {(replyTo != null ? replyTo + " " : "")}{headers.Length} {totalSize}\r\n");
}
else
{
line = Encoding.ASCII.GetBytes(
$"MSG {subject} {sid} {(replyTo != null ? replyTo + " " : "")}{payload.Length}\r\n");
Interlocked.Add(ref _pendingBytes, -data.Length);
_flags.SetFlag(ClientFlags.IsSlowConsumer);
Interlocked.Increment(ref _serverStats.SlowConsumers);
_ = CloseWithReasonAsync(ClientClosedReason.SlowConsumerPendingBytes, NatsProtocol.ErrSlowConsumer);
return false;
}
await _writeLock.WaitAsync(ct);
try
if (!_outbound.Writer.TryWrite(data))
{
await _stream.WriteAsync(line, ct);
if (headers.Length > 0)
await _stream.WriteAsync(headers, ct);
if (payload.Length > 0)
await _stream.WriteAsync(payload, ct);
await _stream.WriteAsync(NatsProtocol.CrLf, ct);
await _stream.FlushAsync(ct);
// Channel is full (all 8192 slots taken) -- slow consumer
_flags.SetFlag(ClientFlags.IsSlowConsumer);
_ = CloseWithReasonAsync(ClientClosedReason.SlowConsumerPendingBytes, NatsProtocol.ErrSlowConsumer);
return false;
}
finally
return true;
}
```
`_pendingBytes` is an `Interlocked`-maintained counter. When it exceeds `_options.MaxPending`, the client is classified as a slow consumer and `CloseWithReasonAsync` is called. If `TryWrite` fails (all 8192 channel slots are occupied), the same slow consumer path fires. In either case the connection is closed with `-ERR 'Slow Consumer'`.
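The two limits (bytes pending and channel slots) can be exercised in a reduced form. This is a sketch under assumptions: `OutboundQueueSketch` is a hypothetical type, it omits the close path and statistics, and it rolls back the byte reservation on both failure paths, which is a simplification rather than a statement about the real `QueueOutbound`.

```csharp
using System;
using System.Threading;
using System.Threading.Channels;

public sealed class OutboundQueueSketch
{
    private readonly Channel<ReadOnlyMemory<byte>> _channel;
    private readonly long _maxPending;
    private long _pendingBytes;

    public OutboundQueueSketch(int capacity, long maxPending)
    {
        _maxPending = maxPending;
        _channel = Channel.CreateBounded<ReadOnlyMemory<byte>>(
            new BoundedChannelOptions(capacity)
            {
                SingleReader = true,
                FullMode = BoundedChannelFullMode.Wait, // TryWrite returns false when full
            });
    }

    public long PendingBytes => Interlocked.Read(ref _pendingBytes);

    // Reserves the bytes first, then tries to enqueue; undoes the reservation on failure.
    public bool TryQueue(ReadOnlyMemory<byte> data)
    {
        long pending = Interlocked.Add(ref _pendingBytes, data.Length);
        if (pending > _maxPending || !_channel.Writer.TryWrite(data))
        {
            Interlocked.Add(ref _pendingBytes, -data.Length);
            return false; // the caller would treat this as a slow consumer
        }
        return true;
    }
}
```

Reserving with `Interlocked.Add` before the write means concurrent producers cannot jointly overshoot the byte limit by more than their own frame.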
`RunWriteLoopAsync` is the sole reader of the channel, running as one of the four concurrent tasks in `RunAsync`:
```csharp
private async Task RunWriteLoopAsync(CancellationToken ct)
{
var reader = _outbound.Reader;
while (await reader.WaitToReadAsync(ct))
{
_writeLock.Release();
long batchBytes = 0;
while (reader.TryRead(out var data))
{
await _stream.WriteAsync(data, ct);
batchBytes += data.Length;
}
using var flushCts = CancellationTokenSource.CreateLinkedTokenSource(ct);
flushCts.CancelAfter(_options.WriteDeadline);
try
{
await _stream.FlushAsync(flushCts.Token);
}
catch (OperationCanceledException) when (!ct.IsCancellationRequested)
{
// Flush timed out -- slow consumer on the write side
await CloseWithReasonAsync(ClientClosedReason.SlowConsumerWriteDeadline, NatsProtocol.ErrSlowConsumer);
return;
}
Interlocked.Add(ref _pendingBytes, -batchBytes);
}
}
```
The control line is constructed before acquiring the lock so the string formatting work happens outside the critical section. Once the lock is held, all writes for one message — control line, optional headers, payload, and trailing `\r\n` — happen atomically from the perspective of other writers.
`WaitToReadAsync` yields until at least one frame is available. The inner `TryRead` loop drains as many frames as are available without yielding, batching them into a single `FlushAsync`. This amortizes the flush cost over multiple frames when the client is keeping up. After the flush, `_pendingBytes` is decremented by the batch size.
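The drain-then-flush pattern can be reproduced with a `MemoryStream` standing in for the socket. This is a minimal sketch: `WriteLoopSketch` and its methods are hypothetical, the deadline and pending-byte bookkeeping are left out, and the flush counter exists only to make the batching visible.

```csharp
using System;
using System.IO;
using System.Text;
using System.Threading;
using System.Threading.Channels;
using System.Threading.Tasks;

public static class WriteLoopSketch
{
    // Drains the reader into the stream; returns how many flushes were issued.
    public static async Task<int> DrainAsync(
        ChannelReader<ReadOnlyMemory<byte>> reader, Stream stream, CancellationToken ct = default)
    {
        int flushes = 0;
        while (await reader.WaitToReadAsync(ct))
        {
            // Write every frame that is already queued, then flush once.
            while (reader.TryRead(out var frame))
                await stream.WriteAsync(frame, ct);
            await stream.FlushAsync(ct);
            flushes++;
        }
        return flushes; // the loop ends when the writer completes and the queue is empty
    }

    // Queues three frames up front, completes the channel, and drains it.
    public static async Task<(int Flushes, string Wire)> DemoAsync()
    {
        var channel = Channel.CreateUnbounded<ReadOnlyMemory<byte>>();
        foreach (var text in new[] { "PING\r\n", "PONG\r\n", "+OK\r\n" })
            channel.Writer.TryWrite(Encoding.ASCII.GetBytes(text));
        channel.Writer.Complete();

        using var wire = new MemoryStream();
        int flushes = await DrainAsync(channel.Reader, wire);
        return (flushes, Encoding.ASCII.GetString(wire.ToArray()));
    }
}
```

Because all three frames are queued before the loop starts, `DemoAsync` writes them in order and issues a single flush.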
Stats (`OutMsgs`, `OutBytes`) are updated before the lock because they are independent of the write ordering constraint.
If `FlushAsync` does not complete within `_options.WriteDeadline`, the write-deadline slow consumer path fires. `WriteDeadline` is distinct from `MaxPending`: `MaxPending` catches a client whose channel is backing up due to slow reads; `WriteDeadline` catches a client whose OS socket send buffer is stalled (e.g. the TCP window is closed).
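The distinction between a deadline expiry and an outer cancellation rests on the exception filter. The sketch below isolates that pattern; `WriteDeadlineSketch`, `FlushOutcome`, and the delegate-based `flush` parameter are hypothetical and stand in for the stream flush.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public enum FlushOutcome { Completed, DeadlineExceeded, Cancelled }

public static class WriteDeadlineSketch
{
    public static async Task<FlushOutcome> FlushWithDeadlineAsync(
        Func<CancellationToken, Task> flush, TimeSpan deadline, CancellationToken ct)
    {
        // The linked source fires on either the outer token or the deadline timer.
        using var flushCts = CancellationTokenSource.CreateLinkedTokenSource(ct);
        flushCts.CancelAfter(deadline);
        try
        {
            await flush(flushCts.Token);
            return FlushOutcome.Completed;
        }
        catch (OperationCanceledException) when (!ct.IsCancellationRequested)
        {
            return FlushOutcome.DeadlineExceeded; // only the timer fired
        }
        catch (OperationCanceledException)
        {
            return FlushOutcome.Cancelled; // the outer token was cancelled
        }
    }
}
```

A flush that never completes, for example `token => Task.Delay(Timeout.Infinite, token)`, yields `DeadlineExceeded` when the outer token is idle and `Cancelled` when the outer token has already been cancelled.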
## Subscription Cleanup
@@ -348,22 +407,25 @@ This removes every subscription this client holds from the shared `SubList` trie
```csharp
public void Dispose()
{
_permissions?.Dispose();
_outbound.Writer.TryComplete();
_clientCts?.Dispose();
_stream.Dispose();
_socket.Dispose();
_writeLock.Dispose();
}
```
Disposing `_stream` closes the network stream. Disposing `_socket` closes the OS socket. Any in-flight `ReadAsync` or `WriteAsync` will fault with an `ObjectDisposedException` or `IOException`, which causes the read/write tasks to terminate. `_writeLock` is disposed last to release the `SemaphoreSlim`'s internal handle.
`_outbound.Writer.TryComplete()` is called before disposing the stream so that `RunWriteLoopAsync` can observe channel completion and exit cleanly rather than faulting on a disposed stream. `_clientCts` is disposed to release the linked token registration. Disposing `_stream` and `_socket` closes the underlying transport; any in-flight `ReadAsync` or `WriteAsync` will fault with an `ObjectDisposedException` or `IOException`, which causes the remaining tasks to terminate.
## Go Reference
The Go counterpart is `golang/nats-server/server/client.go`. Key differences in the .NET port:
- Go uses separate goroutines for `readLoop` and `writeLoop`; the .NET port uses `FillPipeAsync` and `ProcessCommandsAsync` as concurrent `Task`s sharing a `Pipe`.
- Go uses separate goroutines for `readLoop` and `writeLoop`; the .NET port uses `FillPipeAsync`, `ProcessCommandsAsync`, `RunPingTimerAsync`, and `RunWriteLoopAsync` as four concurrent `Task`s all linked to `_clientCts`.
- Go uses dynamic buffer sizing (512 to 65536 bytes) in `readLoop`; the .NET port requests 4096-byte chunks from the `PipeWriter`.
- Go uses a mutex for write serialization (`c.mu`); the .NET port uses `SemaphoreSlim(1,1)` to allow `await`-based waiting without blocking a thread.
- The `System.IO.Pipelines` `Pipe` replaces Go's direct `net.Conn` reads. This separates the I/O pump from command parsing and avoids partial-read handling in the parser itself.
- Go reads from `net.Conn` directly into a per-client read buffer; the .NET port uses `System.IO.Pipelines` for zero-copy parsing. The pipe separates the I/O pump from command parsing, avoids partial-read handling in the parser, and allows the `PipeReader` backpressure mechanism to control how much data is buffered between fill and process.
- Go's `flushOutbound()` batches queued writes and flushes them under `c.mu`; the .NET port uses a bounded `Channel<ReadOnlyMemory<byte>>(8192)` write queue with a `_pendingBytes` counter for backpressure. `RunWriteLoopAsync` is the sole reader: it drains all available frames in one batch and calls `FlushAsync` once per batch, with a `WriteDeadline` timeout to detect stale write-side connections.
- Go uses `c.mu` (a `sync.Mutex`) for write serialization; the .NET port eliminates the write lock entirely. `RunWriteLoopAsync` is the only task that writes to `_stream`, so no locking is required on the write path.
## Related Documentation
@@ -372,4 +434,4 @@ The Go counterpart is `golang/nats-server/server/client.go`. Key differences in
- [SubList Trie](../Subscriptions/SubList.md)
- [Subscriptions Overview](../Subscriptions/Overview.md)
<!-- Last verified against codebase: 2026-02-22 -->
<!-- Last verified against codebase: 2026-02-23 -->


@@ -31,20 +31,46 @@ Defining them separately makes unit testing straightforward: a test can supply a
```csharp
public sealed class NatsServer : IMessageRouter, ISubListAccess, IDisposable
{
private readonly NatsOptions _options;
// Client registry
private readonly ConcurrentDictionary<ulong, NatsClient> _clients = new();
private readonly SubList _subList = new();
private readonly ServerInfo _serverInfo;
private readonly ILogger<NatsServer> _logger;
private readonly ILoggerFactory _loggerFactory;
private Socket? _listener;
private readonly ConcurrentQueue<ClosedClient> _closedClients = new();
private ulong _nextClientId;
private int _activeClientCount;
public SubList SubList => _subList;
// Account system
private readonly ConcurrentDictionary<string, Account> _accounts = new(StringComparer.Ordinal);
private readonly Account _globalAccount;
private readonly Account _systemAccount;
private AuthService _authService;
// Subsystem managers (null when not configured)
private readonly RouteManager? _routeManager;
private readonly GatewayManager? _gatewayManager;
private readonly LeafNodeManager? _leafNodeManager;
private readonly JetStreamService? _jetStreamService;
private readonly JetStreamPublisher? _jetStreamPublisher;
private MonitorServer? _monitorServer;
// TLS / transport
private readonly SslServerAuthenticationOptions? _sslOptions;
private readonly TlsRateLimiter? _tlsRateLimiter;
private Socket? _listener;
private Socket? _wsListener;
// Shutdown coordination
private readonly CancellationTokenSource _quitCts = new();
private readonly TaskCompletionSource _shutdownComplete = new(TaskCreationOptions.RunContinuationsAsynchronously);
private readonly TaskCompletionSource _acceptLoopExited = new(TaskCreationOptions.RunContinuationsAsynchronously);
private int _shutdown;
private int _lameDuck;
public SubList SubList => _globalAccount.SubList;
}
```
`_clients` tracks every live connection. `_nextClientId` is incremented with `Interlocked.Increment` for each accepted socket, producing monotonically increasing client IDs without a lock. `_loggerFactory` is retained so per-client loggers can be created at accept time, each tagged with the client ID.
`_clients` tracks every live connection. `_closedClients` holds a capped ring of recently disconnected client snapshots (used by `/connz`). `_nextClientId` is incremented with `Interlocked.Increment` for each accepted socket, producing monotonically increasing client IDs without a lock. `_loggerFactory` is retained so per-client loggers can be created at accept time, each tagged with the client ID.
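Lock-free ID assignment with `Interlocked.Increment` can be checked with a short parallel run. The type `ClientIdSketch` and its `Demo` method are hypothetical and exist only to show that concurrent callers never receive the same ID.

```csharp
using System.Collections.Concurrent;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public sealed class ClientIdSketch
{
    private ulong _nextClientId;

    // Each caller gets a unique, increasing value without taking a lock.
    public ulong Next() => Interlocked.Increment(ref _nextClientId);

    // Requests `count` IDs in parallel and reports how many were distinct and the largest one.
    public static (int Distinct, ulong Max) Demo(int count)
    {
        var generator = new ClientIdSketch();
        var ids = new ConcurrentBag<ulong>();
        Parallel.For(0, count, _ => ids.Add(generator.Next()));
        return (ids.Distinct().Count(), ids.Max());
    }
}
```

The IDs are unique and increasing per call, although the order in which parallel callers observe them is not defined.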
Each subsystem manager field (`_routeManager`, `_gatewayManager`, `_leafNodeManager`, `_jetStreamService`, `_monitorServer`) is `null` when the corresponding options section is absent from the configuration. Code that interacts with these managers always guards with a null check.
### Constructor
@@ -70,6 +96,10 @@ public NatsServer(NatsOptions options, ILoggerFactory loggerFactory)
The `ServerId` is derived from a GUID — taking the first 20 characters of its `"N"` format (32 hex digits, no hyphens) and uppercasing them. This matches the fixed-length alphanumeric server ID format used by the Go server.
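The derivation described above fits in one expression. The wrapper `ServerIdSketch` is a hypothetical name; the formatting calls are standard .NET.

```csharp
using System;

public static class ServerIdSketch
{
    // "N" format yields 32 lowercase hex digits; keep the first 20 and uppercase them.
    public static string Create() =>
        Guid.NewGuid().ToString("N").Substring(0, 20).ToUpperInvariant();
}
```

The result is always 20 characters drawn from `0-9` and `A-F`.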
Subsystem managers are instantiated in the constructor if the corresponding options sections are non-null: `options.Cluster != null` creates a `RouteManager`, `options.Gateway != null` creates a `GatewayManager`, `options.LeafNode != null` creates a `LeafNodeManager`, and `options.JetStream != null` creates `JetStreamService`, `JetStreamApiRouter`, `StreamManager`, `ConsumerManager`, and `JetStreamPublisher`. TLS options are compiled into `SslServerAuthenticationOptions` via `TlsHelper.BuildServerAuthOptions` when `options.HasTls` is true.
Before entering the accept loop, `StartAsync` starts the monitoring server, WebSocket listener, route connections, gateway connections, leaf node listener, and JetStream service.
## Accept Loop
`StartAsync` binds the socket, enables `SO_REUSEADDR` so the port can be reused immediately after a restart, and enters an async accept loop:
@@ -103,6 +133,37 @@ public async Task StartAsync(CancellationToken ct)
The backlog of 128 passed to `Listen` controls the OS-level queue of unaccepted connections — matching the Go server default.
### TLS wrapping and WebSocket upgrade
After `AcceptAsync` returns a socket, the connection is handed to `AcceptClientAsync`, which performs transport negotiation before constructing `NatsClient`:
```csharp
private async Task AcceptClientAsync(Socket socket, ulong clientId, CancellationToken ct)
{
if (_tlsRateLimiter != null)
await _tlsRateLimiter.WaitAsync(ct);
var networkStream = new NetworkStream(socket, ownsSocket: false);
// TlsConnectionWrapper performs the TLS handshake if _sslOptions is set;
// returns the raw NetworkStream unchanged when TLS is not configured.
var (stream, infoAlreadySent) = await TlsConnectionWrapper.NegotiateAsync(
socket, networkStream, _options, _sslOptions, _serverInfo,
_loggerFactory.CreateLogger("NATS.Server.Tls"), ct);
// ...auth nonce generation, TLS state extraction...
var client = new NatsClient(clientId, stream, socket, _options, clientInfo,
_authService, nonce, clientLogger, _stats);
client.Router = this;
client.TlsState = tlsState;
client.InfoAlreadySent = infoAlreadySent;
_clients[clientId] = client;
}
```
WebSocket connections follow a parallel path through `AcceptWebSocketClientAsync`. After optional TLS negotiation via `TlsConnectionWrapper`, the HTTP upgrade handshake is performed by `WsUpgrade.TryUpgradeAsync`. On success, the raw stream is wrapped in a `WsConnection` that handles WebSocket framing, masking, and per-message compression before `NatsClient` is constructed.
## Message Routing
`ProcessMessage` is called by `NatsClient` for every PUB or HPUB command. It is the hot path: called once per published message.
@@ -175,9 +236,11 @@ private static void DeliverMessage(Subscription sub, string subject, string? rep
}
```
`MessageCount` is incremented atomically before the send. If it exceeds `MaxMessages` (set by an UNSUB with a message count argument), the message is silently dropped. The subscription itself is not removed here — removal happens when the client processes the count limit through `ProcessUnsub`, or when the client disconnects and `RemoveAllSubscriptions` is called.
`MessageCount` is incremented atomically before the send. If it exceeds `MaxMessages` (set by an UNSUB with a message count argument), the subscription is removed from the trie immediately (`subList.Remove(sub)`) and from the client's tracking table (`client.RemoveSubscription(sub.Sid)`), then the message is dropped without delivery.
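The count-then-check step can be sketched separately from the trie and client bookkeeping. `SubscriptionSketch` is a hypothetical type, and treating a limit of zero as "unlimited" is an assumption made for this example.

```csharp
using System.Threading;

public sealed class SubscriptionSketch
{
    private readonly long _maxMessages; // 0 means unlimited (assumption)
    private long _messageCount;

    public SubscriptionSketch(long maxMessages) => _maxMessages = maxMessages;

    // Counts the delivery attempt atomically; returns false once the limit has been passed,
    // at which point the caller would remove the subscription and drop the message.
    public bool TryCountDelivery()
    {
        long count = Interlocked.Increment(ref _messageCount);
        return _maxMessages <= 0 || count <= _maxMessages;
    }
}
```

Incrementing before the comparison means that, under concurrent deliveries, at most `_maxMessages` callers ever see `true`.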
`SendMessageAsync` is again fire-and-forget. Multiple deliveries to different clients happen concurrently.
`SendMessage` enqueues the serialized wire bytes on the client's outbound channel. Multiple deliveries to different clients happen concurrently.
After local delivery, `ProcessMessage` offers the message to the JetStream publisher: if the subject matches a configured stream, `TryCaptureJetStreamPublish` stores the message and the `PubAck` is sent back to the publisher via `sender.RecordJetStreamPubAck`. Route forwarding is handled separately by `OnLocalSubscription`, which calls `_routeManager?.PropagateLocalSubscription` when a new subscription is added. This keeps remote peers informed of local interest without re-routing individual messages inside `ProcessMessage`.
## Client Removal
@@ -193,17 +256,34 @@ public void RemoveClient(NatsClient client)
## Shutdown and Dispose
Graceful shutdown is initiated by `ShutdownAsync`. It uses `_quitCts` — a `CancellationTokenSource` shared between `StartAsync` and all subsystem managers — to signal all internal loops to stop:
```csharp
public void Dispose()
public async Task ShutdownAsync()
{
_listener?.Dispose();
foreach (var client in _clients.Values)
client.Dispose();
_subList.Dispose();
if (Interlocked.CompareExchange(ref _shutdown, 1, 0) != 0)
return; // Already shutting down
// Signal all internal loops to stop
await _quitCts.CancelAsync();
// Close listeners to stop accept loops
_listener?.Close();
_wsListener?.Close();
if (_routeManager != null) await _routeManager.DisposeAsync();
if (_gatewayManager != null) await _gatewayManager.DisposeAsync();
if (_leafNodeManager != null) await _leafNodeManager.DisposeAsync();
if (_jetStreamService != null) await _jetStreamService.DisposeAsync();
// Wait for accept loops to exit, flush and close clients, drain active tasks...
if (_monitorServer != null) await _monitorServer.DisposeAsync();
_shutdownComplete.TrySetResult();
}
```
Disposing the listener socket causes `AcceptAsync` to throw, which unwinds `StartAsync`. Client sockets are disposed, which closes their `NetworkStream` and causes their read loops to terminate. `SubList.Dispose` releases its `ReaderWriterLockSlim`.
Lame-duck mode is a two-phase variant initiated by `LameDuckShutdownAsync`. The `_lameDuck` field (checked via `IsLameDuckMode`) is set first, which stops the accept loops from accepting new connections while existing clients are given a grace period (`options.LameDuckGracePeriod`) to disconnect naturally. After the grace period, remaining clients are stagger-closed over `options.LameDuckDuration` to avoid a thundering herd of reconnects, then `ShutdownAsync` completes the teardown.
`Dispose` is a synchronous fallback. If `ShutdownAsync` has not already run, it blocks on it. It then disposes `_quitCts`, `_tlsRateLimiter`, the listener sockets, all subsystem managers (route, gateway, leaf node, JetStream), all connected clients, and all accounts. The `PosixSignalRegistration` instances are also disposed, which deregisters the signal handlers.
## Go Reference
@@ -212,6 +292,7 @@ The Go counterpart is `golang/nats-server/server/server.go`. Key differences in
- Go uses goroutines for the accept loop and per-client read/write loops; the .NET port uses `async`/`await` with `Task`.
- Go uses `sync/atomic` for client ID generation; the .NET port uses `Interlocked.Increment`.
- Go passes the server to clients via the `srv` field on the client struct; the .NET port uses the `IMessageRouter` interface through the `Router` property.
- POSIX signal handlers — `SIGTERM`/`SIGQUIT` for shutdown, `SIGHUP` for config reload, `SIGUSR1` for log file reopen, `SIGUSR2` for lame-duck mode — are registered in `HandleSignals` via `PosixSignalRegistration.Create`. `SIGUSR1` and `SIGUSR2` are skipped on Windows. Registrations are stored in `_signalRegistrations` and disposed during `Dispose`.
## Related Documentation
@@ -220,4 +301,4 @@ The Go counterpart is `golang/nats-server/server/server.go`. Key differences in
- [Protocol Overview](../Protocol/Overview.md)
- [Configuration](../Configuration/Overview.md)
<!-- Last verified against codebase: 2026-02-22 -->
<!-- Last verified against codebase: 2026-02-23 -->