docs: add Authentication, Clustering, JetStream, Monitoring overviews; update existing docs

New files:
- Documentation/Authentication/Overview.md — all 7 auth mechanisms with real source
  snippets (NKey/JWT/username-password/token/TLS mapping), nonce generation, account
  system, permissions, JWT permission templates
- Documentation/Clustering/Overview.md — route TCP handshake, in-process subscription
  propagation, gateway/leaf node stubs, honest gaps list
- Documentation/JetStream/Overview.md — API surface (4 handled subjects), streams,
  consumers, storage (MemStore/FileStore), in-process RAFT, mirror/source, gaps list
- Documentation/Monitoring/Overview.md — all 12 endpoints with real field tables,
  Go compatibility notes

Updated files:
- GettingStarted/Architecture.md — 14-subdirectory tree, real NatsClient/NatsServer
  field snippets, 9 new Go reference rows, Channel write queue design choice
- GettingStarted/Setup.md — xUnit 3, 100 test files grouped by area
- Operations/Overview.md — 99 test files, accurate Program.cs snippet, limitations
  section renamed to "Known Gaps vs Go Reference" with 7 real gaps
- Server/Overview.md — grouped fields, TLS/WS accept path, lame-duck mode, POSIX signals
- Configuration/Overview.md — 14 subsystem option tables, 24-row CLI table, LogOverrides
- Server/Client.md — Channel write queue, 4-task RunAsync, CommandMatrix, real fields

All docs verified against codebase 2026-02-23; 713 tests pass.
Joseph Doherty
2026-02-23 10:14:18 -05:00
parent 9efe787cab
commit e553db6d40
10 changed files with 2415 additions and 186 deletions


@@ -31,20 +31,46 @@ Defining them separately makes unit testing straightforward: a test can supply a
```csharp
public sealed class NatsServer : IMessageRouter, ISubListAccess, IDisposable
{
private readonly NatsOptions _options;
// Client registry
private readonly ConcurrentDictionary<ulong, NatsClient> _clients = new();
private readonly SubList _subList = new();
private readonly ServerInfo _serverInfo;
private readonly ILogger<NatsServer> _logger;
private readonly ILoggerFactory _loggerFactory;
private Socket? _listener;
private readonly ConcurrentQueue<ClosedClient> _closedClients = new();
private ulong _nextClientId;
private int _activeClientCount;
public SubList SubList => _subList;
// Account system
private readonly ConcurrentDictionary<string, Account> _accounts = new(StringComparer.Ordinal);
private readonly Account _globalAccount;
private readonly Account _systemAccount;
private AuthService _authService;
// Subsystem managers (null when not configured)
private readonly RouteManager? _routeManager;
private readonly GatewayManager? _gatewayManager;
private readonly LeafNodeManager? _leafNodeManager;
private readonly JetStreamService? _jetStreamService;
private readonly JetStreamPublisher? _jetStreamPublisher;
private MonitorServer? _monitorServer;
// TLS / transport
private readonly SslServerAuthenticationOptions? _sslOptions;
private readonly TlsRateLimiter? _tlsRateLimiter;
private Socket? _listener;
private Socket? _wsListener;
// Shutdown coordination
private readonly CancellationTokenSource _quitCts = new();
private readonly TaskCompletionSource _shutdownComplete = new(TaskCreationOptions.RunContinuationsAsynchronously);
private readonly TaskCompletionSource _acceptLoopExited = new(TaskCreationOptions.RunContinuationsAsynchronously);
private int _shutdown;
private int _lameDuck;
public SubList SubList => _globalAccount.SubList;
}
```
`_clients` tracks every live connection. `_nextClientId` is incremented with `Interlocked.Increment` for each accepted socket, producing monotonically increasing client IDs without a lock. `_loggerFactory` is retained so per-client loggers can be created at accept time, each tagged with the client ID.
`_clients` tracks every live connection. `_closedClients` holds a capped ring of recently disconnected client snapshots (used by `/connz`). `_nextClientId` is incremented with `Interlocked.Increment` for each accepted socket, producing monotonically increasing client IDs without a lock. `_loggerFactory` is retained so per-client loggers can be created at accept time, each tagged with the client ID.
Each subsystem manager field (`_routeManager`, `_gatewayManager`, `_leafNodeManager`, `_jetStreamService`, `_monitorServer`) is `null` when the corresponding options section is absent from the configuration. Code that interacts with these managers always guards with a null check.
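The lock-free ID allocation described above can be sketched in isolation. This is a minimal illustration, not the server's code: `ClientIdAllocator` is a hypothetical stand-in for the relevant part of `NatsServer`, and only the `_nextClientId` field name comes from the snippet above.

```csharp
using System.Threading;

// Hypothetical stand-in for the ID-allocation part of NatsServer.
public sealed class ClientIdAllocator
{
    private ulong _nextClientId;

    // Interlocked.Increment atomically adds one and returns the new value,
    // so concurrent callers each receive a distinct, increasing ID.
    // The first ID handed out is 1 because the field starts at 0.
    public ulong Next() => Interlocked.Increment(ref _nextClientId);
}
```

Because the increment and the read happen as one atomic operation, two accept paths racing on `Next()` cannot observe the same ID, which is why no lock is needed.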
### Constructor
@@ -70,6 +96,10 @@ public NatsServer(NatsOptions options, ILoggerFactory loggerFactory)
The `ServerId` is derived from a GUID — taking the first 20 characters of its `"N"` format (32 hex digits, no hyphens) and uppercasing them. This matches the fixed-length alphanumeric server ID format used by the Go server.
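The derivation described above can be written as a one-line expression. The sketch below assumes nothing beyond the steps named in the paragraph; the class and method names (`ServerIdSketch`, `NewServerId`) are hypothetical and the actual constructor may structure this differently.

```csharp
using System;

public static class ServerIdSketch
{
    // "N" formats the GUID as 32 hex digits with no hyphens.
    // The range [..20] keeps the first 20 characters, and
    // ToUpperInvariant uppercases the hex letters a-f.
    public static string NewServerId() =>
        Guid.NewGuid().ToString("N")[..20].ToUpperInvariant();
}
```

The result is always 20 characters drawn from `0-9` and `A-F`.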
Subsystem managers are instantiated in the constructor if the corresponding options sections are non-null: `options.Cluster != null` creates a `RouteManager`, `options.Gateway != null` creates a `GatewayManager`, `options.LeafNode != null` creates a `LeafNodeManager`, and `options.JetStream != null` creates `JetStreamService`, `JetStreamApiRouter`, `StreamManager`, `ConsumerManager`, and `JetStreamPublisher`. TLS options are compiled into `SslServerAuthenticationOptions` via `TlsHelper.BuildServerAuthOptions` when `options.HasTls` is true.
Before entering the accept loop, `StartAsync` starts the monitoring server, WebSocket listener, route connections, gateway connections, leaf node listener, and JetStream service.
## Accept Loop
`StartAsync` binds the socket, enables `SO_REUSEADDR` so the port can be reused immediately after a restart, and enters an async accept loop:
@@ -103,6 +133,37 @@ public async Task StartAsync(CancellationToken ct)
The backlog of 128 passed to `Listen` controls the OS-level queue of unaccepted connections — matching the Go server default.
### TLS wrapping and WebSocket upgrade
After `AcceptAsync` returns a socket, the connection is handed to `AcceptClientAsync`, which performs transport negotiation before constructing `NatsClient`:
```csharp
private async Task AcceptClientAsync(Socket socket, ulong clientId, CancellationToken ct)
{
if (_tlsRateLimiter != null)
await _tlsRateLimiter.WaitAsync(ct);
var networkStream = new NetworkStream(socket, ownsSocket: false);
// TlsConnectionWrapper performs the TLS handshake if _sslOptions is set;
// returns the raw NetworkStream unchanged when TLS is not configured.
var (stream, infoAlreadySent) = await TlsConnectionWrapper.NegotiateAsync(
socket, networkStream, _options, _sslOptions, _serverInfo,
_loggerFactory.CreateLogger("NATS.Server.Tls"), ct);
// ...auth nonce generation, TLS state extraction...
var client = new NatsClient(clientId, stream, socket, _options, clientInfo,
_authService, nonce, clientLogger, _stats);
client.Router = this;
client.TlsState = tlsState;
client.InfoAlreadySent = infoAlreadySent;
_clients[clientId] = client;
}
```
WebSocket connections follow a parallel path through `AcceptWebSocketClientAsync`. After optional TLS negotiation via `TlsConnectionWrapper`, the HTTP upgrade handshake is performed by `WsUpgrade.TryUpgradeAsync`. On success, the raw stream is wrapped in a `WsConnection` that handles WebSocket framing, masking, and per-message compression before `NatsClient` is constructed.
## Message Routing
`ProcessMessage` is called by `NatsClient` for every PUB or HPUB command. It is the hot path: called once per published message.
@@ -175,9 +236,11 @@ private static void DeliverMessage(Subscription sub, string subject, string? rep
}
```
`MessageCount` is incremented atomically before the send. If it exceeds `MaxMessages` (set by an UNSUB with a message count argument), the message is silently dropped. The subscription itself is not removed here — removal happens when the client processes the count limit through `ProcessUnsub`, or when the client disconnects and `RemoveAllSubscriptions` is called.
`MessageCount` is incremented atomically before the send. If it exceeds `MaxMessages` (set by an UNSUB with a message count argument), the subscription is removed from the trie immediately (`subList.Remove(sub)`) and from the client's tracking table (`client.RemoveSubscription(sub.Sid)`), then the message is dropped without delivery.
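The count-then-check behaviour can be sketched as follows. This is an illustration under stated assumptions: `SubscriptionSketch` and `TryCountDelivery` are hypothetical names, and treating `MaxMessages <= 0` as "no limit" is an assumption rather than something the source confirms. Removal from the trie and the client's tracking table is left to the caller.

```csharp
using System.Threading;

// Hypothetical, reduced model of a subscription's auto-unsubscribe counter.
public sealed class SubscriptionSketch
{
    private long _messageCount;

    // Assumption: zero or negative means the subscription has no message limit.
    public long MaxMessages { get; init; }

    // Atomically counts the delivery attempt, then reports whether the
    // message is still within the limit. A false result means the caller
    // should remove the subscription and drop the message.
    public bool TryCountDelivery()
    {
        long count = Interlocked.Increment(ref _messageCount);
        return MaxMessages <= 0 || count <= MaxMessages;
    }
}
```

Incrementing before the comparison means that concurrent deliveries each see a distinct count, so at most `MaxMessages` of them pass the check.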
`SendMessageAsync` is again fire-and-forget. Multiple deliveries to different clients happen concurrently.
`SendMessage` enqueues the serialized wire bytes on the client's outbound channel. Multiple deliveries to different clients happen concurrently.
After local delivery, `ProcessMessage` hands the message to the JetStream publisher: if the subject matches a configured stream, `TryCaptureJetStreamPublish` stores the message and the `PubAck` is sent back to the publisher via `sender.RecordJetStreamPubAck`. Route forwarding is handled separately by `OnLocalSubscription`, which calls `_routeManager?.PropagateLocalSubscription` when a new subscription is added. This keeps remote peers informed of local interest without re-routing individual messages inside `ProcessMessage`.
## Client Removal
@@ -193,17 +256,34 @@ public void RemoveClient(NatsClient client)
## Shutdown and Dispose
Graceful shutdown is initiated by `ShutdownAsync`. It uses `_quitCts` — a `CancellationTokenSource` shared between `StartAsync` and all subsystem managers — to signal all internal loops to stop:
```csharp
public void Dispose()
public async Task ShutdownAsync()
{
_listener?.Dispose();
foreach (var client in _clients.Values)
client.Dispose();
_subList.Dispose();
if (Interlocked.CompareExchange(ref _shutdown, 1, 0) != 0)
return; // Already shutting down
// Signal all internal loops to stop
await _quitCts.CancelAsync();
// Close listeners to stop accept loops
_listener?.Close();
_wsListener?.Close();
if (_routeManager != null) await _routeManager.DisposeAsync();
if (_gatewayManager != null) await _gatewayManager.DisposeAsync();
if (_leafNodeManager != null) await _leafNodeManager.DisposeAsync();
if (_jetStreamService != null) await _jetStreamService.DisposeAsync();
// Wait for accept loops to exit, flush and close clients, drain active tasks...
if (_monitorServer != null) await _monitorServer.DisposeAsync();
_shutdownComplete.TrySetResult();
}
```
Disposing the listener socket causes `AcceptAsync` to throw, which unwinds `StartAsync`. Client sockets are disposed, which closes their `NetworkStream` and causes their read loops to terminate. `SubList.Dispose` releases its `ReaderWriterLockSlim`.
Lame-duck mode is a two-phase variant initiated by `LameDuckShutdownAsync`. The `_lameDuck` field (checked via `IsLameDuckMode`) is set first, which stops the accept loops from accepting new connections while existing clients are given a grace period (`options.LameDuckGracePeriod`) to disconnect naturally. After the grace period, remaining clients are stagger-closed over `options.LameDuckDuration` to avoid a thundering herd of reconnects, then `ShutdownAsync` completes the teardown.
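One way to picture the stagger is as an even spacing of closes across the configured duration. The sketch below is an assumption about how such spacing could be computed, not a description of what `LameDuckShutdownAsync` does; `LameDuckSketch` and `CloseInterval` are hypothetical names, and the real implementation may clamp or randomize the interval.

```csharp
using System;

public static class LameDuckSketch
{
    // Spreads clientCount closes evenly over the given duration.
    // With zero or one client there is nothing to stagger.
    public static TimeSpan CloseInterval(TimeSpan duration, int clientCount)
    {
        if (clientCount <= 1)
            return TimeSpan.Zero;
        return duration / clientCount;
    }
}
```

For example, 100 remaining clients and a 10-second duration give one close every 100 milliseconds, so reconnect attempts reach the rest of the cluster as a trickle rather than a burst.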
`Dispose` is a synchronous fallback. If `ShutdownAsync` has not already run, it blocks on it. It then disposes `_quitCts`, `_tlsRateLimiter`, the listener sockets, all subsystem managers (route, gateway, leaf node, JetStream), all connected clients, and all accounts. The `PosixSignalRegistration` handles are also disposed, which deregisters the signal handlers.
## Go Reference
@@ -212,6 +292,7 @@ The Go counterpart is `golang/nats-server/server/server.go`. Key differences in
- Go uses goroutines for the accept loop and per-client read/write loops; the .NET port uses `async`/`await` with `Task`.
- Go uses `sync/atomic` for client ID generation; the .NET port uses `Interlocked.Increment`.
- Go passes the server to clients via the `srv` field on the client struct; the .NET port uses the `IMessageRouter` interface through the `Router` property.
- POSIX signal handlers — `SIGTERM`/`SIGQUIT` for shutdown, `SIGHUP` for config reload, `SIGUSR1` for log file reopen, `SIGUSR2` for lame-duck mode — are registered in `HandleSignals` via `PosixSignalRegistration.Create`. `SIGUSR1` and `SIGUSR2` are skipped on Windows. Registrations are stored in `_signalRegistrations` and disposed during `Dispose`.
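The registration pattern in the last bullet can be sketched with the two signals that `PosixSignal` names directly. This is a minimal sketch: `SignalHookSketch`, `Register`, and the callback parameters are hypothetical, and the real `HandleSignals` also covers `SIGQUIT`, `SIGUSR1`, and `SIGUSR2`, which are omitted here.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.InteropServices;

// Hypothetical holder for signal registrations, mirroring the
// register-then-dispose lifecycle described above.
public sealed class SignalHookSketch : IDisposable
{
    private readonly List<PosixSignalRegistration> _registrations = new();

    public int Count => _registrations.Count;

    public void Register(Action onShutdown, Action onReload)
    {
        _registrations.Add(PosixSignalRegistration.Create(PosixSignal.SIGTERM, ctx =>
        {
            // Cancel the default action (process termination) so the
            // callback can run a graceful shutdown instead.
            ctx.Cancel = true;
            onShutdown();
        }));

        _registrations.Add(PosixSignalRegistration.Create(PosixSignal.SIGHUP, ctx =>
        {
            ctx.Cancel = true;
            onReload();
        }));
    }

    // Disposing a registration removes its handler.
    public void Dispose()
    {
        foreach (var registration in _registrations)
            registration.Dispose();
        _registrations.Clear();
    }
}
```

Keeping the registrations in a list matters because a `PosixSignalRegistration` stays active only until it is disposed; storing them gives the owner a single place to deregister all handlers.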
## Related Documentation
@@ -220,4 +301,4 @@ The Go counterpart is `golang/nats-server/server/server.go`. Key differences in
- [Protocol Overview](../Protocol/Overview.md)
- [Configuration](../Configuration/Overview.md)
<!-- Last verified against codebase: 2026-02-22 -->
<!-- Last verified against codebase: 2026-02-23 -->