docs: add Authentication, Clustering, JetStream, Monitoring overviews; update existing docs

New files:
- Documentation/Authentication/Overview.md: all 7 auth mechanisms with real source snippets (NKey/JWT/username-password/token/TLS mapping), nonce generation, account system, permissions, JWT permission templates
- Documentation/Clustering/Overview.md: route TCP handshake, in-process subscription propagation, gateway/leaf node stubs, honest gaps list
- Documentation/JetStream/Overview.md: API surface (4 handled subjects), streams, consumers, storage (MemStore/FileStore), in-process RAFT, mirror/source, gaps list
- Documentation/Monitoring/Overview.md: all 12 endpoints with real field tables, Go compatibility notes

Updated files:
- GettingStarted/Architecture.md: 14-subdirectory tree, real NatsClient/NatsServer field snippets, 9 new Go reference rows, Channel write queue design choice
- GettingStarted/Setup.md: xUnit 3, 100 test files grouped by area
- Operations/Overview.md: 99 test files, accurate Program.cs snippet, limitations section renamed to "Known Gaps vs Go Reference" with 7 real gaps
- Server/Overview.md: grouped fields, TLS/WS accept path, lame-duck mode, POSIX signals
- Configuration/Overview.md: 14 subsystem option tables, 24-row CLI table, LogOverrides
- Server/Client.md: Channel write queue, 4-task RunAsync, CommandMatrix, real fields

All docs verified against codebase 2026-02-23; 713 tests pass.

@@ -31,20 +31,46 @@ Defining them separately makes unit testing straightforward: a test can supply a
 ```csharp
 public sealed class NatsServer : IMessageRouter, ISubListAccess, IDisposable
 {
-    private readonly NatsOptions _options;
+    // Client registry
     private readonly ConcurrentDictionary<ulong, NatsClient> _clients = new();
-    private readonly SubList _subList = new();
-    private readonly ServerInfo _serverInfo;
-    private readonly ILogger<NatsServer> _logger;
-    private readonly ILoggerFactory _loggerFactory;
-    private Socket? _listener;
+    private readonly ConcurrentQueue<ClosedClient> _closedClients = new();
     private ulong _nextClientId;
+    private int _activeClientCount;

-    public SubList SubList => _subList;
+    // Account system
+    private readonly ConcurrentDictionary<string, Account> _accounts = new(StringComparer.Ordinal);
+    private readonly Account _globalAccount;
+    private readonly Account _systemAccount;
+    private AuthService _authService;
+
+    // Subsystem managers (null when not configured)
+    private readonly RouteManager? _routeManager;
+    private readonly GatewayManager? _gatewayManager;
+    private readonly LeafNodeManager? _leafNodeManager;
+    private readonly JetStreamService? _jetStreamService;
+    private readonly JetStreamPublisher? _jetStreamPublisher;
+    private MonitorServer? _monitorServer;
+
+    // TLS / transport
+    private readonly SslServerAuthenticationOptions? _sslOptions;
+    private readonly TlsRateLimiter? _tlsRateLimiter;
+    private Socket? _listener;
+    private Socket? _wsListener;
+
+    // Shutdown coordination
+    private readonly CancellationTokenSource _quitCts = new();
+    private readonly TaskCompletionSource _shutdownComplete = new(TaskCreationOptions.RunContinuationsAsynchronously);
+    private readonly TaskCompletionSource _acceptLoopExited = new(TaskCreationOptions.RunContinuationsAsynchronously);
+    private int _shutdown;
+    private int _lameDuck;
+
+    public SubList SubList => _globalAccount.SubList;
 }
 ```

-`_clients` tracks every live connection. `_nextClientId` is incremented with `Interlocked.Increment` for each accepted socket, producing monotonically increasing client IDs without a lock. `_loggerFactory` is retained so per-client loggers can be created at accept time, each tagged with the client ID.
+`_clients` tracks every live connection. `_closedClients` holds a capped ring of recently disconnected client snapshots (used by `/connz`). `_nextClientId` is incremented with `Interlocked.Increment` for each accepted socket, producing monotonically increasing client IDs without a lock. `_loggerFactory` is retained so per-client loggers can be created at accept time, each tagged with the client ID.
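
The lock-free ID scheme described above can be sketched in isolation. This is a minimal illustration, not the server's code: `ClientIdAllocator` is a hypothetical wrapper introduced here, and only the use of `Interlocked.Increment` on a `ulong` counter comes from the text.

```csharp
using System.Threading;

// Hypothetical stand-in for the server's ID counter (illustration only).
public sealed class ClientIdAllocator
{
    private ulong _nextClientId; // starts at 0, so the first ID handed out is 1

    // Interlocked.Increment atomically adds one and returns the new value,
    // so concurrent accept paths never receive the same ID.
    public ulong Next() => Interlocked.Increment(ref _nextClientId);
}
```

Because the increment and the read happen as one atomic operation, no lock is needed even if several sockets are accepted at the same time.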
+
+Each subsystem manager field (`_routeManager`, `_gatewayManager`, `_leafNodeManager`, `_jetStreamService`, `_monitorServer`) is `null` when the corresponding options section is absent from the configuration. Code that interacts with these managers always guards with a null check.

 ### Constructor

@@ -70,6 +96,10 @@ public NatsServer(NatsOptions options, ILoggerFactory loggerFactory)

 The `ServerId` is derived from a GUID by taking the first 20 characters of its `"N"` format (32 hex digits, no hyphens) and uppercasing them. This matches the fixed-length alphanumeric server ID format used by the Go server.
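
As a rough sketch of that derivation (the class and method names here are hypothetical; only the GUID formatting steps come from the text):

```csharp
using System;

public static class ServerIdSketch
{
    // "N" formats the GUID as 32 hex digits without hyphens;
    // the first 20 characters are kept and uppercased.
    public static string NewServerId() =>
        Guid.NewGuid().ToString("N")[..20].ToUpperInvariant();
}
```

The result is always 20 characters drawn from `0-9` and `A-F`, so it has the same length as a Go server ID, although the character set is narrower than full alphanumeric.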

+Subsystem managers are instantiated in the constructor if the corresponding options sections are non-null: `options.Cluster != null` creates a `RouteManager`, `options.Gateway != null` creates a `GatewayManager`, `options.LeafNode != null` creates a `LeafNodeManager`, and `options.JetStream != null` creates `JetStreamService`, `JetStreamApiRouter`, `StreamManager`, `ConsumerManager`, and `JetStreamPublisher`. TLS options are compiled into `SslServerAuthenticationOptions` via `TlsHelper.BuildServerAuthOptions` when `options.HasTls` is true.
+
+Before entering the accept loop, `StartAsync` starts the monitoring server, WebSocket listener, route connections, gateway connections, leaf node listener, and JetStream service.
+
 ## Accept Loop

 `StartAsync` binds the socket, enables `SO_REUSEADDR` so the port can be reused immediately after a restart, and enters an async accept loop:
@@ -103,6 +133,37 @@ public async Task StartAsync(CancellationToken ct)

 The backlog of 128 passed to `Listen` controls the OS-level queue of unaccepted connections, matching the Go server default.

+### TLS wrapping and WebSocket upgrade
+
+After `AcceptAsync` returns a socket, the connection is handed to `AcceptClientAsync`, which performs transport negotiation before constructing `NatsClient`:
+
+```csharp
+private async Task AcceptClientAsync(Socket socket, ulong clientId, CancellationToken ct)
+{
+    if (_tlsRateLimiter != null)
+        await _tlsRateLimiter.WaitAsync(ct);
+
+    var networkStream = new NetworkStream(socket, ownsSocket: false);
+
+    // TlsConnectionWrapper performs the TLS handshake if _sslOptions is set;
+    // returns the raw NetworkStream unchanged when TLS is not configured.
+    var (stream, infoAlreadySent) = await TlsConnectionWrapper.NegotiateAsync(
+        socket, networkStream, _options, _sslOptions, _serverInfo,
+        _loggerFactory.CreateLogger("NATS.Server.Tls"), ct);
+
+    // ...auth nonce generation, TLS state extraction...
+
+    var client = new NatsClient(clientId, stream, socket, _options, clientInfo,
+        _authService, nonce, clientLogger, _stats);
+    client.Router = this;
+    client.TlsState = tlsState;
+    client.InfoAlreadySent = infoAlreadySent;
+    _clients[clientId] = client;
+}
+```
+
+WebSocket connections follow a parallel path through `AcceptWebSocketClientAsync`. After optional TLS negotiation via `TlsConnectionWrapper`, the HTTP upgrade handshake is performed by `WsUpgrade.TryUpgradeAsync`. On success, the raw stream is wrapped in a `WsConnection` that handles WebSocket framing, masking, and per-message compression before `NatsClient` is constructed.
+
 ## Message Routing

 `ProcessMessage` is called by `NatsClient` for every PUB or HPUB command. It is the hot path: called once per published message.
@@ -175,9 +236,11 @@ private static void DeliverMessage(Subscription sub, string subject, string? rep
 }
 ```

-`MessageCount` is incremented atomically before the send. If it exceeds `MaxMessages` (set by an UNSUB with a message count argument), the message is silently dropped. The subscription itself is not removed here; removal happens when the client processes the count limit through `ProcessUnsub`, or when the client disconnects and `RemoveAllSubscriptions` is called.
+`MessageCount` is incremented atomically before the send. If it exceeds `MaxMessages` (set by an UNSUB with a message count argument), the subscription is removed from the trie immediately (`subList.Remove(sub)`) and from the client's tracking table (`client.RemoveSubscription(sub.Sid)`), then the message is dropped without delivery.
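
The count-then-check step can be sketched as follows. This is a simplified, hypothetical model of a subscription: the type name, the `TryCountDelivery` helper, and the convention that `MaxMessages == 0` means "no limit" are assumptions made for illustration and are not taken from the codebase.

```csharp
using System.Threading;

// Hypothetical, simplified subscription (the real type carries more state).
public sealed class SubscriptionSketch
{
    public long MaxMessages;      // assumption: 0 means no auto-unsubscribe limit
    private long _messageCount;

    // Atomically counts one delivery attempt. Returns true if the message
    // should be delivered, false once the UNSUB limit has been exceeded;
    // in that case the caller would remove the subscription and drop the message.
    public bool TryCountDelivery()
    {
        long count = Interlocked.Increment(ref _messageCount);
        return MaxMessages <= 0 || count <= MaxMessages;
    }
}
```

With `MaxMessages = 2`, the first two calls return `true` and every later call returns `false`, which is the point at which the text says the subscription is removed from the trie and the client's tracking table.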

-`SendMessageAsync` is again fire-and-forget. Multiple deliveries to different clients happen concurrently.
+`SendMessage` enqueues the serialized wire bytes on the client's outbound channel. Multiple deliveries to different clients happen concurrently.
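
A minimal sketch of such an outbound queue, using `System.Threading.Channels`. The class name, the unbounded capacity, and the `Complete` and `RunWriteLoopAsync` members are assumptions for illustration; the real client may bound the channel or batch writes differently.

```csharp
using System;
using System.IO;
using System.Threading.Channels;
using System.Threading.Tasks;

// Hypothetical per-client write queue (illustration only).
public sealed class OutboundQueueSketch
{
    private readonly Channel<ReadOnlyMemory<byte>> _outbound =
        Channel.CreateUnbounded<ReadOnlyMemory<byte>>(
            new UnboundedChannelOptions { SingleReader = true });

    // Called from the routing path; enqueues without touching the socket.
    public bool SendMessage(ReadOnlyMemory<byte> wireBytes) =>
        _outbound.Writer.TryWrite(wireBytes);

    // Signals that no more messages will be queued.
    public void Complete() => _outbound.Writer.Complete();

    // A single writer task drains the queue onto the stream in FIFO order
    // and returns once the channel is completed and empty.
    public async Task RunWriteLoopAsync(Stream stream)
    {
        await foreach (var buffer in _outbound.Reader.ReadAllAsync())
            await stream.WriteAsync(buffer);
        await stream.FlushAsync();
    }
}
```

Separating enqueue from the socket write keeps the publisher's hot path from blocking on a slow consumer, and a single reader preserves per-client message order.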
+
+After local delivery, `ProcessMessage` forwards to the JetStream publisher first: if the subject matches a configured stream, `TryCaptureJetStreamPublish` stores the message and the `PubAck` is sent back to the publisher via `sender.RecordJetStreamPubAck`. Route forwarding is handled separately by `OnLocalSubscription`, which calls `_routeManager?.PropagateLocalSubscription` when a new subscription is added, keeping remote peers informed of local interest without re-routing individual messages inside `ProcessMessage`.

 ## Client Removal

@@ -193,17 +256,34 @@ public void RemoveClient(NatsClient client)

 ## Shutdown and Dispose

+Graceful shutdown is initiated by `ShutdownAsync`. It uses `_quitCts` (a `CancellationTokenSource` shared between `StartAsync` and all subsystem managers) to signal all internal loops to stop:
+
 ```csharp
-public void Dispose()
+public async Task ShutdownAsync()
 {
-    _listener?.Dispose();
-    foreach (var client in _clients.Values)
-        client.Dispose();
-    _subList.Dispose();
+    if (Interlocked.CompareExchange(ref _shutdown, 1, 0) != 0)
+        return; // Already shutting down
+
+    // Signal all internal loops to stop
+    await _quitCts.CancelAsync();
+
+    // Close listeners to stop accept loops
+    _listener?.Close();
+    _wsListener?.Close();
+    if (_routeManager != null) await _routeManager.DisposeAsync();
+    if (_gatewayManager != null) await _gatewayManager.DisposeAsync();
+    if (_leafNodeManager != null) await _leafNodeManager.DisposeAsync();
+    if (_jetStreamService != null) await _jetStreamService.DisposeAsync();
+
+    // Wait for accept loops to exit, flush and close clients, drain active tasks...
+    if (_monitorServer != null) await _monitorServer.DisposeAsync();
+    _shutdownComplete.TrySetResult();
 }
 ```

-Disposing the listener socket causes `AcceptAsync` to throw, which unwinds `StartAsync`. Client sockets are disposed, which closes their `NetworkStream` and causes their read loops to terminate. `SubList.Dispose` releases its `ReaderWriterLockSlim`.
+Lame-duck mode is a two-phase variant initiated by `LameDuckShutdownAsync`. The `_lameDuck` field (checked via `IsLameDuckMode`) is set first, which stops the accept loops from receiving new connections while existing clients are given a grace period (`options.LameDuckGracePeriod`) to disconnect naturally. After the grace period, remaining clients are stagger-closed over `options.LameDuckDuration` to avoid a thundering herd of reconnects, then `ShutdownAsync` completes the teardown.
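
One way to picture the stagger-close phase is the sketch below. It assumes the closes are spaced evenly across the duration; the actual spacing policy (for example, whether jitter is applied) is not stated in the text, and `LameDuckSketch` with its members is hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public static class LameDuckSketch
{
    // Assumption: spread `clientCount` closes evenly across `duration`.
    public static TimeSpan StaggerInterval(TimeSpan duration, int clientCount) =>
        clientCount <= 1 ? TimeSpan.Zero : duration / clientCount;

    // Closes each client in turn, pausing between closes so that
    // reconnect attempts do not all arrive at the same moment.
    public static async Task CloseStaggeredAsync(
        IReadOnlyList<IDisposable> clients, TimeSpan duration, CancellationToken ct)
    {
        var interval = StaggerInterval(duration, clients.Count);
        foreach (var client in clients)
        {
            client.Dispose();
            if (interval > TimeSpan.Zero)
                await Task.Delay(interval, ct);
        }
    }
}
```

Under this assumption, closing 5 remaining clients over a 10-second duration would space the closes 2 seconds apart.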
+
+`Dispose` is a synchronous fallback. If `ShutdownAsync` has not already run, it blocks on it. It then disposes `_quitCts`, `_tlsRateLimiter`, the listener sockets, all subsystem managers (route, gateway, leaf node, JetStream), all connected clients, and all accounts. PosixSignalRegistrations are also disposed, deregistering the signal handlers.

 ## Go Reference

@@ -212,6 +292,7 @@ The Go counterpart is `golang/nats-server/server/server.go`. Key differences in
 - Go uses goroutines for the accept loop and per-client read/write loops; the .NET port uses `async`/`await` with `Task`.
 - Go uses `sync/atomic` for client ID generation; the .NET port uses `Interlocked.Increment`.
 - Go passes the server to clients via the `srv` field on the client struct; the .NET port uses the `IMessageRouter` interface through the `Router` property.
+- POSIX signal handlers (`SIGTERM`/`SIGQUIT` for shutdown, `SIGHUP` for config reload, `SIGUSR1` for log file reopen, `SIGUSR2` for lame-duck mode) are registered in `HandleSignals` via `PosixSignalRegistration.Create`. `SIGUSR1` and `SIGUSR2` are skipped on Windows. Registrations are stored in `_signalRegistrations` and disposed during `Dispose`.
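
The registration pattern in the last bullet might look roughly like this. It is a sketch under assumptions: `SignalHandlerSketch` and its callbacks are hypothetical, only `SIGTERM` and `SIGHUP` are shown, and the `SIGQUIT`, `SIGUSR1`, and `SIGUSR2` handling mentioned above is left out.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.InteropServices;

// Hypothetical holder for signal registrations (illustration only).
public sealed class SignalHandlerSketch : IDisposable
{
    private readonly List<PosixSignalRegistration> _registrations = new();

    public void Register(Action onShutdown, Action onReload)
    {
        _registrations.Add(PosixSignalRegistration.Create(PosixSignal.SIGTERM, ctx =>
        {
            ctx.Cancel = true; // suppress default termination so shutdown logic can run
            onShutdown();
        }));
        _registrations.Add(PosixSignalRegistration.Create(PosixSignal.SIGHUP, ctx =>
        {
            ctx.Cancel = true;
            onReload();
        }));
    }

    // Disposing a registration removes its handler.
    public void Dispose()
    {
        foreach (var registration in _registrations)
            registration.Dispose();
        _registrations.Clear();
    }
}
```

Keeping the registrations in a list mirrors the `_signalRegistrations` field described above: the handlers stay active for as long as the registrations are alive and are removed when they are disposed.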

 ## Related Documentation

@@ -220,4 +301,4 @@ The Go counterpart is `golang/nats-server/server/server.go`. Key differences in
 - [Protocol Overview](../Protocol/Overview.md)
 - [Configuration](../Configuration/Overview.md)

-<!-- Last verified against codebase: 2026-02-22 -->
+<!-- Last verified against codebase: 2026-02-23 -->