Joseph Doherty e553db6d40 docs: add Authentication, Clustering, JetStream, Monitoring overviews; update existing docs
New files:
- Documentation/Authentication/Overview.md — all 7 auth mechanisms with real source
  snippets (NKey/JWT/username-password/token/TLS mapping), nonce generation, account
  system, permissions, JWT permission templates
- Documentation/Clustering/Overview.md — route TCP handshake, in-process subscription
  propagation, gateway/leaf node stubs, honest gaps list
- Documentation/JetStream/Overview.md — API surface (4 handled subjects), streams,
  consumers, storage (MemStore/FileStore), in-process RAFT, mirror/source, gaps list
- Documentation/Monitoring/Overview.md — all 12 endpoints with real field tables,
  Go compatibility notes

Updated files:
- GettingStarted/Architecture.md — 14-subdirectory tree, real NatsClient/NatsServer
  field snippets, 9 new Go reference rows, Channel write queue design choice
- GettingStarted/Setup.md — xUnit 3, 100 test files grouped by area
- Operations/Overview.md — 99 test files, accurate Program.cs snippet, limitations
  section renamed to "Known Gaps vs Go Reference" with 7 real gaps
- Server/Overview.md — grouped fields, TLS/WS accept path, lame-duck mode, POSIX signals
- Configuration/Overview.md — 14 subsystem option tables, 24-row CLI table, LogOverrides
- Server/Client.md — Channel write queue, 4-task RunAsync, CommandMatrix, real fields

All docs verified against codebase 2026-02-23; 713 tests pass.
2026-02-23 10:14:18 -05:00


Monitoring Overview

The monitoring subsystem exposes an HTTP server that reports server state, connection details, subscription counts, and JetStream statistics. It is the .NET port of the monitoring endpoints in golang/nats-server/server/monitor.go.

Enabling Monitoring

Monitoring is disabled by default. Set MonitorPort to a non-zero value to enable it. The standard NATS monitoring port is 8222.

Configuration options

| NatsOptions field | CLI flag | Default | Description |
| --- | --- | --- | --- |
| MonitorPort | -m / --http_port | 0 (disabled) | HTTP port for the monitoring server |
| MonitorHost | (none) | "0.0.0.0" | Address the monitoring server binds to |
| MonitorBasePath | --http_base_path | "" | URL prefix prepended to all endpoint paths |
| MonitorHttpsPort | --https_port | 0 (disabled) | HTTPS port (reported in /varz; HTTPS listener not yet implemented) |

Starting with monitoring enabled on the standard port:

dotnet run --project src/NATS.Server.Host -- -m 8222

With a base path (all endpoints become /monitor/varz, /monitor/connz, etc.):

dotnet run --project src/NATS.Server.Host -- -m 8222 --http_base_path /monitor

MonitorServer startup

MonitorServer uses WebApplication.CreateSlimBuilder, the minimal ASP.NET Core host without MVC, Razor, or extra middleware. Logging providers are cleared so that monitoring HTTP request logs do not appear in the NATS server's Serilog output. The ILogger<MonitorServer> created from the injected ILoggerFactory is used only for the startup confirmation message.

public MonitorServer(NatsServer server, NatsOptions options, ServerStats stats, ILoggerFactory loggerFactory)
{
    _logger = loggerFactory.CreateLogger<MonitorServer>();

    var builder = WebApplication.CreateSlimBuilder();
    builder.WebHost.UseUrls($"http://{options.MonitorHost}:{options.MonitorPort}");
    builder.Logging.ClearProviders();

    _app = builder.Build();
    var basePath = options.MonitorBasePath ?? "";

    _varzHandler = new VarzHandler(server, options);
    _connzHandler = new ConnzHandler(server);
    _subszHandler = new SubszHandler(server);
    _jszHandler = new JszHandler(server, options);
    // ... endpoint registration follows
}

public async Task StartAsync(CancellationToken ct)
{
    await _app.StartAsync(ct);
    _logger.LogInformation("Monitoring listening on {Urls}", string.Join(", ", _app.Urls));
}

MonitorServer implements IAsyncDisposable. DisposeAsync stops the web application and disposes the VarzHandler (which holds a SemaphoreSlim).

Architecture

Endpoint-to-handler mapping

| Path | Handler | Status |
| --- | --- | --- |
| GET / | Inline lambda | Implemented |
| GET /healthz | Inline lambda | Implemented |
| GET /varz | VarzHandler.HandleVarzAsync | Implemented |
| GET /connz | ConnzHandler.HandleConnz | Implemented |
| GET /subz | SubszHandler.HandleSubsz | Implemented |
| GET /subscriptionsz | SubszHandler.HandleSubsz | Implemented (alias for /subz) |
| GET /jsz | JszHandler.Build | Implemented (summary only) |
| GET /routez | Inline lambda | Stub (returns {}) |
| GET /gatewayz | Inline lambda | Stub (returns {}) |
| GET /leafz | Inline lambda | Stub (returns {}) |
| GET /accountz | Inline lambda | Stub (returns {}) |
| GET /accstatz | Inline lambda | Stub (returns {}) |

All endpoints are registered with MonitorBasePath prepended when set.

Request counting

Every endpoint increments ServerStats.HttpReqStats — a ConcurrentDictionary<string, long> — using AddOrUpdate. The path string (e.g., "/varz") is the key. These counts are included in /varz responses as the http_req_stats field, allowing external tooling to track monitoring traffic over time.

// ServerStats.cs
public readonly ConcurrentDictionary<string, long> HttpReqStats = new();

// MonitorServer.cs — pattern used for every endpoint
stats.HttpReqStats.AddOrUpdate("/varz", 1, (_, v) => v + 1);
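
The counting pattern can be exercised in isolation. The following is a minimal sketch, not the server's code: RequestCounters and RequestCounterDemo are hypothetical names, and only the HttpReqStats field and the AddOrUpdate call are taken from the snippets above. It shows that concurrent increments on the same key do not lose updates.

```csharp
using System.Collections.Concurrent;
using System.Threading.Tasks;

// Hypothetical stand-in for ServerStats; only HttpReqStats mirrors the real type.
public sealed class RequestCounters
{
    public readonly ConcurrentDictionary<string, long> HttpReqStats = new();

    // Adds 1 to the counter for `path`, creating it with a value of 1 on first use.
    public long Increment(string path) =>
        HttpReqStats.AddOrUpdate(path, 1, (_, current) => current + 1);
}

public static class RequestCounterDemo
{
    // Issues `requests` concurrent increments against one path and returns the final count.
    public static long Hammer(string path, int requests)
    {
        var counters = new RequestCounters();
        Parallel.For(0, requests, _ => counters.Increment(path));
        return counters.HttpReqStats[path];
    }
}
```

AddOrUpdate may invoke the update delegate more than once under contention, but the stored value is only replaced atomically, so the final count equals the number of calls.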

Endpoints

GET /

Returns a JSON object listing endpoint paths. The list is static and does not reflect which endpoints are currently implemented: it includes the stub endpoints and, as the example below shows, omits /subscriptionsz and /accstatz.

{
  "endpoints": [
    "/varz", "/connz", "/healthz", "/routez",
    "/gatewayz", "/leafz", "/subz", "/accountz", "/jsz"
  ]
}

GET /healthz

Returns HTTP 200 with the plain text body "ok". This is a liveness probe: if the monitoring HTTP server responds, the process is alive. It does not check message delivery, subscription state, or JetStream health.

GET /varz

Returns a Varz JSON object containing server identity, configuration limits, runtime metrics, and traffic counters. The response is built by VarzHandler.HandleVarzAsync, which holds a SemaphoreSlim (_varzMu) to serialize concurrent requests.

CPU sampling

CPU usage is calculated by comparing Process.TotalProcessorTime samples. Results are cached for one second; requests within that window return the previous sample.

// VarzHandler.cs
if ((now - _lastCpuSampleTime).TotalSeconds >= 1.0)
{
    var currentCpu = proc.TotalProcessorTime;
    var elapsed = now - _lastCpuSampleTime;
    _cachedCpuPercent = (currentCpu - _lastCpuUsage).TotalMilliseconds
        / elapsed.TotalMilliseconds / Environment.ProcessorCount * 100.0;
    _lastCpuSampleTime = now;
    _lastCpuUsage = currentCpu;
}

Dividing by Environment.ProcessorCount normalizes the value to a 0 to 100 scale across all cores, where 100 means every core is fully busy. The result is then rounded to two decimal places.
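
As a worked example of the arithmetic, the formula can be written as a pure function. This is a sketch under assumptions: CpuMath and CpuPercent are hypothetical names, and the guard against a zero elapsed time is an addition that does not appear in the VarzHandler excerpt.

```csharp
using System;

public static class CpuMath
{
    // cpuDelta: processor time consumed between two samples (summed over all cores).
    // elapsed:  wall-clock time between the same two samples.
    public static double CpuPercent(TimeSpan cpuDelta, TimeSpan elapsed, int cores)
    {
        // Assumption: return 0 for a degenerate sample instead of dividing by zero.
        if (elapsed <= TimeSpan.Zero || cores <= 0)
            return 0.0;

        var percent = cpuDelta.TotalMilliseconds / elapsed.TotalMilliseconds / cores * 100.0;
        return Math.Round(percent, 2);
    }
}
```

For instance, 500 ms of processor time over a 1000 ms window on a 4-core machine gives 500 / 1000 / 4 * 100 = 12.5.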

TLS certificate expiry

When HasTls is true and TlsCert is set, the handler loads the certificate file with X509CertificateLoader.LoadCertificateFromFile and reads NotAfter. Load failures are silently swallowed; the field defaults to DateTime.MinValue in that case.
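
The load-and-swallow behaviour described above might look roughly like the sketch below. CertExpiry and ReadNotAfter are hypothetical names; only X509CertificateLoader.LoadCertificateFromFile (available from .NET 9), NotAfter, and the DateTime.MinValue fallback come from the text.

```csharp
using System;
using System.Security.Cryptography.X509Certificates;

public static class CertExpiry
{
    // Returns the certificate's NotAfter, or DateTime.MinValue if it cannot be loaded.
    public static DateTime ReadNotAfter(string certPath)
    {
        if (string.IsNullOrEmpty(certPath))
            return DateTime.MinValue;

        try
        {
            // Requires .NET 9 or later.
            using var cert = X509CertificateLoader.LoadCertificateFromFile(certPath);
            return cert.NotAfter;
        }
        catch (Exception)
        {
            // Mirrors the documented behaviour: load failures are swallowed.
            return DateTime.MinValue;
        }
    }
}
```

Consumers of tls_cert_not_after should therefore treat DateTime.MinValue as "unknown" rather than as an expired certificate.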

Field reference

Identity

| JSON key | C# property | Description |
| --- | --- | --- |
| server_id | Id | 20-char uppercase alphanumeric server ID |
| server_name | Name | Server name from options or generated default |
| version | Version | Server version string |
| proto | Proto | Protocol version integer |
| go | GoVersion | Reports "dotnet {RuntimeInformation.FrameworkDescription}" |
| host | Host | Bound client host |
| port | Port | Bound client port |
| git_commit | GitCommit | Always empty in this port |

Network

| JSON key | C# property | Description |
| --- | --- | --- |
| ip | Ip | Resolved IP (empty if not set) |
| connect_urls | ConnectUrls | Advertised client URLs |
| ws_connect_urls | WsConnectUrls | Advertised WebSocket URLs |
| http_host | HttpHost | Monitoring bind host |
| http_port | HttpPort | Monitoring HTTP port |
| http_base_path | HttpBasePath | Monitoring base path |
| https_port | HttpsPort | Monitoring HTTPS port |

Security

| JSON key | C# property | Description |
| --- | --- | --- |
| auth_required | AuthRequired | Whether auth is required |
| tls_required | TlsRequired | HasTls && !AllowNonTls |
| tls_verify | TlsVerify | Client certificate verification |
| tls_ocsp_peer_verify | TlsOcspPeerVerify | OCSP peer verification |
| auth_timeout | AuthTimeout | Auth timeout in seconds |
| tls_timeout | TlsTimeout | TLS handshake timeout in seconds |
| tls_cert_not_after | TlsCertNotAfter | TLS certificate expiry date |

Limits

| JSON key | C# property | Description |
| --- | --- | --- |
| max_connections | MaxConnections | Max simultaneous connections |
| max_subscriptions | MaxSubscriptions | Max subscriptions (0 = unlimited) |
| max_payload | MaxPayload | Max message payload in bytes |
| max_pending | MaxPending | Max pending bytes per client |
| max_control_line | MaxControlLine | Max control line length in bytes |
| ping_max | MaxPingsOut | Max outstanding pings before disconnect |

Timing

| JSON key | C# property | Type | Description |
| --- | --- | --- | --- |
| ping_interval | PingInterval | long (nanoseconds) | Ping send interval |
| write_deadline | WriteDeadline | long (nanoseconds) | Write deadline |
| start | Start | DateTime | Server start time |
| now | Now | DateTime | Time of this response |
| uptime | Uptime | string | Human-readable uptime (e.g., "2d4h30m10s") |
| config_load_time | ConfigLoadTime | DateTime | Currently set to server start time |
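
The two non-obvious encodings in this group, durations as nanosecond integers and the compact uptime string, can be illustrated with a small sketch. VarzTiming and its methods are hypothetical names, and the formatter is an assumption about how a string such as "2d4h30m10s" could be produced; the server's own formatter is not shown in this document and may handle more units.

```csharp
using System;

public static class VarzTiming
{
    // A TimeSpan tick is 100 ns, so multiplying ticks by 100 yields nanoseconds.
    public static long ToNanoseconds(TimeSpan value) => value.Ticks * 100;

    // Formats a duration as e.g. "2d4h30m10s", omitting leading units that are zero.
    public static string FormatUptime(TimeSpan uptime)
    {
        long totalSeconds = (long)uptime.TotalSeconds;
        long days = totalSeconds / 86_400;
        long hours = totalSeconds / 3_600 % 24;
        long minutes = totalSeconds / 60 % 60;
        long seconds = totalSeconds % 60;

        if (days > 0) return $"{days}d{hours}h{minutes}m{seconds}s";
        if (hours > 0) return $"{hours}h{minutes}m{seconds}s";
        if (minutes > 0) return $"{minutes}m{seconds}s";
        return $"{seconds}s";
    }
}
```

With this sketch, a two-minute ping interval serializes as 120000000000, and an uptime of 2 days, 4 hours, 30 minutes, and 10 seconds formats as "2d4h30m10s".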

Runtime

| JSON key | C# property | Description |
| --- | --- | --- |
| mem | Mem | Process working set in bytes |
| cpu | Cpu | CPU usage percentage (1-second cache) |
| cores | Cores | Environment.ProcessorCount |
| gomaxprocs | MaxProcs | ThreadPool.ThreadCount |

Traffic and connections

| JSON key | C# property | Description |
| --- | --- | --- |
| connections | Connections | Current open client count |
| total_connections | TotalConnections | Cumulative connections since start |
| routes | Routes | Current cluster route count |
| remotes | Remotes | Remote cluster count |
| leafnodes | Leafnodes | Leaf node count |
| in_msgs | InMsgs | Total messages received |
| out_msgs | OutMsgs | Total messages sent |
| in_bytes | InBytes | Total bytes received |
| out_bytes | OutBytes | Total bytes sent |
| slow_consumers | SlowConsumers | Slow consumer disconnect count |
| slow_consumer_stats | SlowConsumerStats | Breakdown by connection type |
| stale_connections | StaleConnections | Stale connection count |
| stale_connection_stats | StaleConnectionStatsDetail | Breakdown by connection type |
| subscriptions | Subscriptions | Current subscription count |

HTTP

| JSON key | C# property | Description |
| --- | --- | --- |
| http_req_stats | HttpReqStats | Per-path request counts since start |

Subsystems

| JSON key | C# property | Type |
| --- | --- | --- |
| cluster | Cluster | ClusterOptsVarz |
| gateway | Gateway | GatewayOptsVarz |
| leaf | Leaf | LeafNodeOptsVarz |
| mqtt | Mqtt | MqttOptsVarz |
| websocket | Websocket | WebsocketOptsVarz |
| jetstream | JetStream | JetStreamVarz |

The JetStreamVarz object contains a config object (JetStreamConfig) with max_memory, max_storage, and store_dir, and a stats object (JetStreamStats) with accounts, ha_assets, streams, consumers, and an api sub-object with total and errors.

GET /connz

Returns a Connz JSON object with a paged list of connection details. Handled by ConnzHandler.HandleConnz.

Query parameters

| Parameter | Values | Default | Description |
| --- | --- | --- | --- |
| sort | cid, start, subs, pending, msgs_to, msgs_from, bytes_to, bytes_from, last, idle, uptime, rtt, stop, reason | cid | Sort order; stop and reason are silently coerced to cid when state=open |
| subs | true, 1, detail | (omitted) | Include subscription list; detail adds per-subscription message counts and queue group names |
| state | open, closed, all | open | Which connections to include |
| offset | integer | 0 | Pagination offset |
| limit | integer | 1024 | Max connections per response |
| mqtt_client | string | (omitted) | Filter to a specific MQTT client ID |
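
To make the interaction between sort, state, offset, and limit concrete, here is a hedged sketch. ConnRow, ConnzPaging, and the sort directions chosen are assumptions for illustration only; ConnzHandler's actual implementation is not reproduced in this document.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical, trimmed-down row; the real connection info has many more fields.
public sealed record ConnRow(ulong Cid, DateTime Start, long PendingBytes);

public static class ConnzPaging
{
    // "stop" and "reason" only apply to closed connections, so fall back to "cid"
    // when the request asks for open connections.
    public static string EffectiveSort(string sort, string state) =>
        state == "open" && (sort == "stop" || sort == "reason") ? "cid" : sort;

    // Sorts first, then applies offset and limit so they index into the sorted order.
    public static List<ConnRow> Page(IEnumerable<ConnRow> rows, string sort, int offset, int limit)
    {
        IEnumerable<ConnRow> ordered = sort switch
        {
            "start" => rows.OrderBy(r => r.Start),
            "pending" => rows.OrderByDescending(r => r.PendingBytes), // direction is an assumption
            _ => rows.OrderBy(r => r.Cid),
        };
        return ordered.Skip(Math.Max(offset, 0)).Take(Math.Max(limit, 0)).ToList();
    }
}
```

A request such as /connz?sort=stop&state=open would, under this sketch, be served in cid order.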

Response shape

{
  "server_id": "NABCDEFGHIJ1234567890",
  "now": "2026-02-23T12:00:00Z",
  "num_connections": 2,
  "total": 2,
  "offset": 0,
  "limit": 1024,
  "connections": [
    {
      "cid": 1,
      "kind": "Client",
      "type": "Client",
      "ip": "127.0.0.1",
      "port": 52100,
      "start": "2026-02-23T11:55:00Z",
      "last_activity": "2026-02-23T11:59:50Z",
      "uptime": "5m0s",
      "idle": "10s",
      "pending_bytes": 0,
      "in_msgs": 100,
      "out_msgs": 50,
      "in_bytes": 4096,
      "out_bytes": 2048,
      "subscriptions": 3,
      "name": "my-client",
      "lang": "go",
      "version": "1.20.0",
      "rtt": "1.234ms"
    }
  ]
}

When subs=true, ConnInfo includes subscriptions_list: string[]. When subs=detail, it includes subscriptions_list_detail: SubDetail[] where each entry has subject, qgroup, sid, msgs, max, and cid.

Closed connection tracking

NatsServer maintains a bounded ring buffer of ClosedClient records (capacity set by NatsOptions.MaxClosedClients, default 10_000). When a client disconnects, a ClosedClient record is captured with the final counters, timestamps, and disconnect reason. These records are included when state=closed or state=all.

ClosedClient is a sealed record with init-only properties:

public sealed record ClosedClient
{
    public required ulong Cid { get; init; }
    public string Ip { get; init; } = "";
    public int Port { get; init; }
    public DateTime Start { get; init; }
    public DateTime Stop { get; init; }
    public string Reason { get; init; } = "";
    public long InMsgs { get; init; }
    public long OutMsgs { get; init; }
    // ... additional fields
}
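
A bounded ring buffer of the kind described above can be sketched as follows. BoundedRing<T> is a hypothetical type written for illustration; the container NatsServer actually uses, and its locking strategy, are not shown in this document.

```csharp
using System;
using System.Collections.Generic;

public sealed class BoundedRing<T>
{
    private readonly T[] _items;
    private readonly object _gate = new();
    private int _next;   // slot the next Add writes to
    private int _count;  // number of valid entries, capped at capacity

    public BoundedRing(int capacity)
    {
        if (capacity <= 0) throw new ArgumentOutOfRangeException(nameof(capacity));
        _items = new T[capacity];
    }

    // Stores an item, overwriting the oldest entry once the buffer is full.
    public void Add(T item)
    {
        lock (_gate)
        {
            _items[_next] = item;
            _next = (_next + 1) % _items.Length;
            if (_count < _items.Length) _count++;
        }
    }

    // Returns a copy of the contents ordered oldest to newest.
    public List<T> Snapshot()
    {
        lock (_gate)
        {
            var result = new List<T>(_count);
            int start = (_next - _count + _items.Length) % _items.Length;
            for (int i = 0; i < _count; i++)
                result.Add(_items[(start + i) % _items.Length]);
            return result;
        }
    }
}
```

With a capacity of 3, adding the values 1 through 5 leaves 3, 4, 5 in the buffer: memory stays fixed regardless of connection churn, at the cost of dropping the oldest disconnect records.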

GET /subz and GET /subscriptionsz

Both paths are handled by SubszHandler.HandleSubsz. Returns a Subsz JSON object with subscription counts and an optional subscription listing.

Query parameters

| Parameter | Values | Default | Description |
| --- | --- | --- | --- |
| subs | true, 1, detail | (omitted) | Include individual subscription records |
| offset | integer | 0 | Pagination offset into the subscription list |
| limit | integer | 1024 | Max subscriptions returned |
| acc | account name | (omitted) | Restrict results to a single account |
| test | subject literal | (omitted) | Filter to subscriptions that match this literal subject |
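
The test parameter relies on NATS subject matching, where "*" matches exactly one token and ">" matches one or more trailing tokens. The sketch below illustrates those rules; SubjectFilter.Matches is a hypothetical helper, not the routine SubszHandler uses, and it does not validate subjects (for example, empty tokens are not rejected).

```csharp
using System;

public static class SubjectFilter
{
    // True when a subscription subject (which may contain "*" and ">")
    // would receive a message published to the literal subject.
    public static bool Matches(string subscription, string literal)
    {
        var sub = subscription.Split('.');
        var lit = literal.Split('.');

        for (int i = 0; i < sub.Length; i++)
        {
            if (sub[i] == ">")
                return i == sub.Length - 1 && lit.Length > i; // ">" needs at least one more token
            if (i >= lit.Length)
                return false;                                  // literal ran out of tokens
            if (sub[i] != "*" && sub[i] != lit[i])
                return false;                                  // token mismatch
        }
        return sub.Length == lit.Length;                       // no unmatched trailing tokens
    }
}
```

Under these rules, /subz?subs=1&test=orders.new.eu would list a subscription on orders.> but not one on orders.*.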

$SYS account exclusion

When acc is not specified, the $SYS system account is excluded from results. Its subscriptions are internal infrastructure (server event routing) and are not user-visible. To inspect $SYS subscriptions explicitly, pass acc=$SYS.

// SubszHandler.cs
if (string.IsNullOrEmpty(opts.Account) && account.Name == "$SYS")
    continue;

Cache fields

num_cache in the response is the sum of SubList.CacheCount across all included accounts. This reflects the number of cached Match() results currently held in the subscription trie. It is informational — a high cache count is normal and expected after traffic warms up the cache.

Response shape

{
  "server_id": "NABCDEFGHIJ1234567890",
  "now": "2026-02-23T12:00:00Z",
  "num_subscriptions": 42,
  "num_cache": 18,
  "total": 42,
  "offset": 0,
  "limit": 1024,
  "subscriptions": []
}

When subs=true or subs=1, the subscriptions array is populated with SubDetail objects:

{
  "subject": "orders.>",
  "qgroup": "",
  "sid": "1",
  "msgs": 500,
  "max": 0,
  "cid": 3
}
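
For tooling that consumes this endpoint from .NET, the response shape above maps directly onto System.Text.Json attributes. The DTO and parser names below are hypothetical client-side types covering only the fields shown in this section; the numeric widths chosen are assumptions.

```csharp
using System;
using System.Text.Json;
using System.Text.Json.Serialization;

// Client-side DTOs; property names are illustrative, JSON keys come from the examples above.
public sealed class SubDetailDto
{
    [JsonPropertyName("subject")] public string Subject { get; set; } = "";
    [JsonPropertyName("qgroup")] public string QueueGroup { get; set; } = "";
    [JsonPropertyName("sid")] public string Sid { get; set; } = "";
    [JsonPropertyName("msgs")] public long Msgs { get; set; }
    [JsonPropertyName("max")] public long Max { get; set; }
    [JsonPropertyName("cid")] public ulong Cid { get; set; }
}

public sealed class SubszDto
{
    [JsonPropertyName("server_id")] public string ServerId { get; set; } = "";
    [JsonPropertyName("now")] public DateTime Now { get; set; }
    [JsonPropertyName("num_subscriptions")] public int NumSubscriptions { get; set; }
    [JsonPropertyName("num_cache")] public int NumCache { get; set; }
    [JsonPropertyName("total")] public int Total { get; set; }
    [JsonPropertyName("offset")] public int Offset { get; set; }
    [JsonPropertyName("limit")] public int Limit { get; set; }
    [JsonPropertyName("subscriptions")]
    public SubDetailDto[] Subscriptions { get; set; } = Array.Empty<SubDetailDto>();
}

public static class SubszParser
{
    // Deserializes a /subz response body; a JSON "null" document yields an empty DTO.
    public static SubszDto Parse(string json) =>
        JsonSerializer.Deserialize<SubszDto>(json) ?? new SubszDto();
}
```

Fields that are absent from the response keep their defaults, so the same DTO works whether or not subs was requested.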

GET /jsz

Returns a JszResponse JSON object built by JszHandler.Build. Reports whether JetStream is enabled and summarises stream and consumer counts.

{
  "server_id": "NABCDEFGHIJ1234567890",
  "now": "2026-02-23T12:00:00Z",
  "enabled": true,
  "memory": 0,
  "storage": 0,
  "streams": 5,
  "consumers": 12,
  "config": {
    "max_memory": 1073741824,
    "max_storage": 10737418240,
    "store_dir": "/var/nats/jetstream"
  }
}

memory and storage are always 0 in the current implementation — per-stream byte accounting is not yet wired up. streams and consumers reflect live counts from NatsServer.JetStreamStreams and NatsServer.JetStreamConsumers.

For full JetStream documentation, see Documentation/JetStream/Overview.md.

Stub endpoints

The following endpoints exist and respond with HTTP 200 and an empty JSON object ({}). They increment HttpReqStats but return no data. They are placeholders for future implementation once the corresponding subsystems are ported.

| Endpoint | Subsystem |
| --- | --- |
| /routez | Cluster routes |
| /gatewayz | Gateways |
| /leafz | Leaf nodes |
| /accountz | Account listing |
| /accstatz | Per-account statistics |

Go Compatibility

The JSON shapes are designed to match the Go server's monitoring responses so that existing NATS tooling (e.g., nats-top, Prometheus exporters, Grafana dashboards) works without modification.

Known differences from the Go server:

  • The go field in /varz reports the .NET runtime description (e.g., "dotnet .NET 10.0.0") rather than a Go version string. Tools that parse this field for display only are unaffected; tools that parse it to gate on runtime type will see a different value.
  • /varz config_load_time is currently set to server start time rather than the time the configuration file was last loaded.
  • /varz mem reports Process.WorkingSet64 (the OS working set). The Go server reports the process resident set size (RSS), sampled through its pse package. The values are comparable in meaning but not guaranteed to be identical, since the accounting is platform-specific.
  • /varz gomaxprocs is mapped to ThreadPool.ThreadCount. The Go field is the GOMAXPROCS setting, a limit on how many OS threads execute Go code simultaneously; the .NET value is the number of thread pool threads that currently exist, so it is only a rough analogue and can change between requests.
  • /jsz memory and storage are always 0. The Go server reports actual byte usage per stream.
  • /routez, /gatewayz, /leafz, /accountz, /accstatz return {} instead of structured data.