Covers /varz, /connz endpoints via Kestrel Minimal APIs, full TLS support with four modes (none/required/first/mixed), cert pinning, rate limiting, and testing strategy.
Monitoring HTTP & TLS Support Design
Date: 2026-02-22
Scope: Port monitoring endpoints (/varz, /connz) and full TLS support from Go NATS server
Go Reference: golang/nats-server/server/monitor.go, server.go (TLS), client.go (TLS), opts.go
Overview
Two features ported from Go NATS:
- Monitoring HTTP: Kestrel Minimal API embedded in NatsServer, serving /varz, /connz, /healthz and stub endpoints. Exact Go JSON schema for tooling compatibility.
- TLS Support: SslStream wrapping with four modes (no TLS, TLS required, TLS-first, and mixed TLS/plaintext), plus certificate pinning, client cert verification, and rate limiting.
1. Server-Level Stats Aggregation
New ServerStats class with atomic counters, replacing the need to sum across all clients on each /varz request.
ServerStats Fields
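// src/NATS.Server/ServerStats.cs
using System.Collections.Concurrent;

// Counters are public fields (not properties) so they can be passed by ref to Interlocked.*.
public sealed class ServerStats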
{
public long InMsgs;
public long OutMsgs;
public long InBytes;
public long OutBytes;
public long TotalConnections;
public long SlowConsumers;
public long StaleConnections;
public long Stalls;
public long SlowConsumerClients;
public long SlowConsumerRoutes;
public long SlowConsumerLeafs;
public long SlowConsumerGateways;
public readonly ConcurrentDictionary<string, long> HttpReqStats = new();
}
Integration Points
- NatsServer owns a ServerStats instance, passes it to each NatsClient
- NatsClient.ProcessPub increments server-level InMsgs/InBytes alongside client-level counters
- NatsClient.SendMessageAsync increments server-level OutMsgs/OutBytes
- Accept loop increments TotalConnections
- NatsServer.StartTime field added (set once at startup)
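The integration points above imply lock-free updates on the hot path. The following is a minimal sketch, assuming the counters are updated through `Interlocked` (the design only says "atomic counters"). `StatsDemo`, `RecordInbound`, and `Snapshot` are hypothetical names, and the `ServerStats` shown here is a trimmed copy so the sketch compiles on its own.

```csharp
using System.Threading;

// Trimmed copy of ServerStats so the sketch is self-contained.
public sealed class ServerStats
{
    public long InMsgs;
    public long InBytes;
    public long TotalConnections;
}

public static class StatsDemo // hypothetical helper, not part of the design
{
    // Roughly what NatsClient.ProcessPub might do for each inbound message.
    public static void RecordInbound(ServerStats stats, int payloadBytes)
    {
        Interlocked.Increment(ref stats.InMsgs);
        Interlocked.Add(ref stats.InBytes, payloadBytes);
    }

    // Roughly what the /varz handler might do when reading the counters.
    // Interlocked.Read avoids torn 64-bit reads on 32-bit hosts.
    public static (long Msgs, long Bytes) Snapshot(ServerStats stats) =>
        (Interlocked.Read(ref stats.InMsgs), Interlocked.Read(ref stats.InBytes));
}
```

Because each counter is updated independently, a snapshot is not a consistent cut across fields; that is usually acceptable for monitoring output.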
2. Monitoring HTTP Endpoints
HTTP Stack
Kestrel Minimal APIs via FrameworkReference to Microsoft.AspNetCore.App. No NuGet packages needed.
Endpoints
| Path | Handler | Description |
|---|---|---|
| / | HandleRoot | Links to all endpoints |
| /varz | HandleVarz | Server stats and config |
| /connz | HandleConnz | Connection info (paginated) |
| /healthz | HandleHealthz | Health check (200 OK) |
| /routez | stub | Returns {} |
| /gatewayz | stub | Returns {} |
| /leafz | stub | Returns {} |
| /subz | stub | Returns {} |
| /accountz | stub | Returns {} |
| /jsz | stub | Returns {} |
All paths support optional base path prefix via MonitorBasePath config.
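One way the base path prefix could be applied is to normalize it once at startup and prepend it when registering each route. The sketch below is an assumption about that behavior, not the design's implementation: `MonitorPaths`, `NormalizeBasePath`, and `Route` are hypothetical names, and the normalization rules (trim surrounding slashes, empty means no prefix) are illustrative.

```csharp
public static class MonitorPaths // hypothetical helper
{
    // Turns "monitor", "/monitor/", or null into "/monitor" or "" (no prefix).
    public static string NormalizeBasePath(string? basePath)
    {
        if (string.IsNullOrWhiteSpace(basePath)) return "";
        var trimmed = basePath.Trim().Trim('/');
        return trimmed.Length == 0 ? "" : "/" + trimmed;
    }

    // Prefixes an endpoint path such as "/varz" with the normalized base path.
    public static string Route(string? basePath, string endpoint) =>
        NormalizeBasePath(basePath) + endpoint;
}
```

With `MonitorBasePath = "monitor/"`, `Route` maps `/varz` to `/monitor/varz`; with no base path configured the endpoint is returned unchanged.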
Configuration
// Added to NatsOptions
public int MonitorPort { get; set; } // 0 = disabled, CLI: -m
public string MonitorHost { get; set; } = "0.0.0.0";
public string? MonitorBasePath { get; set; }
public int MonitorHttpsPort { get; set; } // 0 = disabled
Varz Model
Exact Go JSON field names. All fields from Go's Varz struct including nested config structs (ClusterOptsVarz, GatewayOptsVarz, LeafNodeOptsVarz, MqttOptsVarz, WebsocketOptsVarz, JetStreamVarz). Nested structs return defaults/zeros until those subsystems are ported.
Key field categories: identification, network config, security/limits, timing/lifecycle, runtime metrics (mem, CPU, cores), connection stats, message stats, health counters, subsystem configs, HTTP request stats.
Connz Model
Paginated connection list with query parameter support:
- sort: sort field (cid, bytes_to, msgs_to, etc.)
- subs / subs=detail: include subscription lists
- offset / limit: pagination (default limit 1024)
- state: filter open/closed/all
- auth: include usernames
ConnInfo includes all Go fields: cid, kind, ip, port, start, last_activity, rtt, uptime, idle, pending, msg/byte stats, subscription count, client name/lang/version, TLS version/cipher, account.
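The sort and pagination parameters described above can be applied to a client snapshot with plain LINQ. This is a sketch under stated assumptions: `ConnRow` is a hypothetical stand-in for ConnInfo with only three fields, `ConnzPaging.Apply` is a hypothetical helper, and the descending order for `bytes_to` and `msgs_to` is an assumption rather than something the design states.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Minimal stand-in for ConnInfo with only the fields this sketch sorts on.
public sealed record ConnRow(ulong Cid, long OutBytes, long OutMsgs);

public static class ConnzPaging // hypothetical helper
{
    public static (IReadOnlyList<ConnRow> Page, int Total) Apply(
        IReadOnlyCollection<ConnRow> snapshot, string? sort, int offset, int limit)
    {
        // Fall back to the documented default limit of 1024.
        if (limit <= 0) limit = 1024;
        if (offset < 0) offset = 0;

        // Assumed ordering: cid ascending by default, traffic counters descending.
        IEnumerable<ConnRow> ordered = sort switch
        {
            "bytes_to" => snapshot.OrderByDescending(c => c.OutBytes),
            "msgs_to" => snapshot.OrderByDescending(c => c.OutMsgs),
            _ => snapshot.OrderBy(c => c.Cid),
        };

        // Total counts every connection in the snapshot, not just the page.
        return (ordered.Skip(offset).Take(limit).ToList(), snapshot.Count);
    }
}
```

Sorting happens before paging, so `offset` and `limit` always select a window of the ordered list.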
Concurrency
- HandleVarz acquires a SemaphoreSlim(1,1) to serialize JSON building (matches Go's varzMu)
- HandleConnz snapshots _clients.Values.ToArray() to avoid holding the dictionary during serialization
- CPU percentage sampled via Process.TotalProcessorTime delta, cached for 1 second
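The CPU sampling bullet above could look roughly like the following. This is a sketch, not the design's code: `CpuSampler` is a hypothetical name, the value is not normalized by core count (so it can exceed 100 on multi-core hosts), and the first call within one second of construction simply returns 0.

```csharp
using System;
using System.Diagnostics;

public sealed class CpuSampler // hypothetical name
{
    private readonly object _gate = new();
    private readonly Process _process = Process.GetCurrentProcess();
    private TimeSpan _lastCpu;
    private long _lastTimestamp; // Stopwatch ticks at the previous sample
    private double _cached;

    public CpuSampler()
    {
        _lastCpu = _process.TotalProcessorTime;
        _lastTimestamp = Stopwatch.GetTimestamp();
    }

    // Returns CPU percent over the last sampling window.
    // The value is recomputed at most once per second; otherwise the cache is returned.
    public double GetCpuPercent()
    {
        lock (_gate)
        {
            long now = Stopwatch.GetTimestamp();
            double elapsedSec = (now - _lastTimestamp) / (double)Stopwatch.Frequency;
            if (elapsedSec < 1.0) return _cached;

            _process.Refresh(); // drop any cached process info before re-reading
            TimeSpan cpu = _process.TotalProcessorTime;
            _cached = (cpu - _lastCpu).TotalSeconds / elapsedSec * 100.0;
            _lastCpu = cpu;
            _lastTimestamp = now;
            return _cached;
        }
    }
}
```

Keeping a single sampler on the server means rapid /varz polls share one cached value instead of each taking a new measurement.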
NatsClient Additions for ConnInfo
public DateTime StartTime { get; } // set in constructor
public DateTime LastActivity; // updated on every command dispatch
public string? RemoteIp { get; } // from socket.RemoteEndPoint
public int RemotePort { get; } // from socket.RemoteEndPoint
3. TLS Support
Configuration
// Added to NatsOptions
public string? TlsCert { get; set; }
public string? TlsKey { get; set; }
public string? TlsCaCert { get; set; }
public bool TlsVerify { get; set; }
public bool TlsMap { get; set; }
public double TlsTimeout { get; set; } = 2.0;
public bool TlsHandshakeFirst { get; set; }
public TimeSpan TlsHandshakeFirstFallback { get; set; } = TimeSpan.FromMilliseconds(50);
public bool AllowNonTls { get; set; }
public long TlsRateLimit { get; set; }
public HashSet<string>? TlsPinnedCerts { get; set; }
public SslProtocols TlsMinVersion { get; set; } = SslProtocols.Tls12;
CLI args: --tls, --tlscert, --tlskey, --tlscacert, --tlsverify
INFO Message Changes
Three new fields on ServerInfo: tls_required, tls_verify, tls_available.
- tls_required = (TlsConfig != null && !AllowNonTls)
- tls_verify = (TlsConfig != null && TlsVerify)
- tls_available = (TlsConfig != null && AllowNonTls)
Four TLS Modes
Mode 1: No TLS — current behavior, unchanged.
Mode 2: TLS Required — send INFO with tls_required=true, client initiates TLS, server detects 0x16 byte, performs SslStream handshake, validates pinned certs, continues protocol over encrypted stream.
Mode 3: TLS First — do NOT send INFO, wait up to 50ms for data. If 0x16 byte arrives: TLS handshake then send INFO over encrypted stream. If timeout or non-TLS byte: fallback to Mode 2 flow.
Mode 4: Mixed — send INFO with tls_available=true, peek first byte. 0x16 → TLS handshake. Other → continue plaintext.
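The four modes hinge on two decisions: which mode the options select, and whether the first byte from the client starts a TLS handshake. The sketch below illustrates both. `TlsMode`, `TlsDetect`, `SelectMode`, and `Classify` are hypothetical names, and the precedence of TlsHandshakeFirst over AllowNonTls in `SelectMode` is an assumption; the design does not say what happens when both are set.

```csharp
public enum TlsMode { None, Required, First, Mixed }

public enum FirstByteKind { TlsHandshake, Plaintext }

public static class TlsDetect // hypothetical helper
{
    // 0x16 is the TLS record content type for "handshake" (the start of a ClientHello).
    public const byte TlsHandshakeRecord = 0x16;

    public static FirstByteKind Classify(byte first) =>
        first == TlsHandshakeRecord ? FirstByteKind.TlsHandshake : FirstByteKind.Plaintext;

    // Assumed precedence: no TLS config wins, then TLS-first, then mixed, then required.
    public static TlsMode SelectMode(bool tlsConfigured, bool handshakeFirst, bool allowNonTls)
    {
        if (!tlsConfigured) return TlsMode.None;
        if (handshakeFirst) return TlsMode.First;
        return allowNonTls ? TlsMode.Mixed : TlsMode.Required;
    }
}
```

NATS protocol commands such as CONNECT and PING begin with printable ASCII, so a single byte is enough to tell them apart from a TLS record.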
Key Components
TlsHelper — static class for cert loading (X509Certificate2 from PEM/PFX), CA cert loading, building SslServerAuthenticationOptions, pinned cert validation (SHA256 of SubjectPublicKeyInfo).
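For the pinned cert check, a minimal sketch of the SHA256-of-SubjectPublicKeyInfo fingerprint follows. `PinnedCerts`, `Fingerprint`, and `IsPinned` are hypothetical names, and the lowercase hex encoding is an assumption about how TlsPinnedCerts entries are stored. `PublicKey.ExportSubjectPublicKeyInfo()` requires .NET 6 or later.

```csharp
using System;
using System.Collections.Generic;
using System.Security.Cryptography;
using System.Security.Cryptography.X509Certificates;

public static class PinnedCerts // hypothetical helper
{
    // SHA-256 over the DER-encoded SubjectPublicKeyInfo, rendered as lowercase hex.
    public static string Fingerprint(X509Certificate2 cert)
    {
        byte[] spki = cert.PublicKey.ExportSubjectPublicKeyInfo();
        return Convert.ToHexString(SHA256.HashData(spki)).ToLowerInvariant();
    }

    // True when the certificate's fingerprint appears in the configured pin set.
    public static bool IsPinned(X509Certificate2 cert, ISet<string> pinned) =>
        pinned.Contains(Fingerprint(cert));
}
```

Since a default `HashSet<string>` compares case-sensitively, configured pins would need to be lowercased on load (or the set built with `StringComparer.OrdinalIgnoreCase`) for this comparison to be reliable.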
TlsConnectionWrapper — per-connection negotiation state machine. Takes socket + options, returns (Stream stream, bool infoAlreadySent). Handles peek logic, timeout, handshake, cert validation.
PeekableStream — wraps NetworkStream, buffers peeked bytes, replays them on first ReadAsync. Required so SslStream.AuthenticateAsServerAsync sees the full TLS ClientHello including the peeked byte.
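A compact sketch of the peek-and-replay idea is shown below. It is illustrative only: it supports a single `PeekAsync` call before reading starts, and the member names other than the `Stream` overrides are assumptions.

```csharp
using System;
using System.IO;
using System.Threading;
using System.Threading.Tasks;

public sealed class PeekableStream : Stream
{
    private readonly Stream _inner;
    private byte[] _peeked = Array.Empty<byte>();
    private int _peekedOffset;

    public PeekableStream(Stream inner) => _inner = inner;

    // Reads up to count bytes from the inner stream and keeps them for replay.
    // This sketch expects at most one call, made before any Read.
    public async Task<byte[]> PeekAsync(int count, CancellationToken ct = default)
    {
        var buffer = new byte[count];
        int n = await _inner.ReadAsync(buffer.AsMemory(0, count), ct);
        _peeked = buffer[..n];
        _peekedOffset = 0;
        return _peeked;
    }

    public override int Read(byte[] buffer, int offset, int count)
    {
        int remaining = _peeked.Length - _peekedOffset;
        if (remaining == 0) return _inner.Read(buffer, offset, count);
        int n = Math.Min(remaining, count); // replay buffered bytes first
        Array.Copy(_peeked, _peekedOffset, buffer, offset, n);
        _peekedOffset += n;
        return n;
    }

    public override async ValueTask<int> ReadAsync(Memory<byte> buffer, CancellationToken ct = default)
    {
        int remaining = _peeked.Length - _peekedOffset;
        if (remaining == 0) return await _inner.ReadAsync(buffer, ct);
        int n = Math.Min(remaining, buffer.Length); // replay buffered bytes first
        _peeked.AsMemory(_peekedOffset, n).CopyTo(buffer);
        _peekedOffset += n;
        return n;
    }

    public override Task<int> ReadAsync(byte[] buffer, int offset, int count, CancellationToken ct) =>
        ReadAsync(buffer.AsMemory(offset, count), ct).AsTask();

    // Writes and flushes pass straight through to the inner stream.
    public override void Write(byte[] buffer, int offset, int count) => _inner.Write(buffer, offset, count);
    public override ValueTask WriteAsync(ReadOnlyMemory<byte> buffer, CancellationToken ct = default) =>
        _inner.WriteAsync(buffer, ct);
    public override void Flush() => _inner.Flush();
    public override Task FlushAsync(CancellationToken ct) => _inner.FlushAsync(ct);

    public override bool CanRead => true;
    public override bool CanSeek => false;
    public override bool CanWrite => _inner.CanWrite;
    public override long Length => throw new NotSupportedException();
    public override long Position
    {
        get => throw new NotSupportedException();
        set => throw new NotSupportedException();
    }
    public override long Seek(long offset, SeekOrigin origin) => throw new NotSupportedException();
    public override void SetLength(long value) => throw new NotSupportedException();

    protected override void Dispose(bool disposing)
    {
        if (disposing) _inner.Dispose();
        base.Dispose(disposing);
    }
}
```

A read may return fewer bytes than requested when it drains the peek buffer; callers such as SslStream already loop on short reads, so this is safe.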
TlsRateLimiter — token-bucket rate limiter. Refills TlsRateLimit tokens per second. WaitAsync blocks if no tokens. Only applies to TLS handshakes, not plain connections.
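One possible shape for the token bucket, assuming a `SemaphoreSlim` holds the tokens and a one-second `Timer` tops the bucket back up to capacity. The constructor signature and the `int` capacity are assumptions (TlsRateLimit is a `long` in NatsOptions and would need a checked conversion); the design only specifies `WaitAsync` and a per-second refill.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public sealed class TlsRateLimiter : IDisposable
{
    private readonly SemaphoreSlim _tokens;
    private readonly Timer _refill;
    private readonly int _perSecond;

    public TlsRateLimiter(int perSecond)
    {
        if (perSecond <= 0) throw new ArgumentOutOfRangeException(nameof(perSecond));
        _perSecond = perSecond;
        _tokens = new SemaphoreSlim(perSecond, perSecond); // bucket starts full
        _refill = new Timer(_ => Refill(), null, TimeSpan.FromSeconds(1), TimeSpan.FromSeconds(1));
    }

    // Tops the bucket back up to capacity once per second.
    private void Refill()
    {
        try
        {
            int missing = _perSecond - _tokens.CurrentCount;
            if (missing > 0) _tokens.Release(missing);
        }
        catch (Exception e) when (e is SemaphoreFullException or ObjectDisposedException)
        {
            // Raced with another refill or with Dispose; nothing to do.
        }
    }

    // Completes immediately while tokens remain, otherwise waits for the next refill.
    public Task WaitAsync(CancellationToken ct = default) => _tokens.WaitAsync(ct);

    public void Dispose()
    {
        _refill.Dispose();
        _tokens.Dispose();
    }
}
```

In the accept loop this would be awaited just before the handshake, so excess connections stay open and wait for a token rather than being rejected.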
TlsConnectionState — post-handshake record: TlsVersion, CipherSuite, PeerCert. Stored on NatsClient for /connz reporting.
NatsClient Changes
Constructor takes Stream instead of building NetworkStream internally. TLS negotiation happens before NatsClient is constructed. NatsClient receives the already-negotiated stream and TlsConnectionState.
Accept Loop Changes
Accept socket
→ Increment TotalConnections
→ Rate limit check (if TLS configured)
→ TlsConnectionWrapper.NegotiateAsync (returns stream + infoAlreadySent)
→ Extract TlsConnectionState from SslStream if applicable
→ Construct NatsClient with stream + tlsState
→ client.InfoAlreadySent flag set if TLS-first sent INFO during negotiation
→ RunClientAsync
4. File Layout
src/NATS.Server/
  ServerStats.cs
  Monitoring/
    MonitorServer.cs          # Kestrel host, route registration
    Varz.cs                   # Varz + nested config structs
    Connz.cs                  # Connz, ConnInfo, ConnzOptions, SubDetail
    VarzHandler.cs            # Snapshot logic, CPU/mem sampling
    ConnzHandler.cs           # Query param parsing, sort, pagination
  Tls/
    TlsHelper.cs              # Cert loading, auth options builder
    TlsConnectionWrapper.cs   # Per-connection TLS negotiation
    TlsConnectionState.cs     # Post-handshake state record
    TlsRateLimiter.cs         # Token-bucket rate limiter
    PeekableStream.cs         # Buffered-peek stream wrapper
Package Dependencies
- FrameworkReference to Microsoft.AspNetCore.App in NATS.Server.csproj (for Kestrel)
- No new NuGet packages: SslStream and SslServerAuthenticationOptions are in System.Net.Security, and X509Certificate2 is in System.Security.Cryptography.X509Certificates
- Tests use HttpClient (built-in) and CertificateRequest (built-in) for self-signed test certs
5. Testing Strategy
Monitoring Tests (MonitorTests.cs)
- /varz returns correct server identity, config limits, zero stats on fresh server
- After pub/sub traffic: message/byte counters are accurate
- /connz pagination: ?limit=2&offset=0 with 5 clients returns 2, total=5
- /connz?sort=bytes_to ordering
- /connz?subs=true includes subscription subjects
- /healthz returns 200
- HTTP request stats tracked in /varz response
TLS Tests (TlsTests.cs)
Self-signed certs generated in-memory via CertificateRequest + RSA.Create().
- Basic TLS: server cert, client connects with SslStream, pub/sub works
- TLS Required: plaintext client rejected
- TLS Verify: valid client cert succeeds, wrong cert fails
- Mixed mode: TLS and plaintext clients coexist
- TLS First: immediate TLS handshake without reading INFO first
- TLS First fallback: slow client gets INFO sent, normal negotiation
- Certificate pinning: matching cert accepted, non-matching rejected
- Rate limiting: rapid connections throttled
- TLS timeout: incomplete handshake closed after configured timeout
- Integration: NATS.Client.Core NuGet client works over TLS
- Monitoring: /connz shows tls_version and tls_cipher_suite
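The in-memory self-signed certs used by these tests could be produced along these lines. `TestCerts.CreateSelfSigned` is a hypothetical helper, and the extensions chosen here (end-entity basic constraints, server-auth EKU, a DNS subject alternative name) are assumptions about what the tests need.

```csharp
using System;
using System.Security.Cryptography;
using System.Security.Cryptography.X509Certificates;

public static class TestCerts // hypothetical test helper
{
    public static X509Certificate2 CreateSelfSigned(string commonName)
    {
        using var rsa = RSA.Create(2048);
        var request = new CertificateRequest(
            $"CN={commonName}", rsa, HashAlgorithmName.SHA256, RSASignaturePadding.Pkcs1);

        // End-entity certificate (not a CA), usable for TLS server authentication.
        request.CertificateExtensions.Add(new X509BasicConstraintsExtension(false, false, 0, false));
        request.CertificateExtensions.Add(new X509EnhancedKeyUsageExtension(
            new OidCollection { new Oid("1.3.6.1.5.5.7.3.1") }, false));

        // Clients match the host name against the subject alternative name.
        var san = new SubjectAlternativeNameBuilder();
        san.AddDnsName(commonName);
        request.CertificateExtensions.Add(san.Build());

        // Backdate slightly to tolerate clock skew; keep the lifetime short.
        var now = DateTimeOffset.UtcNow;
        return request.CreateSelfSigned(now.AddMinutes(-5), now.AddDays(1));
    }
}
```

A test client would still need a custom validation callback (or the pinning path) to accept such a certificate, since it does not chain to a trusted root.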
6. Error Handling
- TLS handshake failures are non-fatal: log warning, close socket, increment counter
- Mixed mode byte detection: 0x16 → TLS, printable ASCII → plain, connection close → clean disconnect
- Rate limiter: holds TCP connection open until token available (not rejected)
- Monitoring concurrency: varzMu semaphore serializes /varz, client snapshot for /connz
- CPU sampling: cached 1 second to avoid overhead on rapid polls
- Graceful shutdown: MonitorServer.DisposeAsync() stops Kestrel, rate limiter disposes timer, in-flight handshakes cancelled via CancellationToken