Sections 7-10 Gaps Design: Monitoring, TLS, Logging, Ping/Pong
Date: 2026-02-23
Scope: Implement remaining gaps in differences.md sections 7 (Monitoring), 8 (TLS), 9 (Logging), 10 (Ping/Pong)
Goal: Go parity for all features within scope
Section 7: Monitoring
7a. /subz Endpoint
Replace the empty stub with a full SubszHandler.
Models:
- Subsz: response envelope with Id, Now, SublistStats, Total, Offset, Limit, Subs[]
- SubszOptions: Offset, Limit, Subscriptions (bool for detail), Account (filter), Test (literal subject filter)
- Reuse existing SubDetail from Connz
Algorithm:
- Iterate all accounts (or filter by Account param)
- Collect all subscriptions from each account's SubList
- If Test subject provided, filter using SubjectMatch.MatchLiteral() to only return subs that would receive that message
- Apply pagination (offset/limit)
- If Subscriptions is true, include SubDetail[] array
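The steps above can be sketched as follows. This is a minimal illustration, not the planned handler: the record shapes are trimmed stand-ins, `MatchLiteral` is a local approximation of what `SubjectMatch.MatchLiteral()` is assumed to do, the default limit of 1024 is an assumption, and `Total` counting matches before pagination is also an assumption.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical, trimmed stand-ins for the real models.
public sealed record SubDetail(string Account, string Subject, string? Queue, ulong Cid);

public sealed record SubszOptions(
    int Offset = 0,
    int Limit = 1024, // assumed default
    bool Subscriptions = false,
    string? Account = null,
    string? Test = null);

public sealed record SubszPage(int Total, int Offset, int Limit, IReadOnlyList<SubDetail>? Subs);

public static class SubszSketch
{
    // Approximates SubjectMatch.MatchLiteral(): does a literal subject match
    // a subscription pattern that may contain '*' and '>' wildcards?
    public static bool MatchLiteral(string literal, string pattern)
    {
        var lit = literal.Split('.');
        var pat = pattern.Split('.');
        for (var i = 0; i < pat.Length; i++)
        {
            // '>' must be the last token and needs at least one remaining literal token.
            if (pat[i] == ">") return i == pat.Length - 1 && lit.Length > i;
            if (i >= lit.Length) return false;
            if (pat[i] != "*" && pat[i] != lit[i]) return false;
        }
        return lit.Length == pat.Length;
    }

    public static SubszPage Build(IEnumerable<SubDetail> allSubs, SubszOptions opts)
    {
        var query = allSubs;
        if (opts.Account is not null)
            query = query.Where(s => s.Account == opts.Account);
        if (opts.Test is not null)
            query = query.Where(s => MatchLiteral(opts.Test, s.Subject));

        var matched = query.ToList();
        // Details are only materialized when the caller asked for them.
        IReadOnlyList<SubDetail>? page = opts.Subscriptions
            ? matched.Skip(opts.Offset).Take(opts.Limit).ToList()
            : null;
        return new SubszPage(matched.Count, opts.Offset, opts.Limit, page);
    }
}
```

Filtering before pagination keeps Offset and Limit stable with respect to the filtered set, which is what a client paging through results for a single test subject would expect.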
SubList stats — add a Stats() method to SubList returning SublistStats (count, cache size, inserts, removes, matches, cache hits).
Files: New Monitoring/SubszHandler.cs, Monitoring/Subsz.cs. Modify MonitorServer.cs, SubList.cs.
7b. Connz ByStop / ByReason Sorting
Add two missing sort options for closed connection queries.
- Add ByStop and ByReason to SortOpt enum
- Parse sort=stop and sort=reason in query params
- Validate: these sorts only work with state=closed; return error if used with open connections
7c. Connz State Filtering & Closed Connections
Track closed connections and support state-based filtering.
Closed connection tracking:
- ClosedClient record: Cid, Ip, Port, Start, Stop, Reason, Name, Lang, Version, InMsgs, OutMsgs, InBytes, OutBytes, NumSubs, Rtt, TlsVersion, TlsCipherSuite
- ConcurrentQueue<ClosedClient> on NatsServer (capped at 10,000 entries)
- Populate in RemoveClient() from client state before disposal
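One way to keep the queue capped is sketched below. `ClosedClientRing` and the trimmed `ClosedClient` record are hypothetical names for illustration; the real record carries the full field list above.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

// Trimmed stand-in for the real ClosedClient record.
public sealed record ClosedClient(ulong Cid, string Reason, DateTime Stop);

public sealed class ClosedClientRing
{
    private readonly ConcurrentQueue<ClosedClient> _queue = new();
    private readonly int _capacity;
    private int _count;

    public ClosedClientRing(int capacity = 10_000) => _capacity = capacity;

    public void Add(ClosedClient closed)
    {
        _queue.Enqueue(closed);
        // Track the size in a separate counter so each add does not call Count.
        // When the cap is exceeded, evict the oldest entry.
        if (Interlocked.Increment(ref _count) > _capacity && _queue.TryDequeue(out _))
            Interlocked.Decrement(ref _count);
    }

    // Point-in-time copy, oldest first, for the Connz handler to filter and sort.
    public ClosedClient[] Snapshot() => _queue.ToArray();
}
```

Under concurrent adds the queue can briefly hold more than the cap, but every add that pushes the counter over the limit also removes one entry, so the size settles back to the cap.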
State filter:
- Parse state=open|closed|all query param
- open (default): current live connections only
- closed: only from closed connections list
- all: merge both
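A minimal sketch of the query parsing, combining the state filter with the sort validation from 7b. The enums are trimmed to the values relevant here, `ConnzQuery` is a hypothetical helper name, and rejecting the stop/reason sorts for state=all (not only for open) is an assumption.

```csharp
using System;

public enum ConnState { Open, Closed, All }
public enum SortOpt { ByCid, ByStop, ByReason } // trimmed; the real enum has more values

public static class ConnzQuery
{
    // Returns null on success, or an error message describing the invalid query.
    public static string? Parse(string? state, string? sort, out ConnState st, out SortOpt so)
    {
        st = ConnState.Open;
        so = SortOpt.ByCid;

        switch ((state ?? "open").ToLowerInvariant())
        {
            case "open": st = ConnState.Open; break;
            case "closed": st = ConnState.Closed; break;
            case "all": st = ConnState.All; break;
            default: return $"unknown state '{state}'";
        }

        switch ((sort ?? "cid").ToLowerInvariant())
        {
            case "cid": so = SortOpt.ByCid; break;
            case "stop": so = SortOpt.ByStop; break;
            case "reason": so = SortOpt.ByReason; break;
            default: return $"unknown sort '{sort}'";
        }

        // Stop time and close reason only exist for closed connections.
        // Assumption: state=all is rejected as well, not only state=open.
        if ((so is SortOpt.ByStop or SortOpt.ByReason) && st != ConnState.Closed)
            return $"sort by '{sort}' is only valid with state=closed";

        return null;
    }
}
```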
Files: Modify NatsServer.cs, ConnzHandler.cs, new Monitoring/ClosedClient.cs.
7d. Varz Slow Consumer Stats
Already at parity. SlowConsumersStats is populated from ServerStats counters. No changes needed.
Section 8: TLS
8a. TLS Rate Limiting
Already implemented via TlsRateLimiter (semaphore + periodic refill timer). Wired into AcceptClientAsync. Only a unit test needed.
8b. TLS Cert-to-User Mapping (TlsMap)
Full DN parsing using .NET built-in X500DistinguishedName.
New TlsMapAuthenticator:
- Implements IAuthenticator
- Receives the list of configured User objects
- On Authenticate():
  - Extract X509Certificate2 from auth context (passed from TlsConnectionState)
  - Parse subject DN via cert.SubjectName (X500DistinguishedName)
  - Build normalized DN string from RDN components
  - Try exact DN match against user map (key = DN string)
  - If no exact match, try CN-only match
  - Return AuthResult with matched user's permissions
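The DN handling in Authenticate() could look roughly like this. It is a sketch under assumptions: `TlsMapSketch`, the OID short-name table, and the string-to-string user map are hypothetical, multi-valued RDNs are simply skipped, and `EnumerateRelativeDistinguishedNames()` requires .NET 7 or later.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Security.Cryptography.X509Certificates;

public static class TlsMapSketch
{
    private const string CommonNameOid = "2.5.4.3";

    // Hypothetical short-name table; unknown attribute types fall back to the dotted OID.
    private static readonly Dictionary<string, string> ShortNames = new()
    {
        ["2.5.4.3"] = "CN", ["2.5.4.11"] = "OU", ["2.5.4.10"] = "O",
        ["2.5.4.7"] = "L", ["2.5.4.8"] = "ST", ["2.5.4.6"] = "C",
    };

    // Builds a comma-joined "TYPE=value" string from the single-valued RDNs.
    public static string Normalize(X500DistinguishedName subject) =>
        string.Join(",", subject.EnumerateRelativeDistinguishedNames()
            .Where(rdn => !rdn.HasMultipleElements)
            .Select(rdn =>
            {
                var oid = rdn.GetSingleElementType().Value ?? "";
                var name = ShortNames.TryGetValue(oid, out var s) ? s : oid;
                return $"{name}={rdn.GetSingleElementValue()}";
            }));

    // Returns the first CN value of the subject, or null if it has none.
    public static string? CommonName(X500DistinguishedName subject) =>
        subject.EnumerateRelativeDistinguishedNames()
            .Where(rdn => !rdn.HasMultipleElements
                          && rdn.GetSingleElementType().Value == CommonNameOid)
            .Select(rdn => rdn.GetSingleElementValue())
            .FirstOrDefault();

    // users maps a normalized DN or a bare CN to a configured user name.
    public static string? MatchUser(X500DistinguishedName subject, IReadOnlyDictionary<string, string> users)
    {
        if (users.TryGetValue(Normalize(subject), out var byDn)) return byDn; // exact DN first
        var cn = CommonName(subject);
        return cn is not null && users.TryGetValue(cn, out var byCn) ? byCn : null; // CN fallback
    }
}
```

In the authenticator the subject would come from `cert.SubjectName`; comparing attribute types by OID value rather than by friendly name avoids depending on platform-specific name tables.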
Auth context extension:
- Add X509Certificate2? ClientCertificate to ClientAuthContext
- Pass certificate from TlsConnectionState in ProcessConnectAsync
AuthService integration:
- When options.TlsMap && options.TlsVerify, add TlsMapAuthenticator to authenticator chain
- TlsMap auth runs before other authenticators (cert-based auth takes priority)
Files: New Auth/TlsMapAuthenticator.cs. Modify Auth/AuthService.cs, Auth/ClientAuthContext.cs, NatsClient.cs.
Section 9: Logging
9a. File Logging with Rotation
New options on NatsOptions:
- LogFile (string?): path to log file
- LogSizeLimit (long): file size in bytes before rotation (0 = unlimited)
- LogMaxFiles (int): max retained rotated files (0 = unlimited)
CLI flags: --log_file, --log_size_limit, --log_max_files
Serilog config: Add WriteTo.File() with fileSizeLimitBytes and retainedFileCountLimit when LogFile is set.
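A possible shape for that sink configuration is sketched below, assuming the Serilog.Sinks.File package; `FileLoggingSketch` is a hypothetical helper name. Two details matter for the "0 = unlimited" semantics: Serilog applies non-zero defaults to both limits, so unlimited has to be passed as null, and size-based rotation only happens when `rollOnFileSizeLimit` is true.

```csharp
using Serilog;

public static class FileLoggingSketch
{
    public static LoggerConfiguration AddFileSink(
        LoggerConfiguration config, string? logFile, long logSizeLimit, int logMaxFiles)
    {
        if (string.IsNullOrEmpty(logFile))
            return config; // file logging not configured

        return config.WriteTo.File(
            path: logFile,
            // null disables the limit; 0 from NatsOptions means "unlimited".
            fileSizeLimitBytes: logSizeLimit > 0 ? (long?)logSizeLimit : null,
            retainedFileCountLimit: logMaxFiles > 0 ? (int?)logMaxFiles : null,
            // Without this flag the sink stops writing at the size limit instead of rolling.
            rollOnFileSizeLimit: logSizeLimit > 0);
    }
}
```

Note that Serilog's retained-file count includes the current log file, so whether LogMaxFiles needs a +1 adjustment to match Go's count of rotated backups is something to confirm in the rotation test.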
9b. Debug/Trace Modes
New options on NatsOptions:
- Debug (bool): enable debug-level logging
- Trace (bool): enable trace/verbose-level logging
CLI flags: -D (debug), -V or -T (trace), -DV (both)
Serilog config:
- Default: MinimumLevel.Information()
- -D: MinimumLevel.Debug()
- -V/-T: MinimumLevel.Verbose()
9c. Color Output
Auto-detect TTY via Console.IsOutputRedirected.
- TTY: use Serilog.Sinks.Console with AnsiConsoleTheme.Code
- Non-TTY: use ConsoleTheme.None
Matches Go's behavior of disabling color when stderr is not a terminal.
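The level mapping from 9b and the theme selection from 9c might be combined as in this sketch, assuming the Serilog and Serilog.Sinks.Console packages; `ConsoleLoggingSketch` is a hypothetical name.

```csharp
using System;
using Serilog;
using Serilog.Events;
using Serilog.Sinks.SystemConsole.Themes;

public static class ConsoleLoggingSketch
{
    // Trace implies the most verbose level, so it wins when both flags are set (-DV).
    public static LogEventLevel LevelFor(bool debug, bool trace) =>
        trace ? LogEventLevel.Verbose
        : debug ? LogEventLevel.Debug
        : LogEventLevel.Information;

    public static LoggerConfiguration Configure(bool debug, bool trace)
    {
        // No ANSI escape codes when output is piped or redirected to a file.
        ConsoleTheme theme = Console.IsOutputRedirected
            ? ConsoleTheme.None
            : AnsiConsoleTheme.Code;

        return new LoggerConfiguration()
            .MinimumLevel.Is(LevelFor(debug, trace))
            .WriteTo.Console(theme: theme);
    }
}
```

`Console.IsOutputRedirected` checks stdout. If the console sink is pointed at stderr to mirror Go, `Console.IsErrorRedirected` is the corresponding check.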
9d. Timestamp Format Control
New options on NatsOptions:
- Logtime (bool, default true): include timestamps
- LogtimeUTC (bool, default false): use UTC format
CLI flags: --logtime (true/false), --logtime_utc
Output template adjustment:
- With timestamps: [{Timestamp:yyyy/MM/dd HH:mm:ss.ffffff} {Level:u3}] {Message:lj}{NewLine}{Exception}
- Without timestamps: [{Level:u3}] {Message:lj}{NewLine}{Exception}
- UTC: render the event timestamp converted to UTC instead of local time (a Serilog.Formatting culture setting alone does not change the time zone)
9e. Log Reopening (SIGUSR1)
When file logging is configured:
- SIGUSR1 handler calls ReOpenLogFile() on the server
- ReOpenLogFile() flushes and closes current Serilog logger, creates new one with same config
- This enables external log rotation tools (logrotate)
Files: Modify NatsOptions.cs, Program.cs, NatsServer.cs.
Section 10: Ping/Pong
10a. RTT Tracking
New fields on NatsClient:
- _rttStartTicks (long): UTC ticks when PING sent
- _rtt (long): computed RTT in ticks
- Rtt property (TimeSpan): computed from _rtt
Logic:
- In RunPingTimerAsync, before writing PING: _rttStartTicks = DateTime.UtcNow.Ticks
- In DispatchCommandAsync PONG handler: compute _rtt = DateTime.UtcNow.Ticks - _rttStartTicks (min 1 tick)
- computeRTT() helper ensures minimum 1 tick (handles clock granularity on Windows)
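The bookkeeping can be sketched in isolation as follows. `RttTracker` is a hypothetical helper (the design places these fields directly on NatsClient), the tick values are passed in so the logic is testable, and ignoring a PONG when no PING timestamp has been recorded is an assumption.

```csharp
using System;
using System.Threading;

public sealed class RttTracker
{
    private long _rttStartTicks;
    private long _rtt;

    public TimeSpan Rtt => TimeSpan.FromTicks(Interlocked.Read(ref _rtt));

    // Call just before writing PING, passing DateTime.UtcNow.Ticks.
    public void OnPingSent(long nowTicks) =>
        Interlocked.Exchange(ref _rttStartTicks, nowTicks);

    // Call from the PONG handler, passing DateTime.UtcNow.Ticks.
    public void OnPong(long nowTicks)
    {
        var start = Interlocked.Read(ref _rttStartTicks);
        if (start == 0)
            return; // assumption: no server PING recorded yet, nothing to measure

        // Clamp to one tick so a coarse clock never reports a zero RTT.
        Interlocked.Exchange(ref _rtt, Math.Max(1, nowTicks - start));
    }
}
```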
Monitoring exposure:
- Populate ConnInfo.Rtt as formatted string (e.g., "1.234ms")
- Add ByRtt sort option to Connz
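A simplified formatter for the ConnInfo.Rtt string is sketched below. It only approximates Go's duration formatting: it covers microseconds, milliseconds and seconds, and does not reproduce Go's nanosecond or minute/hour output. `RttFormat` is a hypothetical name.

```csharp
using System;
using System.Globalization;

public static class RttFormat
{
    public static string Format(TimeSpan rtt)
    {
        double micros = rtt.Ticks / 10.0; // one tick is 100 ns
        var inv = CultureInfo.InvariantCulture; // always use '.' as the decimal separator

        if (micros < 1_000)
            return micros.ToString("0.###", inv) + "µs";
        if (micros < 1_000_000)
            return (micros / 1_000).ToString("0.###", inv) + "ms";
        return (micros / 1_000_000).ToString("0.###", inv) + "s";
    }
}
```

For example, `RttFormat.Format(TimeSpan.FromTicks(12340))` yields "1.234ms". Sorting by RTT should compare the underlying tick values, not these strings.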
10b. RTT-Based First PING Delay
New state on NatsClient:
- _firstPongSent flag in ClientFlags
Logic in RunPingTimerAsync:
- Before first PING, check: _firstPongSent || timeSinceStart > 2 seconds
- If neither condition met, skip this PING cycle
- Set _firstPongSent on first PONG after CONNECT (in PONG handler)
This prevents the server from sending PING (for RTT) before the client has had a chance to respond to the initial INFO with CONNECT+PING.
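The gate itself reduces to a small predicate, sketched here with a hypothetical `FirstPingGate` helper so it can be tested without a live connection:

```csharp
using System;

public static class FirstPingGate
{
    // Upper bound on how long the first PING is held back.
    public static readonly TimeSpan MaxWait = TimeSpan.FromSeconds(2);

    // True when the ping timer may send a PING on this cycle.
    public static bool ShouldSendPing(bool firstPongSent, TimeSpan timeSinceStart) =>
        firstPongSent || timeSinceStart > MaxWait;
}
```

The two-second bound means a client that never sends its initial PING is still probed eventually, so stale detection is delayed rather than disabled.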
10c. Stale Connection Stats
New model:
- StaleConnectionStats: Clients, Routes, Gateways, Leafs (matching Go)
ServerStats extension:
- Add StaleConnectionClients, StaleConnectionRoutes, etc. fields
- Increment in MarkClosed(StaleConnection) based on connection kind
Varz exposure:
- Add StaleConnectionStats field to Varz
- Populate from ServerStats counters
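The counters and their snapshot for Varz could be sketched as follows. `StaleCounters` and `ConnectionKind` are hypothetical names; in the design the fields live on ServerStats and the increment happens inside MarkClosed.

```csharp
using System.Threading;

public enum ConnectionKind { Client, Router, Gateway, Leaf } // hypothetical enum

public sealed class StaleCounters
{
    private long _clients, _routes, _gateways, _leafs;

    // Called when a connection is closed with the stale-connection reason.
    public void Increment(ConnectionKind kind)
    {
        switch (kind)
        {
            case ConnectionKind.Client: Interlocked.Increment(ref _clients); break;
            case ConnectionKind.Router: Interlocked.Increment(ref _routes); break;
            case ConnectionKind.Gateway: Interlocked.Increment(ref _gateways); break;
            case ConnectionKind.Leaf: Interlocked.Increment(ref _leafs); break;
        }
    }

    // Values to copy into the StaleConnectionStats model when building Varz.
    public (long Clients, long Routes, long Gateways, long Leafs) Snapshot() =>
        (Interlocked.Read(ref _clients), Interlocked.Read(ref _routes),
         Interlocked.Read(ref _gateways), Interlocked.Read(ref _leafs));
}
```

Each counter is read atomically, but the four reads are not taken as one consistent snapshot, which is normally acceptable for monitoring output.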
Files: Modify NatsClient.cs, ServerStats.cs, Varz.cs, VarzHandler.cs, Connz.cs, ConnzHandler.cs.
Test Coverage
Each section includes unit tests:
| Feature | Test File | Tests |
|---|---|---|
| Subz endpoint | SubszHandlerTests.cs | Empty response, with subs, account filter, test subject filter, pagination |
| Connz closed state | ConnzHandlerTests.cs | State=closed, ByStop sort, ByReason sort, validation errors |
| TLS rate limiter | TlsRateLimiterTests.cs | Rate enforcement, refill behavior |
| TlsMap auth | TlsMapAuthenticatorTests.cs | DN matching, CN fallback, no match |
| File logging | LoggingTests.cs | File creation, rotation on size limit |
| RTT tracking | ClientTests.cs | RTT computed on PONG, exposed in connz, ByRtt sort |
| First PING delay | ClientTests.cs | PING delayed until first PONG or 2s |
| Stale stats | ServerTests.cs | Stale counters incremented, exposed in varz |
Parallelization Strategy
These work streams are independent and can be developed by parallel subagents:
- Monitoring stream (7a, 7b, 7c): SubszHandler + Connz closed connections + state filter
- TLS stream (8b): TlsMapAuthenticator
- Logging stream (9a-9e): All logging improvements
- Ping/Pong stream (10a-10c): RTT tracking + first PING delay + stale stats
The four streams touch different files with minimal overlap. The only shared touch point is NatsOptions.cs (new options for logging and ping/pong): one stream should land its changes there first, and the others build on that version.