# Core Server Lifecycle — Design

Implements all gaps from section 1 of `differences.md` (Core Server Lifecycle).

Reference: `golang/nats-server/server/server.go`, `client.go`, `signal.go`

## Components

### 1. ClosedState Enum & Close Reason Tracking

New file `src/NATS.Server/ClosedState.cs` — full Go enum (37 values from `client.go:188-228`).

- `NatsClient` gets `CloseReason` property, `MarkClosed(ClosedState)` method
- Close reason set in `RunAsync` finally blocks based on exception type
- Error-related reasons (ReadError, WriteError, TLSHandshakeError) skip flush on close
- `NatsServer.RemoveClient` logs close reason via structured logging

### 2. Accept Loop Exponential Backoff

Port Go's `acceptError` pattern from `server.go:4607-4627`.

- Constants: `AcceptMinSleep = 10ms`, `AcceptMaxSleep = 1s`
- On `SocketException`: sleep `tmpDelay`, double it, cap at 1s
- On success: reset to 10ms
- During sleep: check `_quitCts` to abort if shutting down
- Non-temporary errors break the loop

### 3. Ephemeral Port (port=0)

After `_listener.Bind()` + `Listen()`, resolve actual port:

```csharp
if (_options.Port == 0)
{
    var actualPort = ((IPEndPoint)_listener.LocalEndPoint!).Port;
    _options.Port = actualPort;
    _serverInfo.Port = actualPort;
}
```

Add public `Port` property on `NatsServer` exposing the resolved port.

### 4. Graceful Shutdown with WaitForShutdown

New fields on `NatsServer`:

- `_shutdown` (volatile bool)
- `_shutdownComplete` (TaskCompletionSource)
- `_quitCts` (CancellationTokenSource) — internal shutdown signal

`ShutdownAsync()` sequence:

1. Guard: if already shutting down, return
2. Set `_shutdown = true`, cancel `_quitCts`
3. Close `_listener` (stops accept loop)
4. Close all client connections with `ServerShutdown` reason
5. Wait for active client tasks to drain
6. Stop monitor server
7. Signal `_shutdownComplete`

`WaitForShutdown()`: blocks on `_shutdownComplete.Task`.

`Dispose()`: calls `ShutdownAsync` synchronously if not already shut down.

### 5. Task Tracking

Track active client tasks for clean shutdown:

- `_activeClientCount` (int, Interlocked)
- `_allClientsExited` (TaskCompletionSource, signaled when count hits 0 during shutdown)
- Increment in `AcceptClientAsync`, decrement in `RunClientAsync` finally block
- `ShutdownAsync` waits on `_allClientsExited` with timeout

### 6. Flush Pending Data Before Close

`NatsClient.FlushAndCloseAsync(bool minimalFlush)`:

- If not skip-flush reason: flush stream with 100ms write deadline
- Close socket

`MarkClosed(ClosedState)` sets skip-flush flag for: ReadError, WriteError, SlowConsumerPendingBytes, SlowConsumerWriteDeadline, TLSHandshakeError.

### 7. Lame Duck Mode

New options: `LameDuckDuration` (default 2min), `LameDuckGracePeriod` (default 10s).

`LameDuckShutdownAsync()`:

1. Set `_lameDuckMode = true`
2. Close listener (stop new connections)
3. Wait `LameDuckGracePeriod` (10s default) for clients to drain naturally
4. Stagger-close remaining clients over `LameDuckDuration - GracePeriod`
   - Sleep interval = remaining duration / client count (min 1ms, max 1s)
   - Randomize slightly to avoid reconnect storms
5. Call `ShutdownAsync()` for final cleanup

Accept loop: on error, if `_lameDuckMode`, exit cleanly.

### 8. PID File & Ports File

New options: `PidFile` (string?), `PortsFileDir` (string?).

PID file: `File.WriteAllText(pidFile, Process.GetCurrentProcess().Id.ToString())`

Ports file: JSON with `{ "client": port, "monitor": monitorPort }` written to `{dir}/{exe}_{pid}.ports`

Written at startup, deleted at shutdown.

### 9. Signal Handling

In `Program.cs`, use `PosixSignalRegistration` (.NET 6+):

- `SIGTERM` → `server.ShutdownAsync()` then exit
- `SIGUSR2` → `server.LameDuckShutdownAsync()`
- `SIGUSR1` → log "log reopen not yet supported"
- `SIGHUP` → log "config reload not yet supported"

Keep existing Ctrl+C handler (SIGINT).

### 10. Server Identity NKey (Stub)

Generate Ed25519 key pair at construction. Store as `ServerNKey` (public) and `_serverSeed` (private). Not used in protocol yet — placeholder for future cluster identity.

### 11. System Account (Stub)

Create `$SYS` account in `_accounts` at construction. Expose as `SystemAccount` property. No internal subscriptions yet.

### 12. Config File & Profiling (Stubs)

- `NatsOptions.ConfigFile` — if set, log warning "config file parsing not yet supported"
- `NatsOptions.ProfPort` — if set, log warning "profiling endpoint not yet supported"
- `Program.cs`: add `-c` CLI flag

## Testing

- Accept loop backoff: mock socket that throws N times, verify delays
- Ephemeral port: start server with port=0, verify resolved port > 0
- Graceful shutdown: start server, connect clients, call ShutdownAsync, verify all disconnected
- WaitForShutdown: verify it blocks until shutdown completes
- Close reason tracking: verify correct ClosedState for auth timeout, max connections, stale connection
- Lame duck mode: start server, connect clients, trigger lame duck, verify staggered closure
- PID file: start server with PidFile option, verify file contents, verify deleted on shutdown
- Ports file: start server with PortsFileDir, verify JSON contents
- Flush before close: verify data is flushed before socket close during shutdown
- System account: verify $SYS account exists after construction
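
The backoff rules in component 2 (Accept Loop Exponential Backoff) could be sketched roughly as follows. This is a minimal illustration, not the planned implementation: the `AcceptBackoff` class and its `NextDelay`, `Reset`, and `SleepAsync` members are hypothetical names introduced here, and only the two constants and the double-then-cap behaviour come from the design above. The `quit` token stands in for `_quitCts.Token`.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper (not named in the design): holds accept-loop backoff state.
public sealed class AcceptBackoff
{
    public static readonly TimeSpan AcceptMinSleep = TimeSpan.FromMilliseconds(10);
    public static readonly TimeSpan AcceptMaxSleep = TimeSpan.FromSeconds(1);

    public TimeSpan CurrentDelay { get; private set; } = AcceptMinSleep;

    // Pure step function: double the delay, capped at AcceptMaxSleep.
    public static TimeSpan NextDelay(TimeSpan current)
    {
        var doubled = current + current;
        return doubled > AcceptMaxSleep ? AcceptMaxSleep : doubled;
    }

    // Call after a successful accept.
    public void Reset() => CurrentDelay = AcceptMinSleep;

    // Call after a temporary accept error: sleep for the current delay, then double it.
    // Returns false (and leaves the delay unchanged) if shutdown was signaled during the sleep.
    public async Task<bool> SleepAsync(CancellationToken quit)
    {
        try
        {
            await Task.Delay(CurrentDelay, quit);
        }
        catch (OperationCanceledException)
        {
            return false;
        }

        CurrentDelay = NextDelay(CurrentDelay);
        return true;
    }
}
```

In an accept loop, a `false` return from `SleepAsync` would be the cue to exit, and `Reset()` would run after every accepted connection. Keeping `NextDelay` pure also makes the "verify delays" test case checkable without real sleeps.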
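
The counter-plus-completion-source arrangement in component 5 (Task Tracking) might look something like the sketch below. The `ClientTaskTracker` type and its method names are assumptions made for illustration; the design only specifies an Interlocked counter, a `TaskCompletionSource` signaled when the count reaches zero during shutdown, and a wait with timeout. The shutdown flag is set with `Interlocked.Exchange` here so that the flag write and the counter read cannot be reordered, which is one way to avoid a missed signal when the last client exits at the same moment shutdown begins.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper (not named in the design): tracks active client tasks.
public sealed class ClientTaskTracker
{
    private int _activeClientCount;
    private int _shuttingDown; // 0 = running, 1 = shutting down
    private readonly TaskCompletionSource _allClientsExited =
        new(TaskCreationOptions.RunContinuationsAsynchronously);

    // Call when a client task starts (e.g. in the accept path).
    public void Enter() => Interlocked.Increment(ref _activeClientCount);

    // Call from the client task's finally block.
    public void Exit()
    {
        if (Interlocked.Decrement(ref _activeClientCount) == 0 &&
            Volatile.Read(ref _shuttingDown) == 1)
        {
            _allClientsExited.TrySetResult();
        }
    }

    // Marks shutdown and waits for all clients to exit.
    // Returns true if they drained in time, false on timeout.
    public async Task<bool> DrainAsync(TimeSpan timeout)
    {
        Interlocked.Exchange(ref _shuttingDown, 1); // full fence before reading the count
        if (Volatile.Read(ref _activeClientCount) == 0)
        {
            _allClientsExited.TrySetResult();
        }

        var finished = await Task.WhenAny(_allClientsExited.Task, Task.Delay(timeout));
        return finished == _allClientsExited.Task;
    }
}
```

Returning a bool from `DrainAsync` lets the caller log a warning and continue with the rest of the shutdown sequence when clients fail to exit before the timeout.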
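
For the stagger step in component 7 (Lame Duck Mode), the interval arithmetic can be worked through in a small helper. The `LameDuckPacing` class below is hypothetical, and the jitter formula (a uniform pick between half the base interval and the full base interval) is an assumption: the design only says to "randomize slightly" and does not specify how.

```csharp
using System;

// Hypothetical helper (not named in the design): computes the pause between client closes.
public static class LameDuckPacing
{
    private static readonly TimeSpan MinInterval = TimeSpan.FromMilliseconds(1);
    private static readonly TimeSpan MaxInterval = TimeSpan.FromSeconds(1);

    // (LameDuckDuration - GracePeriod) / client count, clamped to [1ms, 1s].
    public static TimeSpan BaseInterval(TimeSpan lameDuckDuration, TimeSpan gracePeriod, int clientCount)
    {
        if (clientCount <= 0) return TimeSpan.Zero; // nothing left to close

        var remaining = lameDuckDuration - gracePeriod;
        if (remaining < TimeSpan.Zero) remaining = TimeSpan.Zero;

        var interval = TimeSpan.FromTicks(remaining.Ticks / clientCount);
        if (interval < MinInterval) return MinInterval;
        if (interval > MaxInterval) return MaxInterval;
        return interval;
    }

    // Assumed jitter scheme: uniform in [base/2, base) so reconnects do not line up.
    public static TimeSpan Jittered(TimeSpan baseInterval, Random rng)
    {
        var factor = 0.5 + 0.5 * rng.NextDouble();
        return TimeSpan.FromTicks((long)(baseInterval.Ticks * factor));
    }
}
```

With the default options (2min duration, 10s grace period) the remaining window is 110s, so 1,000 connected clients give a base interval of 110ms, while 10 clients would give 11s and are clamped to the 1s maximum.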