natsdotnet/docs/plans/2026-03-12-e2e-extended-plan.md
Joseph Doherty c30e67a69d Fix E2E test gaps and add comprehensive E2E + parity test suites
- Fix pull consumer fetch: send original stream subject in HMSG (not inbox)
  so NATS client distinguishes data messages from control messages
- Fix MaxAge expiry: add background timer in StreamManager for periodic pruning
- Fix JetStream wire format: Go-compatible anonymous objects with string enums,
  proper offset-based pagination for stream/consumer list APIs
- Add 42 E2E black-box tests (core messaging, auth, TLS, accounts, JetStream)
- Add ~1000 parity tests across all subsystems (gaps closure)
- Update gap inventory docs to reflect implementation status
2026-03-12 14:09:23 -04:00


NATS.E2E.Tests Extended Coverage — Implementation Plan

Date: 2026-03-12
Design: 2026-03-12-e2e-extended-design.md

Batch Structure

7 batches. Each batch is independently verifiable. Phases 1-5 from the design map to batches 2-6. Batch 1 is infrastructure. Batch 7 is final verification.

| Batch | Steps | Can Parallelize |
| --- | --- | --- |
| 1 (Infrastructure) | Steps 1-4 | Steps 2-4 in parallel after Step 1 |
| 2 (Phase 1: Core Messaging) | Step 5 | No |
| 3 (Phase 2: Auth & Permissions) | Steps 6-7 | No (fixture then tests) |
| 4 (Phase 3: Monitoring) | Steps 8-9 | No |
| 5 (Phase 4: TLS & Accounts) | Steps 10-13 | Steps 10-11 parallel, Steps 12-13 parallel |
| 6 (Phase 5: JetStream) | Steps 14-15 | No |
| 7 (Final Verification) | Step 16 | No |

Batch 1: Infrastructure

Step 1: Update NuGet packages and csproj

Files:

  • Directory.Packages.props (edit)
  • tests/NATS.E2E.Tests/NATS.E2E.Tests.csproj (edit)

Details:

Add to Directory.Packages.props:

<PackageVersion Include="NATS.Client.JetStream" Version="2.7.2" />

Add to NATS.E2E.Tests.csproj <ItemGroup>:

<PackageReference Include="NATS.NKeys" />
<PackageReference Include="NATS.Client.JetStream" />

Verify: dotnet build tests/NATS.E2E.Tests succeeds.


Step 2: Extend NatsServerProcess

Files:

  • tests/NATS.E2E.Tests/Infrastructure/NatsServerProcess.cs (edit)

Details:

Add to the class:

  1. New constructor overload: NatsServerProcess(string[]? extraArgs = null, string? configContent = null, bool enableMonitoring = false)

    • Stores _extraArgs, _configContent, _enableMonitoring
    • If enableMonitoring, allocate a second free port → MonitorPort
    • Keep existing no-arg constructor as-is (calls new one with defaults)
  2. New property: int? MonitorPort { get; }

  3. Config file support in StartAsync():

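    • If _configContent is not null, write it to a temp file with a .conf extension and store the path in _configFilePath. Path.GetTempFileName() always creates a .tmp file, so build the path from Path.GetTempPath() and a unique file name instead.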
    • Build args: exec "{dll}" -p {Port} + (if config: -c {_configFilePath}) + (if monitoring: -m {MonitorPort}) + (extra args)
  4. Cleanup in DisposeAsync(): Delete _configFilePath if it exists.

  5. Static factory: static NatsServerProcess WithConfig(string configContent, bool enableMonitoring = false) — convenience for creating with config.

Verify: dotnet build tests/NATS.E2E.Tests succeeds. Existing tests still pass (dotnet test tests/NATS.E2E.Tests).
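
The argument assembly and temp-file handling described in this step could be sketched as follows. BuildArgs and WriteTempConfig are hypothetical helper names, server.dll is a placeholder path, and quoting the config path is an added assumption to tolerate spaces in temp paths; the real class may structure this differently.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical helper: assembles the "dotnet exec" argument string from Step 2.
static string BuildArgs(string dllPath, int port, string? configPath, int? monitorPort, string[]? extraArgs)
{
    var args = new List<string> { "exec", $"\"{dllPath}\"", "-p", port.ToString() };
    if (configPath is not null) { args.Add("-c"); args.Add($"\"{configPath}\""); } // quoted: assumption
    if (monitorPort is int m) { args.Add("-m"); args.Add(m.ToString()); }
    if (extraArgs is not null) args.AddRange(extraArgs);
    return string.Join(' ', args);
}

// Hypothetical helper: writes config content to a uniquely named .conf file in the temp directory.
static string WriteTempConfig(string content)
{
    var path = Path.Combine(Path.GetTempPath(), $"nats-e2e-{Guid.NewGuid():N}.conf");
    File.WriteAllText(path, content);
    return path;
}

var demoConfigPath = WriteTempConfig("max_payload: 512\n");
Console.WriteLine(BuildArgs("server.dll", 4222, demoConfigPath, 8222, null));
File.Delete(demoConfigPath); // mirrors the cleanup done in DisposeAsync()
```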


Step 3: Create E2ETestHelper

Files:

  • tests/NATS.E2E.Tests/Infrastructure/E2ETestHelper.cs (new)

Details:

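using NATS.Client.Core;

namespace NATS.E2E.Tests.Infrastructure;

public static class E2ETestHelper
{
    // Client with default options, pointed at the local test server.
    public static NatsConnection CreateClient(int port)
        => new(new NatsOpts { Url = $"nats://127.0.0.1:{port}" });

    // Client with caller-supplied options; Url is overridden to target the test server.
    public static NatsConnection CreateClient(int port, NatsOpts opts)
        => new(opts with { Url = $"nats://127.0.0.1:{port}" });

    // Token that cancels after the given number of seconds.
    // The CancellationTokenSource is never disposed; acceptable for short-lived test runs.
    public static CancellationToken Timeout(int seconds = 10)
        => new CancellationTokenSource(TimeSpan.FromSeconds(seconds)).Token;
}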

Verify: Builds.


Step 4: Create collection definitions file

Files:

  • tests/NATS.E2E.Tests/Infrastructure/Collections.cs (new)

Details:

Move the existing E2ECollection from NatsServerFixture.cs into this file. The complete set of collection definitions for the suite is:

[CollectionDefinition("E2E")]
public class E2ECollection : ICollectionFixture<NatsServerFixture>;

[CollectionDefinition("E2E-Auth")]
public class AuthCollection : ICollectionFixture<AuthServerFixture>;

[CollectionDefinition("E2E-Monitor")]
public class MonitorCollection : ICollectionFixture<MonitorServerFixture>;

[CollectionDefinition("E2E-TLS")]
public class TlsCollection : ICollectionFixture<TlsServerFixture>;

[CollectionDefinition("E2E-Accounts")]
public class AccountsCollection : ICollectionFixture<AccountServerFixture>;

[CollectionDefinition("E2E-JetStream")]
public class JetStreamCollection : ICollectionFixture<JetStreamServerFixture>;

Remove E2ECollection from NatsServerFixture.cs.

Note: The fixture classes referenced here (AuthServerFixture, etc.) don't exist yet; they are created in later steps. To keep each batch independently verifiable, add only the E2E collection definition in Step 4. Each of the other collection definitions is added in its respective fixture file in a later batch.

Verify: dotnet test tests/NATS.E2E.Tests — existing 3 tests still pass.


Batch 2: Phase 1 — Core Messaging

Step 5: Implement CoreMessagingTests

Files:

  • tests/NATS.E2E.Tests/CoreMessagingTests.cs (new)

Details:

[Collection("E2E")] — uses existing NatsServerFixture. Primary constructor takes NatsServerFixture fixture.

11 tests:

  1. WildcardStar_MatchesSingleToken: Sub foo.*, pub foo.bar → assert received with correct data.

  2. WildcardGreaterThan_MatchesMultipleTokens: Sub foo.>, pub foo.bar.baz → assert received.

  3. WildcardStar_DoesNotMatchMultipleTokens: Sub foo.*, pub foo.bar.baz → assert NO message within 1s timeout (use Task.WhenAny with short delay to prove no delivery).

  4. QueueGroup_LoadBalances: 3 clients sub to qtest with queue group workers. Pub client sends 30 messages. Each sub collects received messages. Assert: total across all 3 = 30, each sub got at least 1 (no single sub got all).

  5. QueueGroup_MixedWithPlainSub: 1 plain sub + 2 queue subs on qmix. Pub 10 messages. Plain sub should get all 10. Queue subs combined should get 10 (each message to exactly 1 queue sub).

  6. Unsub_StopsDelivery: Sub to unsub.test, ping to flush, then unsubscribe, pub → assert no message within 1s.

  7. Unsub_WithMaxMessages: Sub to maxmsg.test. Use raw socket or low-level NATS protocol to send UNSUB sid 3. Pub 5 messages → assert exactly 3 received. Note: NATS.Client.Core may not expose auto-unsub-after-N directly. If not, use raw socket for this test.

  8. FanOut_MultipleSubscribers: 3 clients sub to fanout.test. Pub 1 message. All 3 receive it.

  9. EchoOff_PublisherDoesNotReceiveSelf: Connect with NatsOpts { Echo = false }. Sub to echo.test, pub to echo.test. Assert no message within 1s. Then connect a second client (default echo=true), sub and pub → that client DOES receive its own message (as control).

  10. VerboseMode_OkResponses: Use raw TcpClient/NetworkStream. Send CONNECT {"verbose":true}\r\n → read +OK. Send SUB test 1\r\n → read +OK. Send PING\r\n → read PONG.

  11. NoResponders_Returns503: Connect with NatsOpts { Headers = true, NoResponders = true } (check if NATS.Client.Core exposes this). Send request to subject with no subscribers → expect exception or 503 status in reply headers.
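
For the raw-socket tests above (tests 7 and 10), replies such as +OK and PONG arrive as CRLF-terminated lines. A minimal sketch of a line reader follows; ReadLineAsync is a hypothetical helper name, and a MemoryStream stands in for the NetworkStream that TcpClient.GetStream() would provide.

```csharp
using System;
using System.IO;
using System.Text;
using System.Threading;
using System.Threading.Tasks;

// Reads one CRLF-terminated protocol line (for example "+OK" or "PONG") from a stream.
static async Task<string> ReadLineAsync(Stream stream, CancellationToken ct = default)
{
    var sb = new StringBuilder();
    var buf = new byte[1];
    while (true)
    {
        int n = await stream.ReadAsync(buf.AsMemory(0, 1), ct);
        if (n == 0) throw new EndOfStreamException("connection closed before CRLF");
        char c = (char)buf[0];
        if (c == '\n' && sb.Length > 0 && sb[sb.Length - 1] == '\r')
        {
            sb.Length--; // drop the trailing '\r'
            return sb.ToString();
        }
        sb.Append(c);
    }
}

// Stand-in for server replies; a real test would read from the TCP connection instead.
using var fakeServer = new MemoryStream(Encoding.ASCII.GetBytes("+OK\r\nPONG\r\n"));
Console.WriteLine(await ReadLineAsync(fakeServer)); // +OK
Console.WriteLine(await ReadLineAsync(fakeServer)); // PONG
```

Against a live server the first line read is normally the server's INFO line, so a test would consume that before sending CONNECT.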

For negative tests (no message expected), use a short 500ms-1s timeout with Task.WhenAny(readTask, Task.Delay(1000)) pattern — assert the delay wins.
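
The negative-test pattern described above can be sketched with only the standard library. NoDeliveryWithin is a hypothetical helper name, and a never-completing task stands in for a subscription read.

```csharp
using System;
using System.Threading.Tasks;

// Returns true when readTask does NOT complete within the window, i.e. the delay wins.
static async Task<bool> NoDeliveryWithin(Task readTask, TimeSpan window)
{
    var delay = Task.Delay(window);
    var winner = await Task.WhenAny(readTask, delay);
    return ReferenceEquals(winner, delay);
}

// Stand-in for a subscription read that never yields a message.
var neverCompletes = new TaskCompletionSource<string>().Task;
Console.WriteLine(await NoDeliveryWithin(neverCompletes, TimeSpan.FromMilliseconds(200))); // True
```

A test would assert that the helper returns true; a shorter window speeds up the suite at the cost of a higher chance of missing a late delivery.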

Verify: dotnet test tests/NATS.E2E.Tests — all 14 tests pass (3 original + 11 new).


Batch 3: Phase 2 — Auth & Permissions

Step 6: Implement AuthServerFixture

Files:

  • tests/NATS.E2E.Tests/Infrastructure/AuthServerFixture.cs (new)

Details:

Class AuthServerFixture : IAsyncLifetime.

At construction time, generate an NKey pair using NATS.NKeys:

var kp = KeyPair.CreatePair(PrefixByte.User);
NKeyPublicKey = kp.GetPublicKey();  // starts with 'U'
NKeySeed = kp.GetSeed();            // starts with 'SU'

Store these as public properties so tests can use them.

Config content (NATS conf format):

max_payload: 512
authorization {
  users = [
    { user: "testuser", password: "testpass" }
    { user: "tokenuser", password: "s3cret_token" }
    { user: "pubonly", password: "pubpass", permissions: { publish: { allow: ["allowed.>"] }, subscribe: { allow: ["_INBOX.>"] } } }
    { user: "subonly", password: "subpass", permissions: { subscribe: { allow: ["allowed.>", "_INBOX.>"] }, publish: { allow: ["_INBOX.>"] } } }
    { user: "limited", password: "limpass", permissions: { publish: ">", subscribe: ">" } }
    { nkey: "<NKEY_PUBLIC_KEY>" }
  ]
}

Token auth is configured with authorization { token: "..." }, which is separate from users and cannot be mixed with it in one config. AuthServerFixture therefore handles user/pass, NKeys, and permissions only. The token tests are standalone tests that spin up their own mini-server via a dedicated NatsServerProcess instance inside the test (create server, run test, dispose), which keeps the fixture simple. The max_payload test can do the same if the max_payload: 512 setting is not kept in the fixture config.

Properties exposed:

  • int Port
  • string NKeyPublicKey
  • string NKeySeed
  • NatsConnection CreateClient(string user, string password) — connects with credentials
  • NatsConnection CreateClient() — connects without credentials (should fail on auth-required server)

Collection definition: [CollectionDefinition("E2E-Auth")] in this file.

Verify: Builds.
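
One way to substitute the generated NKey into the config content is a C# 11 raw string literal with the $$ prefix, which leaves single braces literal and interpolates only {{...}}. This is a sketch under assumptions: BuildAuthConfig is a hypothetical name, the users list is trimmed to two entries, and the key passed in the usage line is a placeholder, not a valid NKey.

```csharp
using System;

// With "$$", single braces are literal text and only {{expr}} is interpolated.
static string BuildAuthConfig(string nkeyPublicKey) => $$"""
    authorization {
      users = [
        { user: "testuser", password: "testpass" }
        { nkey: "{{nkeyPublicKey}}" }
      ]
    }
    """;

Console.WriteLine(BuildAuthConfig("UEXAMPLEPUBLICKEY")); // placeholder key for illustration
```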


Step 7: Implement AuthTests

Files:

  • tests/NATS.E2E.Tests/AuthTests.cs (new)

Details:

[Collection("E2E-Auth")] with AuthServerFixture fixture.

12 tests:

  1. UsernamePassword_ValidCredentials_Connects: fixture.CreateClient("testuser", "testpass") → connect, ping → succeeds.

  2. UsernamePassword_InvalidPassword_Rejected: Connect with wrong password → expect NatsException on connect.

  3. UsernamePassword_NoCredentials_Rejected: fixture.CreateClient() (no creds) → expect connection error.

  4. TokenAuth_ValidToken_Connects: Spin up a temp NatsServerProcess with config authorization { token: "s3cret" }. Connect with NatsOpts { AuthToken = "s3cret" } → succeeds.

  5. TokenAuth_InvalidToken_Rejected: Same temp server, wrong token → rejected.

  6. NKeyAuth_ValidSignature_Connects: Connect with NatsOpts configured for NKey auth using fixture.NKeySeed → succeeds.

  7. NKeyAuth_InvalidSignature_Rejected: Connect with a different NKey seed → rejected.

  8. Permission_PublishAllowed_Succeeds: pubonly user pubs to allowed.foo, testuser sub on same → message delivered.

  9. Permission_PublishDenied_NoDelivery: pubonly user pubs to denied.foo → permission violation, message not delivered.

  10. Permission_SubscribeDenied_Rejected: pubonly user tries to sub to denied.foo → error or no messages.

  11. MaxSubscriptions_ExceedsLimit_Rejected: Use limited user config with max_subs: 5 added to fixture config. Create 6 subs → last one triggers error.

  12. MaxPayload_ExceedsLimit_Disconnected: Fixture config has max_payload: 512. Send message > 512 bytes → connection closed.

For tests 4-5 (token auth): create/dispose their own NatsServerProcess within the test. Use await using for cleanup.

Verify: dotnet test tests/NATS.E2E.Tests → all 26 tests pass (14 + 12 new; token tests may take slightly longer due to extra server startup).

Note: Token tests spin up independent servers, so they'll be slightly slower. That's acceptable for E2E.


Batch 4: Phase 3 — Monitoring

Step 8: Implement MonitorServerFixture

Files:

  • tests/NATS.E2E.Tests/Infrastructure/MonitorServerFixture.cs (new)

Details:

Class MonitorServerFixture : IAsyncLifetime.

Creates NatsServerProcess with enableMonitoring: true. This passes -m <port> to the server.

Properties:

  • int Port — NATS client port
  • int MonitorPort — HTTP monitoring port
  • HttpClient MonitorClient — pre-configured with BaseAddress = new Uri($"http://127.0.0.1:{MonitorPort}")
  • NatsConnection CreateClient()

Dispose: dispose HttpClient and server process.

Collection definition: [CollectionDefinition("E2E-Monitor")] in this file.

Verify: Builds.


Step 9: Implement MonitoringTests

Files:

  • tests/NATS.E2E.Tests/MonitoringTests.cs (new)

Details:

[Collection("E2E-Monitor")] with MonitorServerFixture fixture.

All tests use fixture.MonitorClient for HTTP calls and System.Text.Json.JsonDocument for JSON parsing.

7 tests:

  1. Healthz_ReturnsOk: GET /healthz → 200, body contains "status" key with value "ok".

  2. Varz_ReturnsServerInfo: GET /varz → 200, JSON has server_id (non-empty string), version, port (matches fixture port).

  3. Varz_ReflectsMessageCounts: Connect client, pub 5 messages to a subject (with a sub to ensure delivery). GET /varz → in_msgs >= 5.

  4. Connz_ListsActiveConnections: Connect 2 clients, ping both. GET /connz → num_connections >= 2, connections array has entries.

  5. Connz_SortByParameter: Connect 3 clients, send different amounts of data. GET /connz?sort=bytes_to → connections array returned (verify it doesn't error; exact sort validation optional).

  6. Connz_LimitAndOffset: Connect 5 clients. GET /connz?limit=2&offset=1 → connections array has exactly 2 entries.

  7. Subz_ReturnsSubscriptionStats: Connect client, sub to 3 subjects. GET /subz → response has subscription data, num_subscriptions > 0.

Verify: dotnet test tests/NATS.E2E.Tests → all 33 tests pass (26 + 7).
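
The JSON checks in the monitoring tests can be sketched with System.Text.Json.JsonDocument. The payload below is a hypothetical /varz response trimmed to the fields the tests inspect; a real test would obtain the body from fixture.MonitorClient instead.

```csharp
using System;
using System.Text.Json;

// Hypothetical /varz payload; values are illustrative only.
const string sampleVarz = """{"server_id":"NABC123","version":"0.0.0","port":4222,"in_msgs":5}""";

using var varzDoc = JsonDocument.Parse(sampleVarz);
var root = varzDoc.RootElement;

string? serverId = root.GetProperty("server_id").GetString();
int varzPort = root.GetProperty("port").GetInt32();
long inMsgs = root.GetProperty("in_msgs").GetInt64();

Console.WriteLine($"{serverId} {varzPort} {inMsgs}"); // NABC123 4222 5
```

GetProperty throws KeyNotFoundException when a field is missing, which doubles as a presence check; TryGetProperty is the alternative when a custom failure message is preferred.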


Batch 5: Phase 4 — TLS & Accounts

Step 10: Implement TlsServerFixture

Files:

  • tests/NATS.E2E.Tests/Infrastructure/TlsServerFixture.cs (new)

Details:

Class TlsServerFixture : IAsyncLifetime.

In InitializeAsync():

  1. Create a temp directory for certs.
  2. Generate self-signed CA, server cert, client cert using System.Security.Cryptography:
    • CA: RSA 2048, self-signed, CN=E2E Test CA
    • Server cert: RSA 2048, signed by CA, CN=localhost, SAN=127.0.0.1
    • Client cert: RSA 2048, signed by CA, CN=testclient
  3. Export to PEM files in temp dir: ca.pem, server-cert.pem, server-key.pem, client-cert.pem, client-key.pem
  4. Create NatsServerProcess with config:
listen: "0.0.0.0:{port}"
tls {
  cert_file: "{server-cert.pem}"
  key_file: "{server-key.pem}"
  ca_file: "{ca.pem}"
}
  5. Start server.
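
A minimal sketch of the CA and server certificate generation from step 2, using System.Security.Cryptography. It assumes .NET 7 or later for the PEM export methods; the validity windows, the extra DNS SAN entry, and the serial number handling are illustrative choices, and the client certificate (omitted here) would follow the same pattern as the server one.

```csharp
using System;
using System.Net;
using System.Security.Cryptography;
using System.Security.Cryptography.X509Certificates;

var now = DateTimeOffset.UtcNow;

// Self-signed CA (CN=E2E Test CA) that is allowed to sign other certificates.
using var caKey = RSA.Create(2048);
var caReq = new CertificateRequest("CN=E2E Test CA", caKey, HashAlgorithmName.SHA256, RSASignaturePadding.Pkcs1);
caReq.CertificateExtensions.Add(new X509BasicConstraintsExtension(true, false, 0, true));
caReq.CertificateExtensions.Add(new X509KeyUsageExtension(X509KeyUsageFlags.KeyCertSign | X509KeyUsageFlags.CrlSign, true));
using var caCert = caReq.CreateSelfSigned(now.AddMinutes(-5), now.AddDays(30));

// Server certificate (CN=localhost, SAN=127.0.0.1) signed by the CA.
using var serverKey = RSA.Create(2048);
var serverReq = new CertificateRequest("CN=localhost", serverKey, HashAlgorithmName.SHA256, RSASignaturePadding.Pkcs1);
var san = new SubjectAlternativeNameBuilder();
san.AddIpAddress(IPAddress.Loopback);
san.AddDnsName("localhost"); // extra SAN entry, an assumption beyond the plan
serverReq.CertificateExtensions.Add(san.Build());
serverReq.CertificateExtensions.Add(new X509BasicConstraintsExtension(false, false, 0, false));
var serial = RandomNumberGenerator.GetBytes(8);
serial[0] &= 0x7F; // keep the serial number positive
// The validity window must nest inside the CA's window.
using var serverCert = serverReq.Create(caCert, now.AddMinutes(-1), now.AddDays(29), serial);

// PEM text for ca.pem, server-cert.pem and server-key.pem.
// serverCert carries only the public key, so the private key is exported from serverKey.
string caPem = caCert.ExportCertificatePem();
string serverCertPem = serverCert.ExportCertificatePem();
string serverKeyPem = serverKey.ExportPkcs8PrivateKeyPem();

Console.WriteLine(serverCert.Issuer);
```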

Properties:

  • int Port
  • string CaCertPath, string ClientCertPath, string ClientKeyPath
  • NatsConnection CreateTlsClient() — creates client with TLS configured, trusting the test CA
  • NatsConnection CreatePlainClient() — creates client WITHOUT TLS (for rejection test)

Dispose: stop server, delete temp cert directory.

Collection definition: [CollectionDefinition("E2E-TLS")] in this file.

Verify: Builds.


Step 11: Implement AccountServerFixture

Files:

  • tests/NATS.E2E.Tests/Infrastructure/AccountServerFixture.cs (new)

Details:

Class AccountServerFixture : IAsyncLifetime.

Config:

accounts {
  ACCT_A {
    users = [
      { user: "user_a", password: "pass_a" }
    ]
  }
  ACCT_B {
    users = [
      { user: "user_b", password: "pass_b" }
    ]
  }
}

Properties:

  • int Port
  • NatsConnection CreateClientA() — connects as user_a
  • NatsConnection CreateClientB() — connects as user_b

Collection definition: [CollectionDefinition("E2E-Accounts")] in this file.

Verify: Builds.


Step 12: Implement TlsTests

Files:

  • tests/NATS.E2E.Tests/TlsTests.cs (new)

Details:

[Collection("E2E-TLS")] with TlsServerFixture fixture.

3 tests:

  1. Tls_ClientConnectsSecurely: fixture.CreateTlsClient() → connect, ping → succeeds.

  2. Tls_PlainTextConnection_Rejected: fixture.CreatePlainClient() → connect → expect exception (timeout or auth error since TLS handshake fails).

  3. Tls_PubSub_WorksOverEncryptedConnection: Two TLS clients, pub/sub round-trip → message received.

Verify: Builds, TLS tests pass.


Step 13: Implement AccountIsolationTests

Files:

  • tests/NATS.E2E.Tests/AccountIsolationTests.cs (new)

Details:

[Collection("E2E-Accounts")] with AccountServerFixture fixture.

3 tests:

  1. Accounts_SameAccount_MessageDelivered: Two ACCT_A clients. Sub + pub on acct.test → message received.

  2. Accounts_CrossAccount_MessageNotDelivered: ACCT_A client pubs to cross.test, ACCT_B client subs to cross.test → no message within 1s.

  3. Accounts_EachAccountHasOwnNamespace: ACCT_A sub on shared.topic, ACCT_B sub on shared.topic. Pub from ACCT_A → only ACCT_A sub receives. Pub from ACCT_B → only ACCT_B sub receives.

Verify: dotnet test tests/NATS.E2E.Tests → all 39 tests pass (33 + 6).


Batch 6: Phase 5 — JetStream

Step 14: Implement JetStreamServerFixture

Files:

  • tests/NATS.E2E.Tests/Infrastructure/JetStreamServerFixture.cs (new)

Details:

Class JetStreamServerFixture : IAsyncLifetime.

Config:

listen: "0.0.0.0:{port}"
jetstream {
  store_dir: "{tmpdir}"
  max_mem_store: 64mb
  max_file_store: 256mb
}

Where {tmpdir} is created via Path.Combine(Path.GetTempPath(), "nats-e2e-js-" + Guid.NewGuid().ToString("N")[..8]).

Properties:

  • int Port
  • NatsConnection CreateClient()

Dispose: stop server, delete store_dir.
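
The store directory lifecycle could look like the sketch below. It uses the naming scheme given above; the placeholder file stands in for data a running server would write, and no server is started here.

```csharp
using System;
using System.IO;

// Unique store directory, following the naming scheme from the plan.
string storeDir = Path.Combine(Path.GetTempPath(), "nats-e2e-js-" + Guid.NewGuid().ToString("N")[..8]);
Directory.CreateDirectory(storeDir);
try
{
    // A server started at this point would write its stream data under storeDir.
    File.WriteAllText(Path.Combine(storeDir, "placeholder.dat"), "stream data stand-in");
}
finally
{
    // Recursive delete removes whatever was left behind.
    if (Directory.Exists(storeDir)) Directory.Delete(storeDir, recursive: true);
}

Console.WriteLine(Directory.Exists(storeDir)); // False
```

In the fixture, the delete should run only after the server process has exited, since an open file handle can make the delete fail on Windows.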

Collection definition: [CollectionDefinition("E2E-JetStream")] in this file.

Verify: Builds.


Step 15: Implement JetStreamTests

Files:

  • tests/NATS.E2E.Tests/JetStreamTests.cs (new)

Details:

[Collection("E2E-JetStream")] with JetStreamServerFixture fixture.

Uses NATS.Client.JetStream NuGet — create NatsJSContext from the connection.

10 tests:

  1. Stream_CreateAndInfo: Create stream TEST1 on subjects ["js.test.>"] with limits retention. Get stream info → verify name, subjects, retention policy match.

  2. Stream_ListAndNames: Create 3 streams (LIST_A, LIST_B, LIST_C). List streams → all 3 present. Get names → all 3 names returned.

  3. Stream_Delete: Create stream DEL_TEST, delete it, attempt info → expect not-found error.

  4. Stream_PublishAndGet: Create stream on js.pub.>. Publish 3 messages. Get message by sequence 1, 2, 3 → verify data matches.

  5. Stream_Purge: Create stream, publish 5 messages. Purge. Get stream info → state.messages == 0.

  6. Consumer_CreatePullAndConsume: Create stream + pull consumer. Publish 5 messages. Pull next batch (5) → receive all 5 with correct data.

  7. Consumer_AckExplicit: Create stream + consumer with explicit ack. Publish message. Pull, ack it. Pull again → no more messages (not redelivered).

  8. Consumer_ListAndDelete: Create stream + 2 consumers. List consumers → 2 present. Delete one. List → 1 remaining.

  9. Retention_LimitsMaxMessages: Create stream with MaxMsgs: 10. Publish 15 messages. Stream info → state.messages == 10, first seq is 6.

  10. Retention_MaxAge: Create stream with MaxAge: TimeSpan.FromSeconds(2). Publish messages. Wait 3s. Stream info → state.messages == 0.

Each test uses unique stream/subject names to avoid interference (tests share one JetStream server).
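
The expected values in test 9 follow from dropping the oldest message whenever the cap is exceeded: 15 published with a cap of 10 leaves sequences 6 through 15. The toy model below only illustrates that arithmetic; SimulateMaxMsgs is a hypothetical name and does not touch the server.

```csharp
using System;
using System.Collections.Generic;

// Toy model of a max-message cap: the oldest sequence is dropped once the cap is exceeded.
static (int Count, ulong FirstSeq) SimulateMaxMsgs(int maxMsgs, int published)
{
    var seqs = new Queue<ulong>();
    for (ulong seq = 1; seq <= (ulong)published; seq++)
    {
        seqs.Enqueue(seq);
        if (seqs.Count > maxMsgs) seqs.Dequeue();
    }
    return (seqs.Count, seqs.Count > 0 ? seqs.Peek() : 0);
}

var (count, firstSeq) = SimulateMaxMsgs(maxMsgs: 10, published: 15);
Console.WriteLine($"{count} {firstSeq}"); // 10 6
```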

Verify: dotnet test tests/NATS.E2E.Tests → all 49 tests pass (39 + 10).


Batch 7: Final Verification

Step 16: Full build and test run

Commands:

dotnet build
dotnet test tests/NATS.E2E.Tests -v normal

Success criteria: Solution builds clean, all 49 tests pass (3 original + 46 new).


File Summary

| File | Action | Batch |
| --- | --- | --- |
| Directory.Packages.props | edit | 1 |
| NATS.E2E.Tests.csproj | edit | 1 |
| Infrastructure/NatsServerProcess.cs | edit | 1 |
| Infrastructure/E2ETestHelper.cs | new | 1 |
| Infrastructure/NatsServerFixture.cs | edit (remove collection def) | 1 |
| Infrastructure/Collections.cs | new | 1 |
| CoreMessagingTests.cs | new | 2 |
| Infrastructure/AuthServerFixture.cs | new | 3 |
| AuthTests.cs | new | 3 |
| Infrastructure/MonitorServerFixture.cs | new | 4 |
| MonitoringTests.cs | new | 4 |
| Infrastructure/TlsServerFixture.cs | new | 5 |
| Infrastructure/AccountServerFixture.cs | new | 5 |
| TlsTests.cs | new | 5 |
| AccountIsolationTests.cs | new | 5 |
| Infrastructure/JetStreamServerFixture.cs | new | 6 |
| JetStreamTests.cs | new | 6 |

Agent Model Guidance

  • Batch 1 (infrastructure): Opus — involves modifying existing code carefully
  • Batches 2-6 (test phases): Sonnet — straightforward test implementation from spec
  • Batch 7 (verify): Either — just running commands
  • Parallel agents within Batch 5: Steps 10-11 (fixtures) can run in parallel, Steps 12-13 (tests) can run in parallel