feat: add resilient reconnect and catch-up replay

Joseph Doherty
2026-03-17 11:04:19 -04:00
parent c278f98496
commit 2f04ec9d1d
29 changed files with 3746 additions and 95 deletions


@@ -0,0 +1,583 @@
# Catch-Up Replay And Advanced Retry Implementation Plan
> **For Codex:** REQUIRED SUB-SKILL: Use `executeplan` to implement this plan task-by-task.
**Goal:** Add best-effort latest-value catch-up after reconnect and replace the fixed reconnect delay schedule with a production-grade retry policy, while also fixing the current reconnect quality issues.
**Architecture:** Extend the existing reconnect runtime with a small runtime-options layer, a retry-policy calculator, and a post-reconnect catch-up refresh phase. Keep reconnect success defined as restored live subscriptions, and treat catch-up as a best-effort follow-on phase that emits synthetic updates marked separately from live traffic.
**Tech Stack:** .NET 10, C#, xUnit, existing SuiteLink protocol/client/runtime/transport layers
---
### Task 1: Add Runtime Option Types
**Files:**
- Create: `/Users/dohertj2/Desktop/suitelinkclient/src/SuiteLink.Client/SuiteLinkRuntimeOptions.cs`
- Create: `/Users/dohertj2/Desktop/suitelinkclient/src/SuiteLink.Client/SuiteLinkRetryPolicy.cs`
- Create: `/Users/dohertj2/Desktop/suitelinkclient/src/SuiteLink.Client/SuiteLinkCatchUpPolicy.cs`
- Modify: `/Users/dohertj2/Desktop/suitelinkclient/src/SuiteLink.Client/SuiteLinkConnectionOptions.cs`
- Test: `/Users/dohertj2/Desktop/suitelinkclient/tests/SuiteLink.Client.Tests/SuiteLinkConnectionOptionsTests.cs`
**Step 1: Write the failing test**
```csharp
[Fact]
public void ConnectionOptions_DefaultsRuntimeOptions()
{
    var options = new SuiteLinkConnectionOptions(
        host: "127.0.0.1",
        application: "App",
        topic: "Topic",
        clientName: "Client",
        clientNode: "Node",
        userName: "User",
        serverNode: "Server");
    Assert.NotNull(options.Runtime);
    Assert.Equal(SuiteLinkCatchUpPolicy.None, options.Runtime.CatchUpPolicy);
    Assert.NotNull(options.Runtime.RetryPolicy);
}
```
**Step 2: Run test to verify it fails**
Run: `dotnet test /Users/dohertj2/Desktop/suitelinkclient/SuiteLink.Client.slnx --filter ConnectionOptions_DefaultsRuntimeOptions -v minimal`
Expected: FAIL because runtime options do not exist yet
**Step 3: Write minimal implementation**
Create:
```csharp
public enum SuiteLinkCatchUpPolicy
{
    None = 0,
    RefreshLatestValue = 1
}
```
```csharp
public sealed record class SuiteLinkRetryPolicy(
    TimeSpan InitialDelay,
    double Multiplier,
    TimeSpan MaxDelay,
    int? MaxAttempts = null,
    bool UseJitter = true)
{
    public static SuiteLinkRetryPolicy Default { get; } =
        new(TimeSpan.FromSeconds(1), 2.0, TimeSpan.FromSeconds(30));
}
```
```csharp
public sealed record class SuiteLinkRuntimeOptions(
    SuiteLinkRetryPolicy RetryPolicy,
    SuiteLinkCatchUpPolicy CatchUpPolicy,
    TimeSpan CatchUpTimeout)
{
    public static SuiteLinkRuntimeOptions Default { get; } =
        new(SuiteLinkRetryPolicy.Default, SuiteLinkCatchUpPolicy.None, TimeSpan.FromSeconds(2));
}
```
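Update `SuiteLinkConnectionOptions` to expose:
```csharp
public SuiteLinkRuntimeOptions Runtime { get; init; } = SuiteLinkRuntimeOptions.Default;
```
The initializer defaults it to `SuiteLinkRuntimeOptions.Default`. The `init` accessor is what lets the `with { Runtime = ... }` expression in the Task 3 test compile, assuming `SuiteLinkConnectionOptions` is a record.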
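As a rough illustration of how a caller might opt in to catch-up once these types exist, the sketch below starts from the defaults and overrides two settings. The record shapes are copied from this step; the `RuntimeOptionsExamples` helper and the 10-second cap are hypothetical and only there for illustration.

```csharp
using System;

public enum SuiteLinkCatchUpPolicy { None = 0, RefreshLatestValue = 1 }

public sealed record class SuiteLinkRetryPolicy(
    TimeSpan InitialDelay,
    double Multiplier,
    TimeSpan MaxDelay,
    int? MaxAttempts = null,
    bool UseJitter = true)
{
    public static SuiteLinkRetryPolicy Default { get; } =
        new(TimeSpan.FromSeconds(1), 2.0, TimeSpan.FromSeconds(30));
}

public sealed record class SuiteLinkRuntimeOptions(
    SuiteLinkRetryPolicy RetryPolicy,
    SuiteLinkCatchUpPolicy CatchUpPolicy,
    TimeSpan CatchUpTimeout)
{
    public static SuiteLinkRuntimeOptions Default { get; } =
        new(SuiteLinkRetryPolicy.Default, SuiteLinkCatchUpPolicy.None, TimeSpan.FromSeconds(2));
}

// Hypothetical helper, not part of the plan.
public static class RuntimeOptionsExamples
{
    // Positional records expose init-only properties, so `with` can override
    // individual settings while keeping every other default.
    public static SuiteLinkRuntimeOptions WithCatchUp() =>
        SuiteLinkRuntimeOptions.Default with
        {
            CatchUpPolicy = SuiteLinkCatchUpPolicy.RefreshLatestValue,
            RetryPolicy = SuiteLinkRetryPolicy.Default with { MaxDelay = TimeSpan.FromSeconds(10) }
        };
}
```

Because both option types are records, the shared `Default` instances are never mutated; each `with` expression produces a new value.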
**Step 4: Run test to verify it passes**
Run: `dotnet test /Users/dohertj2/Desktop/suitelinkclient/SuiteLink.Client.slnx --filter SuiteLinkConnectionOptionsTests -v minimal`
Expected: PASS
**Step 5: Commit**
```bash
git add /Users/dohertj2/Desktop/suitelinkclient/src/SuiteLink.Client/SuiteLinkRuntimeOptions.cs /Users/dohertj2/Desktop/suitelinkclient/src/SuiteLink.Client/SuiteLinkRetryPolicy.cs /Users/dohertj2/Desktop/suitelinkclient/src/SuiteLink.Client/SuiteLinkCatchUpPolicy.cs /Users/dohertj2/Desktop/suitelinkclient/src/SuiteLink.Client/SuiteLinkConnectionOptions.cs /Users/dohertj2/Desktop/suitelinkclient/tests/SuiteLink.Client.Tests/SuiteLinkConnectionOptionsTests.cs
git commit -m "feat: add runtime reconnect option types"
```
### Task 2: Add Retry Policy Delay Calculator
**Files:**
- Create: `/Users/dohertj2/Desktop/suitelinkclient/src/SuiteLink.Client/Internal/SuiteLinkRetryDelayCalculator.cs`
- Test: `/Users/dohertj2/Desktop/suitelinkclient/tests/SuiteLink.Client.Tests/Internal/SuiteLinkRetryDelayCalculatorTests.cs`
**Step 1: Write the failing test**
```csharp
[Fact]
public void GetDelay_UsesImmediateThenExponentialCap()
{
    var policy = new SuiteLinkRetryPolicy(
        InitialDelay: TimeSpan.FromSeconds(1),
        Multiplier: 2.0,
        MaxDelay: TimeSpan.FromSeconds(30),
        UseJitter: false);
    Assert.Equal(TimeSpan.Zero, SuiteLinkRetryDelayCalculator.GetDelay(policy, 0));
    Assert.Equal(TimeSpan.FromSeconds(1), SuiteLinkRetryDelayCalculator.GetDelay(policy, 1));
    Assert.Equal(TimeSpan.FromSeconds(2), SuiteLinkRetryDelayCalculator.GetDelay(policy, 2));
    Assert.Equal(TimeSpan.FromSeconds(4), SuiteLinkRetryDelayCalculator.GetDelay(policy, 3));
}
```
**Step 2: Run test to verify it fails**
Run: `dotnet test /Users/dohertj2/Desktop/suitelinkclient/SuiteLink.Client.slnx --filter SuiteLinkRetryDelayCalculatorTests -v minimal`
Expected: FAIL because calculator does not exist yet
**Step 3: Write minimal implementation**
Create:
```csharp
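internal static class SuiteLinkRetryDelayCalculator
{
    public static TimeSpan GetDelay(SuiteLinkRetryPolicy policy, int attempt)
    {
        // Attempt 0 (and any non-positive index) retries immediately.
        if (attempt <= 0)
        {
            return TimeSpan.Zero;
        }
        // Exponential growth from InitialDelay. The cap is applied to the raw seconds
        // first, so a very large power never reaches TimeSpan.FromSeconds.
        var rawSeconds = policy.InitialDelay.TotalSeconds * Math.Pow(policy.Multiplier, attempt - 1);
        var bounded = TimeSpan.FromSeconds(Math.Min(rawSeconds, policy.MaxDelay.TotalSeconds));
        return bounded;
    }
}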
```
Do not add jitter yet beyond the policy flag unless tests require it.
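To make the resulting schedule concrete, the following sketch tabulates the delays for a run of attempts. `RetryScheduleSketch` is a hypothetical helper that takes the policy values as plain parameters; its formula mirrors the calculator above.

```csharp
using System;
using System.Linq;

// Hypothetical helper for illustration; not part of the plan.
public static class RetryScheduleSketch
{
    // Returns the delay in seconds for attempts 0 .. attempts-1.
    public static double[] DelaysInSeconds(
        TimeSpan initialDelay, double multiplier, TimeSpan maxDelay, int attempts)
    {
        return Enumerable.Range(0, attempts)
            .Select(attempt => attempt == 0
                ? 0.0 // first retry is immediate
                : Math.Min(
                    initialDelay.TotalSeconds * Math.Pow(multiplier, attempt - 1),
                    maxDelay.TotalSeconds)) // cap instead of growing without bound
            .ToArray();
    }
}
```

With the default policy values (1 second initial delay, multiplier 2.0, 30 second cap), the first eight attempts wait 0, 1, 2, 4, 8, 16, 30, and 30 seconds.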
**Step 4: Run test to verify it passes**
Run: `dotnet test /Users/dohertj2/Desktop/suitelinkclient/SuiteLink.Client.slnx --filter SuiteLinkRetryDelayCalculatorTests -v minimal`
Expected: PASS
**Step 5: Commit**
```bash
git add /Users/dohertj2/Desktop/suitelinkclient/src/SuiteLink.Client/Internal/SuiteLinkRetryDelayCalculator.cs /Users/dohertj2/Desktop/suitelinkclient/tests/SuiteLink.Client.Tests/Internal/SuiteLinkRetryDelayCalculatorTests.cs
git commit -m "feat: add reconnect retry delay calculator"
```
### Task 3: Wire Retry Policy Into Reconnect Runtime
**Files:**
- Modify: `/Users/dohertj2/Desktop/suitelinkclient/src/SuiteLink.Client/SuiteLinkClient.cs`
- Modify: `/Users/dohertj2/Desktop/suitelinkclient/tests/SuiteLink.Client.Tests/SuiteLinkClientReconnectTests.cs`
**Step 1: Write the failing test**
```csharp
[Fact]
public async Task Reconnect_UsesConfiguredRetryPolicy()
{
    var observed = new List<TimeSpan>();
    var options = CreateOptions() with
    {
        Runtime = new SuiteLinkRuntimeOptions(
            new SuiteLinkRetryPolicy(TimeSpan.FromSeconds(3), 3.0, TimeSpan.FromSeconds(20), UseJitter: false),
            SuiteLinkCatchUpPolicy.None,
            TimeSpan.FromSeconds(2))
    };
    var client = CreateReconnectClient(delayAsync: (delay, _) =>
    {
        observed.Add(delay);
        return Task.CompletedTask;
    });
    await client.ConnectAsync(options);
    await EventuallyReconnectAsync(client);
    Assert.Contains(TimeSpan.FromSeconds(3), observed);
}
```
**Step 2: Run test to verify it fails**
Run: `dotnet test /Users/dohertj2/Desktop/suitelinkclient/SuiteLink.Client.slnx --filter Reconnect_UsesConfiguredRetryPolicy -v minimal`
Expected: FAIL because reconnect still uses a fixed schedule
**Step 3: Write minimal implementation**
In `SuiteLinkClient`:
- remove direct use of `ReconnectDelaySchedule`
- read retry policy from `_connectionOptions!.Runtime.RetryPolicy`
- use `SuiteLinkRetryDelayCalculator.GetDelay(policy, attempt)`
Keep the current injected `_delayAsync` test seam.
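The wiring described in this step could look roughly like the sketch below. It is not the actual `SuiteLinkClient` code: `ReconnectLoopSketch` and its delegate parameters are hypothetical stand-ins for the calculator, the `_delayAsync` seam, and the reconnect attempt. Treating `MaxAttempts` as a hard limit on loop iterations is also an assumption.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for the reconnect loop inside SuiteLinkClient.
public static class ReconnectLoopSketch
{
    // Returns the index of the attempt that succeeded, or -1 when attempts run out.
    public static async Task<int> RunAsync(
        Func<int, TimeSpan> getDelay,                          // e.g. attempt => calculator.GetDelay(policy, attempt)
        Func<TimeSpan, CancellationToken, Task> delayAsync,     // injected delay seam used by tests
        Func<CancellationToken, Task<bool>> tryReconnectAsync,  // true once the session is restored
        int? maxAttempts,
        CancellationToken cancellationToken)
    {
        for (var attempt = 0; maxAttempts is null || attempt < maxAttempts; attempt++)
        {
            cancellationToken.ThrowIfCancellationRequested();

            // The delay now comes from the policy rather than a fixed schedule.
            var delay = getDelay(attempt);
            if (delay > TimeSpan.Zero)
            {
                await delayAsync(delay, cancellationToken).ConfigureAwait(false);
            }

            if (await tryReconnectAsync(cancellationToken).ConfigureAwait(false))
            {
                return attempt;
            }
        }

        return -1;
    }
}
```

Routing every wait through the injected delegate is what allows a test to record the requested delays without actually sleeping.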
**Step 4: Run test to verify it passes**
Run: `dotnet test /Users/dohertj2/Desktop/suitelinkclient/SuiteLink.Client.slnx --filter SuiteLinkClientReconnectTests -v minimal`
Expected: PASS
**Step 5: Commit**
```bash
git add /Users/dohertj2/Desktop/suitelinkclient/src/SuiteLink.Client/SuiteLinkClient.cs /Users/dohertj2/Desktop/suitelinkclient/tests/SuiteLink.Client.Tests/SuiteLinkClientReconnectTests.cs
git commit -m "feat: apply retry policy to reconnect runtime"
```
### Task 4: Fix Fast-Fail Writes During Reconnect
**Files:**
- Modify: `/Users/dohertj2/Desktop/suitelinkclient/src/SuiteLink.Client/SuiteLinkClient.cs`
- Modify: `/Users/dohertj2/Desktop/suitelinkclient/tests/SuiteLink.Client.Tests/SuiteLinkClientWriteTests.cs`
**Step 1: Write the failing test**
```csharp
[Fact]
public async Task WriteAsync_DuringReconnect_ThrowsBeforeWaitingOnOperationGate()
{
    var client = CreateClientWithBlockedOperationGateAndReconnectState();
    var ex = await Assert.ThrowsAsync<InvalidOperationException>(
        () => client.WriteAsync("Pump001.Run", SuiteLinkValue.FromBoolean(true)));
    Assert.Contains("reconnecting", ex.Message, StringComparison.OrdinalIgnoreCase);
}
```
**Step 2: Run test to verify it fails**
Run: `dotnet test /Users/dohertj2/Desktop/suitelinkclient/SuiteLink.Client.slnx --filter WriteAsync_DuringReconnect_ThrowsBeforeWaitingOnOperationGate -v minimal`
Expected: FAIL because `WriteAsync` currently waits on `_operationGate` first
**Step 3: Write minimal implementation**
Move the reconnect state check ahead of:
```csharp
await _operationGate.WaitAsync(...)
```
while keeping disposed-state checks intact.
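The reordering can be sketched in miniature as follows. Only `_operationGate` comes from the plan; `WriteGateSketch`, its flags, and its helper methods are hypothetical and reduce the client to the ordering of the checks.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical miniature of the write path; not the actual SuiteLinkClient.
public sealed class WriteGateSketch
{
    private readonly SemaphoreSlim _operationGate = new(1, 1);
    private volatile bool _reconnecting;
    private volatile bool _disposed;

    public void SetReconnecting(bool value) => _reconnecting = value;
    public void MarkDisposed() => _disposed = true;

    // Lets a test simulate another operation holding the gate.
    public Task HoldGateAsync() => _operationGate.WaitAsync();

    public async Task WriteAsync(Func<Task> sendAsync, CancellationToken cancellationToken = default)
    {
        // The disposed-state check stays first.
        ObjectDisposedException.ThrowIf(_disposed, this);

        // The reconnect check runs before the gate, so a blocked gate cannot delay the failure.
        ThrowIfReconnecting();

        await _operationGate.WaitAsync(cancellationToken).ConfigureAwait(false);
        try
        {
            // State may have changed while waiting, so check again under the gate.
            ThrowIfReconnecting();
            await sendAsync().ConfigureAwait(false);
        }
        finally
        {
            _operationGate.Release();
        }
    }

    private void ThrowIfReconnecting()
    {
        if (_reconnecting)
        {
            throw new InvalidOperationException("Client is reconnecting; writes are rejected until the session is ready.");
        }
    }
}
```

The second check under the gate is an addition beyond what the step requires; it covers the case where a reconnect begins while the caller is still waiting.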
**Step 4: Run test to verify it passes**
Run: `dotnet test /Users/dohertj2/Desktop/suitelinkclient/SuiteLink.Client.slnx --filter SuiteLinkClientWriteTests -v minimal`
Expected: PASS
**Step 5: Commit**
```bash
git add /Users/dohertj2/Desktop/suitelinkclient/src/SuiteLink.Client/SuiteLinkClient.cs /Users/dohertj2/Desktop/suitelinkclient/tests/SuiteLink.Client.Tests/SuiteLinkClientWriteTests.cs
git commit -m "fix: fail writes before reconnect gate contention"
```
### Task 5: Fix Transport Reset Ownership Semantics
**Files:**
- Modify: `/Users/dohertj2/Desktop/suitelinkclient/src/SuiteLink.Client/Transport/SuiteLinkTcpTransport.cs`
- Modify: `/Users/dohertj2/Desktop/suitelinkclient/src/SuiteLink.Client/Transport/ISuiteLinkReconnectableTransport.cs`
- Modify: `/Users/dohertj2/Desktop/suitelinkclient/tests/SuiteLink.Client.Tests/Transport/SuiteLinkTcpTransportTests.cs`
**Step 1: Write the failing test**
```csharp
[Fact]
public async Task ResetConnectionAsync_LeaveOpenTrue_DoesNotDisposeInjectedStream()
{
    var stream = new TrackingStream();
    await using var transport = new SuiteLinkTcpTransport(stream, leaveOpen: true);
    await transport.ResetConnectionAsync();
    Assert.False(stream.WasDisposed);
}
```
**Step 2: Run test to verify it fails**
Run: `dotnet test /Users/dohertj2/Desktop/suitelinkclient/SuiteLink.Client.slnx --filter ResetConnectionAsync_LeaveOpenTrue_DoesNotDisposeInjectedStream -v minimal`
Expected: FAIL because reset currently disposes caller-owned resources
**Step 3: Write minimal implementation**
Update `ResetConnectionAsync` to respect the same ownership rule as `DisposeAsync`:
- if `leaveOpen` is `true`, detach without disposing injected resources
- if `leaveOpen` is `false`, dispose detached resources
Do not broaden interface scope unnecessarily.
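One way to keep reset and dispose from drifting apart is to route both through a single detach path. The sketch below reduces the transport to that rule; `OwnedStreamSketch` is hypothetical, and the real `SuiteLinkTcpTransport` presumably owns more than a single stream.

```csharp
using System;
using System.IO;
using System.Threading.Tasks;

// Hypothetical reduction of the transport to its ownership rule.
public sealed class OwnedStreamSketch : IAsyncDisposable
{
    private readonly bool _leaveOpen;
    private Stream? _stream;

    public OwnedStreamSketch(Stream stream, bool leaveOpen)
    {
        _stream = stream;
        _leaveOpen = leaveOpen;
    }

    public bool HasStream => _stream is not null;

    // Reset and dispose share one path, so they apply the same ownership rule.
    public ValueTask ResetConnectionAsync() => DetachAsync();

    public ValueTask DisposeAsync() => DetachAsync();

    private async ValueTask DetachAsync()
    {
        var stream = _stream;
        _stream = null; // always detach, regardless of ownership

        if (stream is not null && !_leaveOpen)
        {
            // Dispose only resources this instance owns.
            await stream.DisposeAsync().ConfigureAwait(false);
        }
    }
}
```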
**Step 4: Run test to verify it passes**
Run: `dotnet test /Users/dohertj2/Desktop/suitelinkclient/SuiteLink.Client.slnx --filter SuiteLinkTcpTransportTests -v minimal`
Expected: PASS
**Step 5: Commit**
```bash
git add /Users/dohertj2/Desktop/suitelinkclient/src/SuiteLink.Client/Transport/SuiteLinkTcpTransport.cs /Users/dohertj2/Desktop/suitelinkclient/src/SuiteLink.Client/Transport/ISuiteLinkReconnectableTransport.cs /Users/dohertj2/Desktop/suitelinkclient/tests/SuiteLink.Client.Tests/Transport/SuiteLinkTcpTransportTests.cs
git commit -m "fix: preserve transport ownership during reconnect reset"
```
### Task 6: Add Update Source Metadata
**Files:**
- Create: `/Users/dohertj2/Desktop/suitelinkclient/src/SuiteLink.Client/SuiteLinkUpdateSource.cs`
- Modify: `/Users/dohertj2/Desktop/suitelinkclient/src/SuiteLink.Client/SuiteLinkTagUpdate.cs`
- Test: `/Users/dohertj2/Desktop/suitelinkclient/tests/SuiteLink.Client.Tests/SuiteLinkValueTests.cs`
**Step 1: Write the failing test**
```csharp
[Fact]
public void TagUpdate_DefaultSource_IsLive()
{
    var update = new SuiteLinkTagUpdate(
        "Pump001.Run",
        1,
        SuiteLinkValue.FromBoolean(true),
        0x00C0,
        1,
        DateTimeOffset.UtcNow);
    Assert.Equal(SuiteLinkUpdateSource.Live, update.Source);
}
```
**Step 2: Run test to verify it fails**
Run: `dotnet test /Users/dohertj2/Desktop/suitelinkclient/SuiteLink.Client.slnx --filter TagUpdate_DefaultSource_IsLive -v minimal`
Expected: FAIL because source metadata does not exist
**Step 3: Write minimal implementation**
Create:
```csharp
public enum SuiteLinkUpdateSource
{
    Live = 0,
    CatchUpReplay = 1
}
```
Add `Source` to `SuiteLinkTagUpdate` with default:
```csharp
SuiteLinkUpdateSource.Live
```
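A trailing optional parameter is one way to add `Source` without touching existing construction sites. The sketch below uses a deliberately simplified, hypothetical `TagUpdateSketch` record; the real `SuiteLinkTagUpdate` carries more fields than shown here.

```csharp
using System;

public enum SuiteLinkUpdateSource { Live = 0, CatchUpReplay = 1 }

// Hypothetical, simplified shape used only to illustrate the defaulted parameter.
public sealed record TagUpdateSketch(
    string ItemName,
    bool Value,
    DateTimeOffset ReceivedAtUtc,
    SuiteLinkUpdateSource Source = SuiteLinkUpdateSource.Live);

public static class TagUpdateSketchExamples
{
    // Catch-up code can re-tag a freshly read value without rebuilding it field by field.
    public static TagUpdateSketch AsCatchUp(TagUpdateSketch update) =>
        update with { Source = SuiteLinkUpdateSource.CatchUpReplay };
}
```

Callers that construct an update without naming `Source` keep compiling and observe `Live`, which is the behavior the test in Step 1 expects.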
**Step 4: Run test to verify it passes**
Run: `dotnet test /Users/dohertj2/Desktop/suitelinkclient/SuiteLink.Client.slnx --filter SuiteLinkValueTests -v minimal`
Expected: PASS
**Step 5: Commit**
```bash
git add /Users/dohertj2/Desktop/suitelinkclient/src/SuiteLink.Client/SuiteLinkUpdateSource.cs /Users/dohertj2/Desktop/suitelinkclient/src/SuiteLink.Client/SuiteLinkTagUpdate.cs /Users/dohertj2/Desktop/suitelinkclient/tests/SuiteLink.Client.Tests/SuiteLinkValueTests.cs
git commit -m "feat: add update source metadata"
```
### Task 7: Add Best-Effort Catch-Up Refresh Execution
**Files:**
- Modify: `/Users/dohertj2/Desktop/suitelinkclient/src/SuiteLink.Client/SuiteLinkClient.cs`
- Modify: `/Users/dohertj2/Desktop/suitelinkclient/src/SuiteLink.Client/Internal/SubscriptionRegistrationEntry.cs`
- Modify: `/Users/dohertj2/Desktop/suitelinkclient/tests/SuiteLink.Client.Tests/SuiteLinkClientReconnectTests.cs`
**Step 1: Write the failing test**
```csharp
[Fact]
public async Task Reconnect_WithRefreshLatestValue_CanDispatchCatchUpReplay()
{
    SuiteLinkTagUpdate? catchUp = null;
    var client = CreateReconnectReplayClient(
        catchUpPolicy: SuiteLinkCatchUpPolicy.RefreshLatestValue,
        onUpdate: update =>
        {
            if (update.Source == SuiteLinkUpdateSource.CatchUpReplay)
            {
                catchUp = update;
            }
        });
    await client.ConnectAsync(CreateOptionsWithCatchUp());
    // Catch-up runs on the background reconnect path, so poll instead of asserting immediately.
    await Eventually.AssertAsync(() => Assert.NotNull(catchUp));
    Assert.Equal(SuiteLinkUpdateSource.CatchUpReplay, catchUp!.Source);
}
```
**Step 2: Run test to verify it fails**
Run: `dotnet test /Users/dohertj2/Desktop/suitelinkclient/SuiteLink.Client.slnx --filter Reconnect_WithRefreshLatestValue_CanDispatchCatchUpReplay -v minimal`
Expected: FAIL because reconnect only resumes live dispatch today
**Step 3: Write minimal implementation**
After successful reconnect and durable subscription replay:
- if `Runtime.CatchUpPolicy == SuiteLinkCatchUpPolicy.RefreshLatestValue`
- run a sequential refresh pass over durable subscriptions
- obtain one fresh value per item using existing temporary-read machinery or a dedicated internal refresh path
- dispatch synthetic updates with:
```csharp
Source: SuiteLinkUpdateSource.CatchUpReplay
```
Do not fail reconnect if one item refresh fails or times out.
**Step 4: Run test to verify it passes**
Run: `dotnet test /Users/dohertj2/Desktop/suitelinkclient/SuiteLink.Client.slnx --filter SuiteLinkClientReconnectTests -v minimal`
Expected: PASS
**Step 5: Commit**
```bash
git add /Users/dohertj2/Desktop/suitelinkclient/src/SuiteLink.Client/SuiteLinkClient.cs /Users/dohertj2/Desktop/suitelinkclient/src/SuiteLink.Client/Internal/SubscriptionRegistrationEntry.cs /Users/dohertj2/Desktop/suitelinkclient/tests/SuiteLink.Client.Tests/SuiteLinkClientReconnectTests.cs
git commit -m "feat: add reconnect catch-up refresh replay"
```
### Task 8: Make Catch-Up Partial Failure Non-Fatal
**Files:**
- Modify: `/Users/dohertj2/Desktop/suitelinkclient/src/SuiteLink.Client/SuiteLinkClient.cs`
- Modify: `/Users/dohertj2/Desktop/suitelinkclient/tests/SuiteLink.Client.Tests/SuiteLinkClientReconnectTests.cs`
**Step 1: Write the failing test**
```csharp
[Fact]
public async Task Reconnect_CatchUpTimeout_DoesNotFailRecoveredSubscriptions()
{
    var client = CreateReconnectReplayClientWithTimedOutRefresh();
    await client.ConnectAsync(CreateOptionsWithCatchUp());
    await Eventually.AssertAsync(() => Assert.True(client.IsConnected));
}
```
**Step 2: Run test to verify it fails**
Run: `dotnet test /Users/dohertj2/Desktop/suitelinkclient/SuiteLink.Client.slnx --filter Reconnect_CatchUpTimeout_DoesNotFailRecoveredSubscriptions -v minimal`
Expected: FAIL if catch-up failure tears down reconnect
**Step 3: Write minimal implementation**
Wrap each refresh item independently:
- timeout per item from `Runtime.CatchUpTimeout`
- swallow each per-item failure, optionally recording an internal debug signal first
- continue to remaining items
Do not change the recovered `Ready`/`Subscribed` state.
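The per-item isolation described above might be structured like this sketch. `CatchUpPassSketch` and its `refreshAsync` delegate are hypothetical; the delegate stands in for whichever internal read path is chosen and is assumed to honor the cancellation token it receives.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical catch-up pass; not the actual SuiteLinkClient implementation.
public static class CatchUpPassSketch
{
    // Returns how many items were refreshed successfully.
    public static async Task<int> RunAsync(
        IReadOnlyList<string> itemNames,
        Func<string, CancellationToken, Task> refreshAsync,
        TimeSpan perItemTimeout,
        CancellationToken shutdownToken)
    {
        var refreshed = 0;
        foreach (var itemName in itemNames)
        {
            // Each item gets its own timeout, linked to shutdown so disposal still stops the pass.
            using var cts = CancellationTokenSource.CreateLinkedTokenSource(shutdownToken);
            cts.CancelAfter(perItemTimeout);
            try
            {
                await refreshAsync(itemName, cts.Token).ConfigureAwait(false);
                refreshed++;
            }
            catch (OperationCanceledException) when (shutdownToken.IsCancellationRequested)
            {
                throw; // real shutdown: stop the whole pass
            }
            catch (Exception)
            {
                // Best effort: a timeout or failure for one item must not affect the rest.
                // An internal debug signal could be recorded here.
            }
        }

        return refreshed;
    }
}
```

Nothing in the loop touches connection state, which is how a failed refresh leaves the recovered `Ready` state untouched.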
**Step 4: Run test to verify it passes**
Run: `dotnet test /Users/dohertj2/Desktop/suitelinkclient/SuiteLink.Client.slnx --filter SuiteLinkClientReconnectTests -v minimal`
Expected: PASS
**Step 5: Commit**
```bash
git add /Users/dohertj2/Desktop/suitelinkclient/src/SuiteLink.Client/SuiteLinkClient.cs /Users/dohertj2/Desktop/suitelinkclient/tests/SuiteLink.Client.Tests/SuiteLinkClientReconnectTests.cs
git commit -m "feat: tolerate partial catch-up refresh failures"
```
### Task 9: Add Jitter Coverage Without Flaky Tests
**Files:**
- Modify: `/Users/dohertj2/Desktop/suitelinkclient/src/SuiteLink.Client/Internal/SuiteLinkRetryDelayCalculator.cs`
- Modify: `/Users/dohertj2/Desktop/suitelinkclient/tests/SuiteLink.Client.Tests/Internal/SuiteLinkRetryDelayCalculatorTests.cs`
**Step 1: Write the failing test**
```csharp
[Fact]
public void GetDelay_WithJitterEnabled_StaysWithinCap()
{
    var policy = new SuiteLinkRetryPolicy(
        InitialDelay: TimeSpan.FromSeconds(2),
        Multiplier: 2.0,
        MaxDelay: TimeSpan.FromSeconds(10),
        UseJitter: true);
    var delay = SuiteLinkRetryDelayCalculator.GetDelay(policy, 3, () => 0.5);
    Assert.InRange(delay, TimeSpan.Zero, TimeSpan.FromSeconds(10));
}
```
**Step 2: Run test to verify it fails**
Run: `dotnet test /Users/dohertj2/Desktop/suitelinkclient/SuiteLink.Client.slnx --filter SuiteLinkRetryDelayCalculatorTests -v minimal`
Expected: FAIL because jitter injection does not exist yet
**Step 3: Write minimal implementation**
Extend `GetDelay` with an optional injected random source, so existing two-argument calls keep compiling:
```csharp
public static TimeSpan GetDelay(SuiteLinkRetryPolicy policy, int attempt, Func<double>? nextDouble = null)
```
When jitter is enabled:
- compute bounded base delay
- apply the random source to the base delay (tests inject a deterministic value)
- keep final value within `[0, MaxDelay]`
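The plan leaves the exact jitter formula open. The sketch below picks one common option as an assumption: scaling the capped base delay into the range from 50% to 100% of its value. `JitteredDelaySketch` is hypothetical and takes the policy values as plain parameters to stay self-contained.

```csharp
using System;

// Hypothetical jittered variant of the delay calculator.
public static class JitteredDelaySketch
{
    public static TimeSpan GetDelay(
        TimeSpan initialDelay, double multiplier, TimeSpan maxDelay,
        bool useJitter, int attempt, Func<double>? nextDouble = null)
    {
        if (attempt <= 0)
        {
            return TimeSpan.Zero;
        }

        // Bounded base delay, as in the non-jittered calculator.
        var rawSeconds = initialDelay.TotalSeconds * Math.Pow(multiplier, attempt - 1);
        var baseSeconds = Math.Min(rawSeconds, maxDelay.TotalSeconds);
        if (!useJitter)
        {
            return TimeSpan.FromSeconds(baseSeconds);
        }

        // Tests inject a fixed value; production falls back to the shared generator.
        var sample = nextDouble is not null ? nextDouble() : Random.Shared.NextDouble();
        var random = Math.Clamp(sample, 0.0, 1.0);

        // Assumption: jitter scales into [50%, 100%] of the base delay,
        // so the result can never exceed MaxDelay.
        return TimeSpan.FromSeconds(baseSeconds * (0.5 + 0.5 * random));
    }
}
```

With a 2 second initial delay, multiplier 2.0, 10 second cap, attempt 3, and an injected value of 0.5, the base delay is 8 seconds and the jittered result is 6 seconds.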
**Step 4: Run test to verify it passes**
Run: `dotnet test /Users/dohertj2/Desktop/suitelinkclient/SuiteLink.Client.slnx --filter SuiteLinkRetryDelayCalculatorTests -v minimal`
Expected: PASS
**Step 5: Commit**
```bash
git add /Users/dohertj2/Desktop/suitelinkclient/src/SuiteLink.Client/Internal/SuiteLinkRetryDelayCalculator.cs /Users/dohertj2/Desktop/suitelinkclient/tests/SuiteLink.Client.Tests/Internal/SuiteLinkRetryDelayCalculatorTests.cs
git commit -m "feat: add deterministic jitter coverage for retry policy"
```
### Task 10: Update Documentation And Final Verification
**Files:**
- Modify: `/Users/dohertj2/Desktop/suitelinkclient/README.md`
- Modify: `/Users/dohertj2/Desktop/suitelinkclient/tests/SuiteLink.Client.IntegrationTests/README.md`
- Modify: `/Users/dohertj2/Desktop/suitelinkclient/docs/plans/2026-03-17-catchup-retry-design.md`
- Modify: `/Users/dohertj2/Desktop/suitelinkclient/docs/plans/2026-03-17-catchup-retry-implementation-plan.md`
**Step 1: Write the documentation diff**
Document:
- catch-up mode is latest-value refresh only
- retry policy is configurable and jittered by default
- reconnect success is separate from best-effort catch-up completion
- writes still fail during reconnect
**Step 2: Run targeted verification**
Run: `rg -n "catch-up|retry|reconnect|jitter|refresh latest|reconnecting" /Users/dohertj2/Desktop/suitelinkclient/README.md /Users/dohertj2/Desktop/suitelinkclient/tests/SuiteLink.Client.IntegrationTests/README.md`
Expected: matching lines in both files that show the updated wording
**Step 3: Run full verification**
Run: `dotnet test /Users/dohertj2/Desktop/suitelinkclient/SuiteLink.Client.slnx -v minimal`
Expected: PASS
**Step 4: Commit**
```bash
git add /Users/dohertj2/Desktop/suitelinkclient/README.md /Users/dohertj2/Desktop/suitelinkclient/tests/SuiteLink.Client.IntegrationTests/README.md /Users/dohertj2/Desktop/suitelinkclient/docs/plans/2026-03-17-catchup-retry-design.md /Users/dohertj2/Desktop/suitelinkclient/docs/plans/2026-03-17-catchup-retry-implementation-plan.md
git commit -m "docs: describe catch-up replay and retry policy"
```


@@ -0,0 +1,225 @@
# SuiteLink Runtime Reconnect Design
## Goal
Add a background receive loop and automatic reconnect/recovery to the existing SuiteLink client so subscriptions are restored automatically and update callbacks resume without caller intervention.
## Scope
This design adds:
- a background receive loop owned by `SuiteLinkClient`
- automatic reconnect with bounded retry backoff
- automatic subscription replay after reconnect
- resumed update dispatch after replay
This design does not add:
- write queuing during reconnect
- catch-up replay of missed values
- secure SuiteLink V3/TLS support
- AlarmMgr support
## Runtime Model
The current client uses explicit on-demand inbound processing. The new model shifts normal operation to a managed runtime loop.
There are two categories of state:
- durable desired state
  - configured connection options
  - caller subscription intent
  - callbacks associated with subscribed items
- ephemeral connection state
  - current transport connection
  - current session state
  - current `itemName <-> tagId` mappings
Durable state survives reconnects. Ephemeral state is rebuilt on reconnect.
## Recommended Approach
Implement a supervised background receive loop inside `SuiteLinkClient`.
Behavior:
1. `ConnectAsync` establishes the initial transport/session and starts the receive loop.
2. The receive loop reads frames continuously.
3. Update frames are decoded and dispatched to user callbacks.
4. EOF, transport exceptions, malformed frames, or replay failures trigger recovery.
5. Recovery reconnects with bounded retry delays.
6. After reconnect succeeds, the client replays all current subscriptions and resumes dispatching.
This keeps the public API simple and avoids forcing callers to manually poll `ProcessIncomingAsync`.
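The numbered behavior above can be sketched as a single supervised loop. Everything here is hypothetical: `ReceiveLoopSketch` and its delegates stand in for frame reading, update dispatch, and the reconnect-and-replay path, none of which are shown in this design.

```csharp
using System;
using System.IO;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical supervised receive loop; not the actual SuiteLinkClient code.
public static class ReceiveLoopSketch
{
    public static async Task RunAsync(
        Func<CancellationToken, Task<byte[]?>> readFrameAsync, // returns null on EOF
        Action<byte[]> dispatchUpdate,                         // decode and invoke callbacks
        Func<CancellationToken, Task> recoverAsync,            // reconnect, then replay subscriptions
        CancellationToken shutdownToken)
    {
        while (!shutdownToken.IsCancellationRequested)
        {
            try
            {
                var frame = await readFrameAsync(shutdownToken).ConfigureAwait(false);
                if (frame is null)
                {
                    throw new EndOfStreamException("Remote closed the connection.");
                }

                dispatchUpdate(frame);
            }
            catch (OperationCanceledException) when (shutdownToken.IsCancellationRequested)
            {
                return; // explicit shutdown, no recovery
            }
            catch (Exception)
            {
                // EOF, transport errors, and malformed frames all take the same recovery path.
                await recoverAsync(shutdownToken).ConfigureAwait(false);
            }
        }
    }
}
```

Because recovery runs inside the same loop, at most one receive flow and one reconnect flow are active at a time in this sketch.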
## State Model
Expand session/client lifecycle to distinguish pending vs ready vs reconnecting:
- `Disconnected`
- `Connecting`
- `ConnectSent`
- `Ready`
- `Reconnecting`
- `Faulted`
- `Disposed`
Definitions:
- `Connecting`: transport connect + handshake in progress
- `ConnectSent`: startup connect has been sent but the runtime is not yet considered ready
- `Ready`: background receive loop active and subscriptions can be served normally
- `Reconnecting`: recovery loop active after a connection failure
`IsConnected` should reflect `Ready` only.
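A minimal sketch of the lifecycle and the two rules that depend on it follows. `SessionStateSketch` and `SessionStateRules` are hypothetical names; the design does not show the real enum.

```csharp
// Hypothetical mirror of the lifecycle states listed above.
public enum SessionStateSketch
{
    Disconnected,
    Connecting,
    ConnectSent,
    Ready,
    Reconnecting,
    Faulted,
    Disposed
}

public static class SessionStateRules
{
    // Only Ready counts as connected; ConnectSent and Reconnecting do not.
    public static bool IsConnected(SessionStateSketch state) =>
        state == SessionStateSketch.Ready;

    // Writes follow the same rule and are rejected in every other state.
    public static bool CanWrite(SessionStateSketch state) =>
        state == SessionStateSketch.Ready;
}
```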
## Recovery Policy
Failure triggers:
- transport read returns `0`
- transport exception while sending or receiving
- malformed or unexpected frame during active runtime
- reconnect replay failure
Recovery behavior:
- stop the current receive loop
- mark the ephemeral session as disconnected/faulted
- start reconnect attempts until success or explicit shutdown
Retry schedule:
- first retry immediately
- then bounded retry delays such as:
  - 1 second
  - 2 seconds
  - 5 seconds
  - 10 seconds
- cap the delay instead of growing without bound
Writes during `Reconnecting` are rejected with a clear exception.
## Subscription Replay
The client should maintain a durable subscription registry keyed by `itemName`.
Each entry stores:
- `itemName`
- callback
- requested tag id
During reconnect:
1. reconnect transport
2. send handshake
3. send connect
4. replay every subscribed item via `ADVISE`
5. rebuild live session mappings from fresh ACKs
6. transition to `Ready`
Subscription replay is serialized and must not run concurrently with normal writes or new replay attempts.
## Callback Rules
Callbacks must never run under client locks or gates.
Rules:
- decode frames under internal synchronization
- dispatch callbacks only after releasing gates
- callback exceptions remain contained and do not crash the receive loop
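These rules could be realized along the lines of the sketch below, which looks up the callback under a lock and invokes it only after the lock is released. `CallbackDispatchSketch`, the `uint` tag id, and the `string` payload are all hypothetical simplifications.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical dispatcher illustrating the callback rules.
public sealed class CallbackDispatchSketch
{
    private readonly object _sync = new();
    private readonly Dictionary<uint, Action<string>> _callbacksByTagId = new();

    public void Register(uint tagId, Action<string> callback)
    {
        lock (_sync)
        {
            _callbacksByTagId[tagId] = callback;
        }
    }

    // Returns false when the callback threw; the caller's loop keeps running either way.
    public bool Dispatch(uint tagId, string decodedValue)
    {
        Action<string>? callback;
        lock (_sync)
        {
            // Only the lookup happens under the lock.
            _callbacksByTagId.TryGetValue(tagId, out callback);
        }

        if (callback is null)
        {
            return true; // nothing registered for this tag
        }

        try
        {
            callback(decodedValue); // user code runs with no lock held
            return true;
        }
        catch (Exception)
        {
            return false; // contained: a faulty callback must not crash the receive loop
        }
    }
}
```

Running user code outside the lock also means a callback that calls back into the client cannot deadlock on that same lock.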
## Public API Effects
Expected public behavior:
- `ConnectAsync`
  - establishes initial runtime and starts background receive
- `SubscribeAsync`
  - records durable intent
  - advises immediately when ready
  - keeps durable subscription for replay after reconnect
- `ReadAsync`
  - can remain implemented as a temporary subscription
  - should still use the background runtime instead of manual caller polling
- `WriteAsync`
  - allowed only in `Ready`
  - fails during `Reconnecting`
- `DisconnectAsync`
  - stops receive and reconnect tasks
  - tears down transport
`ProcessIncomingAsync` should stop being the primary runtime API. It can be retained only as an internal/test helper if still useful.
## Internal Changes
### `SuiteLinkClient`
Add:
- receive loop task
- reconnect supervisor task or integrated recovery loop
- cancellation tokens for runtime shutdown
- durable subscription registry
- reconnect backoff helper
Responsibilities:
- own runtime lifecycle
- coordinate reconnect attempts
- replay subscriptions safely
- ensure only one receive loop and one reconnect flow are active
### `SuiteLinkSession`
Continue to manage:
- live connection/session state
- current `itemName <-> tagId` mappings
- live dispatch helpers
Do not make it responsible for durable reconnect intent.
### `SubscriptionHandle`
Should continue to remove durable subscription intent and trigger `UNADVISE` when possible.
If called during reconnect/disconnect, removal of durable intent still succeeds even if wire unadvise cannot be sent.
## Testing Strategy
### Runtime Loop Tests
Add tests proving:
- updates received by the background loop reach callbacks
- no manual `ProcessIncomingAsync` call is needed in normal operation
### Recovery Tests
Add tests proving:
- EOF triggers reconnect
- reconnect replays handshake/connect/subscriptions
- callback dispatch resumes after reconnect
- writes during reconnect fail predictably
### Lifecycle Tests
Add tests proving:
- `DisconnectAsync` stops background tasks
- `DisposeAsync` stops reconnect attempts
- repeated failures do not start multiple reconnect loops
## Recommended Next Step
Create an implementation plan that breaks this into small tasks:
- durable subscription registry
- background receive loop
- reconnect loop and backoff
- replay logic
- runtime tests


@@ -0,0 +1,519 @@
# SuiteLink Runtime Reconnect Implementation Plan
> **For Claude:** REQUIRED SUB-SKILL: Use superpowers:executing-plans to implement this plan task-by-task.
**Goal:** Add a background receive loop with automatic reconnect and subscription replay so the client continues dispatching updates after transport/session failures.
**Architecture:** The implementation extends `SuiteLinkClient` with a supervised runtime loop and reconnect flow while keeping durable subscription intent separate from ephemeral session mappings. Recovery rebuilds transport/session state, replays subscriptions, and resumes update dispatch without caller polling.
**Tech Stack:** .NET 10, C#, xUnit, `SemaphoreSlim`, `CancellationTokenSource`, existing SuiteLink codec/session/transport layers
---
### Task 1: Add Durable Subscription Registry
**Files:**
- Create: `/Users/dohertj2/Desktop/suitelinkclient/src/SuiteLink.Client/Internal/SubscriptionRegistrationEntry.cs`
- Modify: `/Users/dohertj2/Desktop/suitelinkclient/src/SuiteLink.Client/SuiteLinkClient.cs`
- Test: `/Users/dohertj2/Desktop/suitelinkclient/tests/SuiteLink.Client.Tests/SuiteLinkClientSubscriptionRegistryTests.cs`
**Step 1: Write the failing test**
```csharp
[Fact]
public async Task SubscribeAsync_StoresDurableSubscriptionIntent()
{
    var client = TestClientFactory.CreateReadyClient();
    await client.SubscribeAsync("Pump001.Run", _ => { });
    Assert.True(client.HasSubscription("Pump001.Run"));
}
```
**Step 2: Run test to verify it fails**
Run: `dotnet test /Users/dohertj2/Desktop/suitelinkclient/SuiteLink.Client.slnx --filter SuiteLinkClientSubscriptionRegistryTests -v minimal`
Expected: FAIL with missing durable registry behavior
**Step 3: Write minimal implementation**
Add a durable registry entry model storing:
- `ItemName`
- callback
- requested tag id
Store these entries in `SuiteLinkClient` separately from `SuiteLinkSession`.
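A registry along these lines might look like the sketch below. The names are hypothetical, and the callback payload type, the `uint` tag id, and the ordinal key comparison are assumptions; the plan does not specify them.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;

// Hypothetical entry shape: item name, callback, and requested tag id.
public sealed record SubscriptionEntrySketch(string ItemName, Action<string> Callback, uint RequestedTagId);

// Hypothetical durable registry, kept separate from any live session mappings.
public sealed class SubscriptionRegistrySketch
{
    private readonly ConcurrentDictionary<string, SubscriptionEntrySketch> _entries =
        new(StringComparer.Ordinal);

    public bool TryAdd(SubscriptionEntrySketch entry) => _entries.TryAdd(entry.ItemName, entry);

    // Removal does not depend on the wire; it succeeds even when no unadvise can be sent.
    public bool Remove(string itemName) => _entries.TryRemove(itemName, out _);

    public bool HasSubscription(string itemName) => _entries.ContainsKey(itemName);

    // Replay iterates a snapshot, so callers may subscribe or unsubscribe while it runs.
    public IReadOnlyList<SubscriptionEntrySketch> Snapshot() => _entries.Values.ToArray();
}
```

Keeping the registry free of tag-id-to-item mappings is what lets it survive a reconnect unchanged while the session rebuilds its mappings from fresh ACKs.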
**Step 4: Run test to verify it passes**
Run: `dotnet test /Users/dohertj2/Desktop/suitelinkclient/SuiteLink.Client.slnx --filter SuiteLinkClientSubscriptionRegistryTests -v minimal`
Expected: PASS
**Step 5: Commit**
```bash
git add /Users/dohertj2/Desktop/suitelinkclient/src/SuiteLink.Client/Internal/SubscriptionRegistrationEntry.cs /Users/dohertj2/Desktop/suitelinkclient/src/SuiteLink.Client/SuiteLinkClient.cs /Users/dohertj2/Desktop/suitelinkclient/tests/SuiteLink.Client.Tests/SuiteLinkClientSubscriptionRegistryTests.cs
git commit -m "feat: add durable subscription registry"
```
### Task 2: Make Subscription Handles Remove Durable Intent
**Files:**
- Modify: `/Users/dohertj2/Desktop/suitelinkclient/src/SuiteLink.Client/SuiteLinkClient.cs`
- Modify: `/Users/dohertj2/Desktop/suitelinkclient/src/SuiteLink.Client/SubscriptionHandle.cs`
- Test: `/Users/dohertj2/Desktop/suitelinkclient/tests/SuiteLink.Client.Tests/SuiteLinkClientSubscriptionRegistryTests.cs`
**Step 1: Write the failing test**
```csharp
[Fact]
public async Task DisposingSubscription_RemovesDurableSubscriptionIntent()
{
    var client = TestClientFactory.CreateReadyClient();
    var handle = await client.SubscribeAsync("Pump001.Run", _ => { });
    await handle.DisposeAsync();
    Assert.False(client.HasSubscription("Pump001.Run"));
}
```
**Step 2: Run test to verify it fails**
Run: `dotnet test /Users/dohertj2/Desktop/suitelinkclient/SuiteLink.Client.slnx --filter DisposingSubscription_RemovesDurableSubscriptionIntent -v minimal`
Expected: FAIL because handle disposal does not yet remove the durable registry entry
**Step 3: Write minimal implementation**
Ensure handle disposal removes durable registry entries even when wire unadvise cannot be sent.
**Step 4: Run test to verify it passes**
Run: `dotnet test /Users/dohertj2/Desktop/suitelinkclient/SuiteLink.Client.slnx --filter SuiteLinkClientSubscriptionRegistryTests -v minimal`
Expected: PASS
**Step 5: Commit**
```bash
git add /Users/dohertj2/Desktop/suitelinkclient/src/SuiteLink.Client/SuiteLinkClient.cs /Users/dohertj2/Desktop/suitelinkclient/src/SuiteLink.Client/SubscriptionHandle.cs /Users/dohertj2/Desktop/suitelinkclient/tests/SuiteLink.Client.Tests/SuiteLinkClientSubscriptionRegistryTests.cs
git commit -m "feat: persist subscription intent across reconnects"
```
### Task 3: Add Runtime State For Background Loop
**Files:**
- Modify: `/Users/dohertj2/Desktop/suitelinkclient/src/SuiteLink.Client/Internal/SuiteLinkSessionState.cs`
- Modify: `/Users/dohertj2/Desktop/suitelinkclient/src/SuiteLink.Client/SuiteLinkClient.cs`
- Test: `/Users/dohertj2/Desktop/suitelinkclient/tests/SuiteLink.Client.Tests/SuiteLinkClientConnectionTests.cs`
**Step 1: Write the failing test**
```csharp
[Fact]
public async Task ConnectAsync_TransitionsToReadyOnlyAfterRuntimeStarts()
{
    var client = TestClientFactory.CreateReadyHandshakeClient();
    await client.ConnectAsync(TestOptions.Create());
    Assert.True(client.IsConnected);
}
```
**Step 2: Run test to verify it fails**
Run: `dotnet test /Users/dohertj2/Desktop/suitelinkclient/SuiteLink.Client.slnx --filter ConnectAsync_TransitionsToReadyOnlyAfterRuntimeStarts -v minimal`
Expected: FAIL with missing ready/runtime state
**Step 3: Write minimal implementation**
Add:
- `Ready`
- `Reconnecting`
and transition `ConnectAsync` into `Ready` when the runtime loop has been established.
**Step 4: Run test to verify it passes**
Run: `dotnet test /Users/dohertj2/Desktop/suitelinkclient/SuiteLink.Client.slnx --filter SuiteLinkClientConnectionTests -v minimal`
Expected: PASS
**Step 5: Commit**
```bash
git add /Users/dohertj2/Desktop/suitelinkclient/src/SuiteLink.Client/Internal/SuiteLinkSessionState.cs /Users/dohertj2/Desktop/suitelinkclient/src/SuiteLink.Client/SuiteLinkClient.cs /Users/dohertj2/Desktop/suitelinkclient/tests/SuiteLink.Client.Tests/SuiteLinkClientConnectionTests.cs
git commit -m "feat: add ready and reconnecting runtime states"
```
### Task 4: Start Background Receive Loop
**Files:**
- Modify: `/Users/dohertj2/Desktop/suitelinkclient/src/SuiteLink.Client/SuiteLinkClient.cs`
- Test: `/Users/dohertj2/Desktop/suitelinkclient/tests/SuiteLink.Client.Tests/SuiteLinkClientRuntimeLoopTests.cs`
**Step 1: Write the failing test**
```csharp
[Fact]
public async Task ConnectAsync_StartsBackgroundLoop_AndDispatchesUpdateWithoutManualPolling()
{
    var updateReceived = new TaskCompletionSource<SuiteLinkTagUpdate>();
    var client = TestClientFactory.CreateClientWithQueuedUpdate(updateReceived);
    await client.ConnectAsync(TestOptions.Create());
    await client.SubscribeAsync("Pump001.Run", update => updateReceived.TrySetResult(update));
    var update = await updateReceived.Task.WaitAsync(TimeSpan.FromSeconds(1));
    Assert.True(update.Value.TryGetBoolean(out var value) && value);
}
```
**Step 2: Run test to verify it fails**
Run: `dotnet test /Users/dohertj2/Desktop/suitelinkclient/SuiteLink.Client.slnx --filter SuiteLinkClientRuntimeLoopTests -v minimal`
Expected: FAIL because manual processing is still required
**Step 3: Write minimal implementation**
Start a long-lived receive loop task after initial connect, and dispatch updates through existing session logic.
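A minimal sketch of such a loop, under stated assumptions: `ReceiveLoopSketch` and its two delegates are hypothetical placeholders for the real transport read and the existing session dispatch, and an empty frame is assumed to mean EOF.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical host for the long-lived receive task. The two delegates stand in
// for the real transport read and the existing session dispatch logic.
public sealed class ReceiveLoopSketch
{
    private readonly Func<CancellationToken, Task<ReadOnlyMemory<byte>>> _receiveFrame;
    private readonly Action<ReadOnlyMemory<byte>> _dispatch;
    private readonly CancellationTokenSource _cts = new CancellationTokenSource();
    private Task _loop = Task.CompletedTask;

    public ReceiveLoopSketch(
        Func<CancellationToken, Task<ReadOnlyMemory<byte>>> receiveFrame,
        Action<ReadOnlyMemory<byte>> dispatch)
    {
        _receiveFrame = receiveFrame;
        _dispatch = dispatch;
    }

    // Intended to be called once, at the end of a successful connect.
    public void Start() => _loop = Task.Run(() => RunAsync(_cts.Token));

    private async Task RunAsync(CancellationToken ct)
    {
        while (!ct.IsCancellationRequested)
        {
            var frame = await _receiveFrame(ct).ConfigureAwait(false);
            if (frame.IsEmpty)
            {
                return; // Assumption: an empty frame signals EOF; recovery is out of scope here.
            }

            _dispatch(frame); // Callbacks run on the loop task, so no manual polling is needed.
        }
    }

    public async Task StopAsync()
    {
        _cts.Cancel();
        try
        {
            await _loop.ConfigureAwait(false);
        }
        catch (OperationCanceledException)
        {
            // Expected when the pending receive observes cancellation.
        }
    }
}
```

Because callbacks run on the loop task in this sketch, a slow subscriber callback would delay later frames; whether the real client should offload dispatch is a separate design decision.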
**Step 4: Run test to verify it passes**
Run: `dotnet test /Users/dohertj2/Desktop/suitelinkclient/SuiteLink.Client.slnx --filter SuiteLinkClientRuntimeLoopTests -v minimal`
Expected: PASS
**Step 5: Commit**
```bash
git add /Users/dohertj2/Desktop/suitelinkclient/src/SuiteLink.Client/SuiteLinkClient.cs /Users/dohertj2/Desktop/suitelinkclient/tests/SuiteLink.Client.Tests/SuiteLinkClientRuntimeLoopTests.cs
git commit -m "feat: add suitelink background receive loop"
```
### Task 5: Make ProcessIncomingAsync Internal Or Non-Primary
**Files:**
- Modify: `/Users/dohertj2/Desktop/suitelinkclient/src/SuiteLink.Client/SuiteLinkClient.cs`
- Modify: `/Users/dohertj2/Desktop/suitelinkclient/README.md`
- Test: `/Users/dohertj2/Desktop/suitelinkclient/tests/SuiteLink.Client.Tests/SuiteLinkClientRuntimeLoopTests.cs`
**Step 1: Write the failing documentation/runtime check**
Define the intended runtime contract:
- normal operation uses background receive
- manual polling is not required for normal subscriptions
**Step 2: Run targeted tests**
Run: `dotnet test /Users/dohertj2/Desktop/suitelinkclient/SuiteLink.Client.slnx --filter SuiteLinkClientRuntimeLoopTests -v minimal`
Expected: PASS after Task 4
**Step 3: Write minimal implementation**
Keep `ProcessIncomingAsync` only as an internal/test helper or document it as non-primary API.
**Step 4: Run test and docs verification**
Run: `rg -n "background receive|manual polling|ProcessIncomingAsync" /Users/dohertj2/Desktop/suitelinkclient/README.md`
Expected: PASS with updated wording
**Step 5: Commit**
```bash
git add /Users/dohertj2/Desktop/suitelinkclient/src/SuiteLink.Client/SuiteLinkClient.cs /Users/dohertj2/Desktop/suitelinkclient/README.md
git commit -m "docs: describe background runtime model"
```
### Task 6: Detect EOF And Trigger Reconnect
**Files:**
- Modify: `/Users/dohertj2/Desktop/suitelinkclient/src/SuiteLink.Client/SuiteLinkClient.cs`
- Test: `/Users/dohertj2/Desktop/suitelinkclient/tests/SuiteLink.Client.Tests/SuiteLinkClientReconnectTests.cs`
**Step 1: Write the failing test**
```csharp
[Fact]
public async Task ReceiveLoop_Eof_TransitionsToReconnecting()
{
    var client = TestClientFactory.CreateClientThatEofsAfterConnect();
    await client.ConnectAsync(TestOptions.Create());
    await Eventually.AssertAsync(() => Assert.Equal(SuiteLinkSessionState.Reconnecting, client.DebugState));
}
```
**Step 2: Run test to verify it fails**
Run: `dotnet test /Users/dohertj2/Desktop/suitelinkclient/SuiteLink.Client.slnx --filter SuiteLinkClientReconnectTests -v minimal`
Expected: FAIL
**Step 3: Write minimal implementation**
Treat a zero-byte result from `ReceiveAsync` (remote EOF) as a disconnect: transition to `Reconnecting` and start recovery.
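The EOF check can be sketched against a plain `Stream`, whose `ReadAsync` returns 0 once the remote side has closed. `EofDetectionSketch`, `LoopExit`, and `PumpAsync` are hypothetical names, and mapping `IOException` to a fault is an assumption that goes beyond what this task specifies.

```csharp
using System;
using System.IO;
using System.Threading;
using System.Threading.Tasks;

public static class EofDetectionSketch
{
    // Hypothetical exit reasons; the caller would enter Reconnecting on
    // RemoteClosed (and, by assumption, on TransportFault).
    public enum LoopExit { Cancelled, RemoteClosed, TransportFault }

    public static async Task<LoopExit> PumpAsync(
        Stream transport,
        Action<ReadOnlyMemory<byte>> onBytes,
        CancellationToken ct)
    {
        var buffer = new byte[4096];
        try
        {
            while (true)
            {
                int read = await transport.ReadAsync(buffer.AsMemory(), ct).ConfigureAwait(false);
                if (read == 0)
                {
                    return LoopExit.RemoteClosed; // Zero bytes means the peer closed the stream.
                }

                onBytes(buffer.AsMemory(0, read));
            }
        }
        catch (OperationCanceledException) when (ct.IsCancellationRequested)
        {
            return LoopExit.Cancelled; // Local shutdown, not a disconnect.
        }
        catch (IOException)
        {
            return LoopExit.TransportFault; // Assumption: socket errors also trigger recovery.
        }
    }
}
```

Returning a reason instead of throwing lets the caller distinguish a deliberate shutdown from a remote close before it decides whether to start recovery.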
**Step 4: Run test to verify it passes**
Run: `dotnet test /Users/dohertj2/Desktop/suitelinkclient/SuiteLink.Client.slnx --filter SuiteLinkClientReconnectTests -v minimal`
Expected: PASS
**Step 5: Commit**
```bash
git add /Users/dohertj2/Desktop/suitelinkclient/src/SuiteLink.Client/SuiteLinkClient.cs /Users/dohertj2/Desktop/suitelinkclient/tests/SuiteLink.Client.Tests/SuiteLinkClientReconnectTests.cs
git commit -m "feat: detect disconnects and enter reconnect state"
```
### Task 7: Add Bounded Reconnect Backoff
**Files:**
- Modify: `/Users/dohertj2/Desktop/suitelinkclient/src/SuiteLink.Client/SuiteLinkClient.cs`
- Test: `/Users/dohertj2/Desktop/suitelinkclient/tests/SuiteLink.Client.Tests/SuiteLinkClientReconnectTests.cs`
**Step 1: Write the failing test**
```csharp
[Fact]
public async Task Reconnect_UsesBoundedRetrySchedule()
{
    var delays = new List<TimeSpan>();
    var client = TestClientFactory.CreateReconnectTestClient(delays);
    await client.ConnectAsync(TestOptions.Create());
    Assert.Contains(TimeSpan.FromSeconds(1), delays);
}
```
**Step 2: Run test to verify it fails**
Run: `dotnet test /Users/dohertj2/Desktop/suitelinkclient/SuiteLink.Client.slnx --filter Reconnect_UsesBoundedRetrySchedule -v minimal`
Expected: FAIL
**Step 3: Write minimal implementation**
Add a small capped delay schedule:
- 0s
- 1s
- 2s
- 5s
- 10s (cap, reused for every later attempt)
Inject the delay behavior so tests can record the schedule without real waits.
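The schedule above can be sketched as a lookup table with a clamped index plus an injectable delay function. `ReconnectBackoffSketch`, `GetDelay`, and `ReconnectAsync` are hypothetical names, not existing client APIs.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class ReconnectBackoffSketch
{
    // The capped schedule from this task: 0s, 1s, 2s, 5s, then 10s for all later attempts.
    private static readonly TimeSpan[] Schedule =
    {
        TimeSpan.Zero,
        TimeSpan.FromSeconds(1),
        TimeSpan.FromSeconds(2),
        TimeSpan.FromSeconds(5),
        TimeSpan.FromSeconds(10),
    };

    // attempt is zero-based; indexes past the table are clamped to the last entry.
    public static TimeSpan GetDelay(int attempt)
    {
        if (attempt < 0)
        {
            throw new ArgumentOutOfRangeException(nameof(attempt));
        }

        return Schedule[Math.Min(attempt, Schedule.Length - 1)];
    }

    // Retries until tryConnect succeeds or ct is cancelled; returns the number of attempts made.
    // The delay delegate is injected so tests can record delays instead of sleeping.
    public static async Task<int> ReconnectAsync(
        Func<CancellationToken, Task<bool>> tryConnect,
        Func<TimeSpan, CancellationToken, Task> delay,
        CancellationToken ct)
    {
        for (var attempt = 0; ; attempt++)
        {
            ct.ThrowIfCancellationRequested();
            await delay(GetDelay(attempt), ct).ConfigureAwait(false);
            if (await tryConnect(ct).ConfigureAwait(false))
            {
                return attempt + 1;
            }
        }
    }
}
```

In production the delay delegate could simply be `(d, ct) => Task.Delay(d, ct)`, while a test passes a delegate that appends each delay to a list and returns `Task.CompletedTask`.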
**Step 4: Run test to verify it passes**
Run: `dotnet test /Users/dohertj2/Desktop/suitelinkclient/SuiteLink.Client.slnx --filter SuiteLinkClientReconnectTests -v minimal`
Expected: PASS
**Step 5: Commit**
```bash
git add /Users/dohertj2/Desktop/suitelinkclient/src/SuiteLink.Client/SuiteLinkClient.cs /Users/dohertj2/Desktop/suitelinkclient/tests/SuiteLink.Client.Tests/SuiteLinkClientReconnectTests.cs
git commit -m "feat: add bounded reconnect backoff"
```
### Task 8: Replay Subscriptions After Reconnect
**Files:**
- Modify: `/Users/dohertj2/Desktop/suitelinkclient/src/SuiteLink.Client/SuiteLinkClient.cs`
- Modify: `/Users/dohertj2/Desktop/suitelinkclient/src/SuiteLink.Client/Internal/SuiteLinkSession.cs`
- Test: `/Users/dohertj2/Desktop/suitelinkclient/tests/SuiteLink.Client.Tests/SuiteLinkClientReconnectTests.cs`
**Step 1: Write the failing test**
```csharp
[Fact]
public async Task Reconnect_ReplaysSubscriptions_AndRestoresDispatch()
{
    var callbackCount = 0;
    var client = TestClientFactory.CreateReconnectReplayClient(() => callbackCount++);
    await client.ConnectAsync(TestOptions.Create());
    await client.SubscribeAsync("Pump001.Run", _ => callbackCount++);
    await client.WaitForReconnectReadyAsync();
    Assert.True(callbackCount > 0);
}
```
**Step 2: Run test to verify it fails**
Run: `dotnet test /Users/dohertj2/Desktop/suitelinkclient/SuiteLink.Client.slnx --filter Reconnect_ReplaysSubscriptions_AndRestoresDispatch -v minimal`
Expected: FAIL
**Step 3: Write minimal implementation**
On successful reconnect:
- reset live session mappings
- replay all durable subscriptions one-by-one
- rebuild tag mappings from fresh ACKs
- return to `Ready`
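The replay steps above can be sketched as two maps: durable intent keyed by item name, and a per-session tag-id map that is discarded and rebuilt on every reconnect. `SubscriptionReplaySketch`, the `adviseAsync` hook, and the string payload are hypothetical simplifications, and the final transition back to `Ready` is left to the caller.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical registry; not thread-safe, so the real client would need to
// serialize access between the reconnect path and the receive loop.
public sealed class SubscriptionReplaySketch
{
    // Durable intent: survives reconnects. Key is the item name.
    private readonly Dictionary<string, Action<string>> _durable = new Dictionary<string, Action<string>>();

    // Live mapping: valid only for the current session.
    private readonly Dictionary<uint, string> _tagIdToItem = new Dictionary<uint, string>();

    public int LiveCount => _tagIdToItem.Count;

    public void Register(string item, Action<string> onUpdate) => _durable[item] = onUpdate;

    // adviseAsync is a hypothetical hook that sends one subscribe request and
    // returns the tag id carried by its ACK.
    public async Task ReplayAsync(Func<string, CancellationToken, Task<uint>> adviseAsync, CancellationToken ct)
    {
        _tagIdToItem.Clear(); // Old tag ids are meaningless on the new session.
        foreach (var item in new List<string>(_durable.Keys))
        {
            var tagId = await adviseAsync(item, ct).ConfigureAwait(false); // One-by-one replay.
            _tagIdToItem[tagId] = item;
        }
    }

    // Returns false for tag ids the current session does not know.
    public bool Dispatch(uint tagId, string value)
    {
        if (!_tagIdToItem.TryGetValue(tagId, out var item) || !_durable.TryGetValue(item, out var callback))
        {
            return false;
        }

        callback(value);
        return true;
    }
}
```

Keeping callbacks in the durable map and only ids in the live map means a reconnect never has to touch user callbacks, only the id lookup.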
**Step 4: Run test to verify it passes**
Run: `dotnet test /Users/dohertj2/Desktop/suitelinkclient/SuiteLink.Client.slnx --filter SuiteLinkClientReconnectTests -v minimal`
Expected: PASS
**Step 5: Commit**
```bash
git add /Users/dohertj2/Desktop/suitelinkclient/src/SuiteLink.Client/SuiteLinkClient.cs /Users/dohertj2/Desktop/suitelinkclient/src/SuiteLink.Client/Internal/SuiteLinkSession.cs /Users/dohertj2/Desktop/suitelinkclient/tests/SuiteLink.Client.Tests/SuiteLinkClientReconnectTests.cs
git commit -m "feat: replay subscriptions after reconnect"
```
### Task 9: Reject Writes During Reconnect
**Files:**
- Modify: `/Users/dohertj2/Desktop/suitelinkclient/src/SuiteLink.Client/SuiteLinkClient.cs`
- Test: `/Users/dohertj2/Desktop/suitelinkclient/tests/SuiteLink.Client.Tests/SuiteLinkClientWriteTests.cs`
**Step 1: Write the failing test**
```csharp
[Fact]
public async Task WriteAsync_DuringReconnect_ThrowsClearException()
{
    var client = TestClientFactory.CreateReconnectingClient();
    await Assert.ThrowsAsync<InvalidOperationException>(() =>
        client.WriteAsync("Pump001.Run", SuiteLinkValue.FromBoolean(true)));
}
```
**Step 2: Run test to verify it fails**
Run: `dotnet test /Users/dohertj2/Desktop/suitelinkclient/SuiteLink.Client.slnx --filter WriteAsync_DuringReconnect_ThrowsClearException -v minimal`
Expected: FAIL
**Step 3: Write minimal implementation**
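Guard `WriteAsync` so it proceeds only in `Ready`; in any other state, including `Reconnecting`, throw `InvalidOperationException` with a clear message.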
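A minimal sketch of such a guard follows. `WriteGuardState` and `WriteGuardSketch` are hypothetical stand-ins for the real session state and client, and the message wording is illustrative only.

```csharp
using System;

// Hypothetical subset of the session states relevant to writes.
public enum WriteGuardState
{
    Disconnected,
    Ready,
    Reconnecting,
}

public static class WriteGuardSketch
{
    // Called at the top of a write path; returns normally only in Ready.
    public static void EnsureWritable(WriteGuardState state)
    {
        if (state == WriteGuardState.Ready)
        {
            return;
        }

        // A dedicated message for the reconnect case tells callers the failure is transient.
        throw new InvalidOperationException(
            state == WriteGuardState.Reconnecting
                ? "Cannot write while the client is reconnecting; retry after the session is ready."
                : $"Cannot write while the client is in state '{state}'.");
    }
}
```

Checking the state before any bytes are encoded keeps a rejected write from partially touching the transport.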
**Step 4: Run test to verify it passes**
Run: `dotnet test /Users/dohertj2/Desktop/suitelinkclient/SuiteLink.Client.slnx --filter SuiteLinkClientWriteTests -v minimal`
Expected: PASS
**Step 5: Commit**
```bash
git add /Users/dohertj2/Desktop/suitelinkclient/src/SuiteLink.Client/SuiteLinkClient.cs /Users/dohertj2/Desktop/suitelinkclient/tests/SuiteLink.Client.Tests/SuiteLinkClientWriteTests.cs
git commit -m "feat: reject writes while reconnecting"
```
### Task 10: Stop Runtime Cleanly On Disconnect
**Files:**
- Modify: `/Users/dohertj2/Desktop/suitelinkclient/src/SuiteLink.Client/SuiteLinkClient.cs`
- Test: `/Users/dohertj2/Desktop/suitelinkclient/tests/SuiteLink.Client.Tests/SuiteLinkClientConnectionTests.cs`
**Step 1: Write the failing test**
```csharp
[Fact]
public async Task DisconnectAsync_StopsReceiveAndReconnectLoops()
{
    var client = TestClientFactory.CreateRunningClient();
    await client.ConnectAsync(TestOptions.Create());
    await client.DisconnectAsync();
    Assert.False(client.IsConnected);
}
```
**Step 2: Run test to verify it fails**
Run: `dotnet test /Users/dohertj2/Desktop/suitelinkclient/SuiteLink.Client.slnx --filter DisconnectAsync_StopsReceiveAndReconnectLoops -v minimal`
Expected: FAIL
**Step 3: Write minimal implementation**
Cancel runtime loop tokens and stop reconnect attempts on disconnect/dispose.
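One way to sketch the shutdown path is a single `CancellationTokenSource` shared by the receive and reconnect work, with an idempotent stop that waits for the loop to finish. `RuntimeShutdownSketch` and the `runLoops` delegate are hypothetical; the real client may track its loops separately.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public sealed class RuntimeShutdownSketch : IAsyncDisposable
{
    private readonly CancellationTokenSource _cts = new CancellationTokenSource();
    private readonly Task _loop;
    private readonly Lazy<Task> _shutdown;

    // runLoops is a hypothetical delegate covering both receive and reconnect work.
    public RuntimeShutdownSketch(Func<CancellationToken, Task> runLoops)
    {
        var token = _cts.Token; // Captured up front so the loop never touches a disposed source.
        _loop = Task.Run(() => runLoops(token));
        _shutdown = new Lazy<Task>(StopCoreAsync);
    }

    public bool IsRunning => !_loop.IsCompleted;

    // Idempotent: every caller observes the same shutdown task.
    public Task DisconnectAsync() => _shutdown.Value;

    private async Task StopCoreAsync()
    {
        _cts.Cancel(); // Stops the receive loop and any pending reconnect delay.
        try
        {
            await _loop.ConfigureAwait(false);
        }
        catch (OperationCanceledException)
        {
            // Expected when the loop observes cancellation.
        }
        finally
        {
            _cts.Dispose();
        }
    }

    public ValueTask DisposeAsync() => new ValueTask(DisconnectAsync());
}
```

Awaiting the loop before returning means that, once `DisconnectAsync` completes, no reconnect attempt can still be in flight.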
**Step 4: Run test to verify it passes**
Run: `dotnet test /Users/dohertj2/Desktop/suitelinkclient/SuiteLink.Client.slnx --filter SuiteLinkClientConnectionTests -v minimal`
Expected: PASS
**Step 5: Commit**
```bash
git add /Users/dohertj2/Desktop/suitelinkclient/src/SuiteLink.Client/SuiteLinkClient.cs /Users/dohertj2/Desktop/suitelinkclient/tests/SuiteLink.Client.Tests/SuiteLinkClientConnectionTests.cs
git commit -m "feat: stop runtime loops on disconnect"
```
### Task 11: Update README And Integration Docs
**Files:**
- Modify: `/Users/dohertj2/Desktop/suitelinkclient/README.md`
- Modify: `/Users/dohertj2/Desktop/suitelinkclient/tests/SuiteLink.Client.IntegrationTests/README.md`
**Step 1: Write the failing documentation check**
Define required README terms:
- background receive loop
- automatic reconnect
- subscription replay
- writes rejected during reconnect
**Step 2: Run documentation review**
Run: `rg -n "background receive|automatic reconnect|subscription replay|reconnecting" /Users/dohertj2/Desktop/suitelinkclient/README.md /Users/dohertj2/Desktop/suitelinkclient/tests/SuiteLink.Client.IntegrationTests/README.md`
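Expected: FAIL until docs are updated. The pattern is an alternation, so `rg` succeeds as soon as any one term matches; treat the check as failing until every required term appears in the output.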
**Step 3: Write minimal implementation**
Update docs to describe the runtime model and recovery behavior honestly.
**Step 4: Run documentation review**
Run: `rg -n "background receive|automatic reconnect|subscription replay|reconnecting" /Users/dohertj2/Desktop/suitelinkclient/README.md /Users/dohertj2/Desktop/suitelinkclient/tests/SuiteLink.Client.IntegrationTests/README.md`
Expected: PASS, with every required term present in the output
**Step 5: Commit**
```bash
git add /Users/dohertj2/Desktop/suitelinkclient/README.md /Users/dohertj2/Desktop/suitelinkclient/tests/SuiteLink.Client.IntegrationTests/README.md
git commit -m "docs: describe runtime reconnect behavior"
```
### Task 12: Full Verification Pass
**Files:**
- Modify: `/Users/dohertj2/Desktop/suitelinkclient/docs/plans/2026-03-17-runtime-reconnect-design.md`
- Modify: `/Users/dohertj2/Desktop/suitelinkclient/docs/plans/2026-03-17-runtime-reconnect-implementation-plan.md`
**Step 1: Run full test suite**
Run: `dotnet test /Users/dohertj2/Desktop/suitelinkclient/SuiteLink.Client.slnx -v minimal`
Expected: PASS with integration harness still conditional by default
**Step 2: Run release build**
Run: `dotnet build /Users/dohertj2/Desktop/suitelinkclient/SuiteLink.Client.slnx -c Release`
Expected: PASS
**Step 3: Run reconnect-focused tests**
Run: `dotnet test /Users/dohertj2/Desktop/suitelinkclient/SuiteLink.Client.slnx --filter Reconnect -v minimal`
Expected: PASS
**Step 4: Update plan notes if implementation deviated**
Add short notes to the design/plan docs if final runtime behavior differs from original assumptions.
**Step 5: Commit**
```bash
git add /Users/dohertj2/Desktop/suitelinkclient/docs/plans/2026-03-17-runtime-reconnect-design.md /Users/dohertj2/Desktop/suitelinkclient/docs/plans/2026-03-17-runtime-reconnect-implementation-plan.md
git commit -m "docs: finalize reconnect implementation verification"
```