feat: add resilient reconnect and catch-up replay
583
docs/plans/2026-03-17-catchup-retry-implementation-plan.md
Normal file
@@ -0,0 +1,583 @@
# Catch-Up Replay And Advanced Retry Implementation Plan

> **For Codex:** REQUIRED SUB-SKILL: Use `executeplan` to implement this plan task-by-task.

**Goal:** Add best-effort latest-value catch-up after reconnect and replace the fixed reconnect delay schedule with a production-grade retry policy, while also fixing the current reconnect quality issues.

**Architecture:** Extend the existing reconnect runtime with a small runtime-options layer, a retry-policy calculator, and a post-reconnect catch-up refresh phase. Keep reconnect success defined as restored live subscriptions, and treat catch-up as a best-effort follow-on phase that emits synthetic updates marked separately from live traffic.

**Tech Stack:** .NET 10, C#, xUnit, existing SuiteLink protocol/client/runtime/transport layers

---

### Task 1: Add Runtime Option Types

**Files:**
- Create: `/Users/dohertj2/Desktop/suitelinkclient/src/SuiteLink.Client/SuiteLinkRuntimeOptions.cs`
- Create: `/Users/dohertj2/Desktop/suitelinkclient/src/SuiteLink.Client/SuiteLinkRetryPolicy.cs`
- Create: `/Users/dohertj2/Desktop/suitelinkclient/src/SuiteLink.Client/SuiteLinkCatchUpPolicy.cs`
- Modify: `/Users/dohertj2/Desktop/suitelinkclient/src/SuiteLink.Client/SuiteLinkConnectionOptions.cs`
- Test: `/Users/dohertj2/Desktop/suitelinkclient/tests/SuiteLink.Client.Tests/SuiteLinkConnectionOptionsTests.cs`

**Step 1: Write the failing test**

```csharp
[Fact]
public void ConnectionOptions_DefaultsRuntimeOptions()
{
    var options = new SuiteLinkConnectionOptions(
        host: "127.0.0.1",
        application: "App",
        topic: "Topic",
        clientName: "Client",
        clientNode: "Node",
        userName: "User",
        serverNode: "Server");

    Assert.NotNull(options.Runtime);
    Assert.Equal(SuiteLinkCatchUpPolicy.None, options.Runtime.CatchUpPolicy);
    Assert.NotNull(options.Runtime.RetryPolicy);
}
```

**Step 2: Run test to verify it fails**

Run: `dotnet test /Users/dohertj2/Desktop/suitelinkclient/SuiteLink.Client.slnx --filter ConnectionOptions_DefaultsRuntimeOptions -v minimal`
Expected: FAIL because runtime options do not exist yet

**Step 3: Write minimal implementation**

Create:

```csharp
public enum SuiteLinkCatchUpPolicy
{
    None = 0,
    RefreshLatestValue = 1
}
```

```csharp
public sealed record class SuiteLinkRetryPolicy(
    TimeSpan InitialDelay,
    double Multiplier,
    TimeSpan MaxDelay,
    int? MaxAttempts = null,
    bool UseJitter = true)
{
    public static SuiteLinkRetryPolicy Default { get; } =
        new(TimeSpan.FromSeconds(1), 2.0, TimeSpan.FromSeconds(30));
}
```

```csharp
public sealed record class SuiteLinkRuntimeOptions(
    SuiteLinkRetryPolicy RetryPolicy,
    SuiteLinkCatchUpPolicy CatchUpPolicy,
    TimeSpan CatchUpTimeout)
{
    public static SuiteLinkRuntimeOptions Default { get; } =
        new(SuiteLinkRetryPolicy.Default, SuiteLinkCatchUpPolicy.None, TimeSpan.FromSeconds(2));
}
```
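
Update `SuiteLinkConnectionOptions` to expose:

```csharp
public SuiteLinkRuntimeOptions Runtime { get; init; }
```

and default it to `SuiteLinkRuntimeOptions.Default`. The `init` accessor matters because the Task 3 test overrides `Runtime` through a `with` expression, which also assumes `SuiteLinkConnectionOptions` is a record type.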
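
As a usage sketch, a caller could override individual runtime settings with `with` expressions, since the option types from Step 3 are records. The enum and both records are repeated here only so the block compiles on its own. The `RuntimeOptionsExample` class and the chosen values are illustrative assumptions, not part of the plan.

```csharp
using System;

public enum SuiteLinkCatchUpPolicy { None = 0, RefreshLatestValue = 1 }

public sealed record class SuiteLinkRetryPolicy(
    TimeSpan InitialDelay,
    double Multiplier,
    TimeSpan MaxDelay,
    int? MaxAttempts = null,
    bool UseJitter = true)
{
    public static SuiteLinkRetryPolicy Default { get; } =
        new(TimeSpan.FromSeconds(1), 2.0, TimeSpan.FromSeconds(30));
}

public sealed record class SuiteLinkRuntimeOptions(
    SuiteLinkRetryPolicy RetryPolicy,
    SuiteLinkCatchUpPolicy CatchUpPolicy,
    TimeSpan CatchUpTimeout)
{
    public static SuiteLinkRuntimeOptions Default { get; } =
        new(SuiteLinkRetryPolicy.Default, SuiteLinkCatchUpPolicy.None, TimeSpan.FromSeconds(2));
}

// Hypothetical helper, for illustration only.
public static class RuntimeOptionsExample
{
    // Copies the defaults and changes only the catch-up policy and the attempt limit.
    public static SuiteLinkRuntimeOptions WithCatchUp() =>
        SuiteLinkRuntimeOptions.Default with
        {
            CatchUpPolicy = SuiteLinkCatchUpPolicy.RefreshLatestValue,
            RetryPolicy = SuiteLinkRetryPolicy.Default with { MaxAttempts = 10 }
        };
}
```

Because `with` produces a copy, the shared `Default` instances stay unchanged.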

**Step 4: Run test to verify it passes**

Run: `dotnet test /Users/dohertj2/Desktop/suitelinkclient/SuiteLink.Client.slnx --filter SuiteLinkConnectionOptionsTests -v minimal`
Expected: PASS

**Step 5: Commit**

```bash
git add /Users/dohertj2/Desktop/suitelinkclient/src/SuiteLink.Client/SuiteLinkRuntimeOptions.cs /Users/dohertj2/Desktop/suitelinkclient/src/SuiteLink.Client/SuiteLinkRetryPolicy.cs /Users/dohertj2/Desktop/suitelinkclient/src/SuiteLink.Client/SuiteLinkCatchUpPolicy.cs /Users/dohertj2/Desktop/suitelinkclient/src/SuiteLink.Client/SuiteLinkConnectionOptions.cs /Users/dohertj2/Desktop/suitelinkclient/tests/SuiteLink.Client.Tests/SuiteLinkConnectionOptionsTests.cs
git commit -m "feat: add runtime reconnect option types"
```

### Task 2: Add Retry Policy Delay Calculator

**Files:**
- Create: `/Users/dohertj2/Desktop/suitelinkclient/src/SuiteLink.Client/Internal/SuiteLinkRetryDelayCalculator.cs`
- Test: `/Users/dohertj2/Desktop/suitelinkclient/tests/SuiteLink.Client.Tests/Internal/SuiteLinkRetryDelayCalculatorTests.cs`

**Step 1: Write the failing test**

```csharp
[Fact]
public void GetDelay_UsesImmediateThenExponentialCap()
{
    var policy = new SuiteLinkRetryPolicy(
        InitialDelay: TimeSpan.FromSeconds(1),
        Multiplier: 2.0,
        MaxDelay: TimeSpan.FromSeconds(30),
        UseJitter: false);

    Assert.Equal(TimeSpan.Zero, SuiteLinkRetryDelayCalculator.GetDelay(policy, 0));
    Assert.Equal(TimeSpan.FromSeconds(1), SuiteLinkRetryDelayCalculator.GetDelay(policy, 1));
    Assert.Equal(TimeSpan.FromSeconds(2), SuiteLinkRetryDelayCalculator.GetDelay(policy, 2));
    Assert.Equal(TimeSpan.FromSeconds(4), SuiteLinkRetryDelayCalculator.GetDelay(policy, 3));
    // 1s * 2^9 = 512s, so the cap named in the test title applies here.
    Assert.Equal(TimeSpan.FromSeconds(30), SuiteLinkRetryDelayCalculator.GetDelay(policy, 10));
}
```

**Step 2: Run test to verify it fails**

Run: `dotnet test /Users/dohertj2/Desktop/suitelinkclient/SuiteLink.Client.slnx --filter SuiteLinkRetryDelayCalculatorTests -v minimal`
Expected: FAIL because calculator does not exist yet

**Step 3: Write minimal implementation**

Create:

```csharp
internal static class SuiteLinkRetryDelayCalculator
{
    public static TimeSpan GetDelay(SuiteLinkRetryPolicy policy, int attempt)
    {
        // The first attempt (and any non-positive index) retries immediately.
        if (attempt <= 0)
        {
            return TimeSpan.Zero;
        }

        // Exponential growth from InitialDelay, capped at MaxDelay.
        var rawSeconds = policy.InitialDelay.TotalSeconds * Math.Pow(policy.Multiplier, attempt - 1);
        var bounded = TimeSpan.FromSeconds(Math.Min(rawSeconds, policy.MaxDelay.TotalSeconds));
        return bounded;
    }
}
```

Do not add jitter yet beyond the policy flag unless tests require it.

**Step 4: Run test to verify it passes**

Run: `dotnet test /Users/dohertj2/Desktop/suitelinkclient/SuiteLink.Client.slnx --filter SuiteLinkRetryDelayCalculatorTests -v minimal`
Expected: PASS

**Step 5: Commit**

```bash
git add /Users/dohertj2/Desktop/suitelinkclient/src/SuiteLink.Client/Internal/SuiteLinkRetryDelayCalculator.cs /Users/dohertj2/Desktop/suitelinkclient/tests/SuiteLink.Client.Tests/Internal/SuiteLinkRetryDelayCalculatorTests.cs
git commit -m "feat: add reconnect retry delay calculator"
```

### Task 3: Wire Retry Policy Into Reconnect Runtime

**Files:**
- Modify: `/Users/dohertj2/Desktop/suitelinkclient/src/SuiteLink.Client/SuiteLinkClient.cs`
- Modify: `/Users/dohertj2/Desktop/suitelinkclient/tests/SuiteLink.Client.Tests/SuiteLinkClientReconnectTests.cs`

**Step 1: Write the failing test**

```csharp
[Fact]
public async Task Reconnect_UsesConfiguredRetryPolicy()
{
    var observed = new List<TimeSpan>();
    var options = CreateOptions() with
    {
        Runtime = new SuiteLinkRuntimeOptions(
            new SuiteLinkRetryPolicy(TimeSpan.FromSeconds(3), 3.0, TimeSpan.FromSeconds(20), UseJitter: false),
            SuiteLinkCatchUpPolicy.None,
            TimeSpan.FromSeconds(2))
    };

    var client = CreateReconnectClient(delayAsync: (delay, _) =>
    {
        observed.Add(delay);
        return Task.CompletedTask;
    });

    await client.ConnectAsync(options);
    await EventuallyReconnectAsync(client);

    Assert.Contains(TimeSpan.FromSeconds(3), observed);
}
```

**Step 2: Run test to verify it fails**

Run: `dotnet test /Users/dohertj2/Desktop/suitelinkclient/SuiteLink.Client.slnx --filter Reconnect_UsesConfiguredRetryPolicy -v minimal`
Expected: FAIL because reconnect still uses a fixed schedule

**Step 3: Write minimal implementation**

In `SuiteLinkClient`:

- remove direct use of `ReconnectDelaySchedule`
- read retry policy from `_connectionOptions!.Runtime.RetryPolicy`
- use `SuiteLinkRetryDelayCalculator.GetDelay(policy, attempt)`

Keep the current injected `_delayAsync` test seam.
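
The wiring in Step 3 might look roughly like the loop below. This is a minimal sketch under stated assumptions: `ReconnectLoopSketch`, `RunAsync`, and `tryReconnectAsync` are hypothetical names, `getDelay` stands in for `SuiteLinkRetryDelayCalculator.GetDelay(policy, attempt)`, and `delayAsync` mirrors the injected `_delayAsync` seam. How the real client applies `MaxAttempts` is not specified in this plan.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for the reconnect loop inside SuiteLinkClient.
public static class ReconnectLoopSketch
{
    // Returns true once a reconnect attempt succeeds, false when maxAttempts is exhausted.
    public static async Task<bool> RunAsync(
        Func<int, TimeSpan> getDelay,
        Func<TimeSpan, CancellationToken, Task> delayAsync,
        Func<CancellationToken, Task<bool>> tryReconnectAsync,
        int? maxAttempts,
        CancellationToken cancellationToken)
    {
        for (var attempt = 0; maxAttempts is null || attempt < maxAttempts; attempt++)
        {
            cancellationToken.ThrowIfCancellationRequested();

            // Attempt 0 yields TimeSpan.Zero, so the first retry is immediate.
            var delay = getDelay(attempt);
            if (delay > TimeSpan.Zero)
            {
                await delayAsync(delay, cancellationToken).ConfigureAwait(false);
            }

            if (await tryReconnectAsync(cancellationToken).ConfigureAwait(false))
            {
                return true;
            }
        }

        return false;
    }
}
```

Routing every wait through the injected delegate is what lets a test record delays instead of sleeping.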

**Step 4: Run test to verify it passes**

Run: `dotnet test /Users/dohertj2/Desktop/suitelinkclient/SuiteLink.Client.slnx --filter SuiteLinkClientReconnectTests -v minimal`
Expected: PASS

**Step 5: Commit**

```bash
git add /Users/dohertj2/Desktop/suitelinkclient/src/SuiteLink.Client/SuiteLinkClient.cs /Users/dohertj2/Desktop/suitelinkclient/tests/SuiteLink.Client.Tests/SuiteLinkClientReconnectTests.cs
git commit -m "feat: apply retry policy to reconnect runtime"
```

### Task 4: Fix Fast-Fail Writes During Reconnect

**Files:**
- Modify: `/Users/dohertj2/Desktop/suitelinkclient/src/SuiteLink.Client/SuiteLinkClient.cs`
- Modify: `/Users/dohertj2/Desktop/suitelinkclient/tests/SuiteLink.Client.Tests/SuiteLinkClientWriteTests.cs`

**Step 1: Write the failing test**

```csharp
[Fact]
public async Task WriteAsync_DuringReconnect_ThrowsBeforeWaitingOnOperationGate()
{
    var client = CreateClientWithBlockedOperationGateAndReconnectState();

    var ex = await Assert.ThrowsAsync<InvalidOperationException>(
        () => client.WriteAsync("Pump001.Run", SuiteLinkValue.FromBoolean(true)));

    Assert.Contains("reconnecting", ex.Message, StringComparison.OrdinalIgnoreCase);
}
```

**Step 2: Run test to verify it fails**

Run: `dotnet test /Users/dohertj2/Desktop/suitelinkclient/SuiteLink.Client.slnx --filter WriteAsync_DuringReconnect_ThrowsBeforeWaitingOnOperationGate -v minimal`
Expected: FAIL because `WriteAsync` currently waits on `_operationGate` first

**Step 3: Write minimal implementation**

Move the reconnect state check ahead of:

```csharp
await _operationGate.WaitAsync(...)
```

while keeping disposed-state checks intact.
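
A reduced sketch of that ordering follows. Only `_operationGate` comes from the plan; `GatedWriterSketch`, its fields, and the `send` delegate are hypothetical stand-ins for the real write path.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical miniature of the write path, for illustration only.
public sealed class GatedWriterSketch
{
    private readonly SemaphoreSlim _operationGate = new(1, 1);
    private volatile bool _reconnecting;
    private volatile bool _disposed;

    public void SetReconnecting(bool value) => _reconnecting = value;

    public void Dispose() => _disposed = true;

    // Test hook: hold the gate the way a long-running reconnect would.
    public Task HoldGateAsync() => _operationGate.WaitAsync();

    public async Task WriteAsync(Func<Task> send, CancellationToken cancellationToken = default)
    {
        // State checks run before the gate, so a blocked gate cannot delay the failure.
        ObjectDisposedException.ThrowIf(_disposed, this);
        ThrowIfReconnecting();

        await _operationGate.WaitAsync(cancellationToken).ConfigureAwait(false);
        try
        {
            // Re-check under the gate: the state may have changed while waiting.
            ThrowIfReconnecting();
            await send().ConfigureAwait(false);
        }
        finally
        {
            _operationGate.Release();
        }
    }

    private void ThrowIfReconnecting()
    {
        if (_reconnecting)
        {
            throw new InvalidOperationException("Client is reconnecting; writes are rejected until the session is ready.");
        }
    }
}
```

The second check under the gate is a design choice of this sketch, not something the plan requires.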

**Step 4: Run test to verify it passes**

Run: `dotnet test /Users/dohertj2/Desktop/suitelinkclient/SuiteLink.Client.slnx --filter SuiteLinkClientWriteTests -v minimal`
Expected: PASS

**Step 5: Commit**

```bash
git add /Users/dohertj2/Desktop/suitelinkclient/src/SuiteLink.Client/SuiteLinkClient.cs /Users/dohertj2/Desktop/suitelinkclient/tests/SuiteLink.Client.Tests/SuiteLinkClientWriteTests.cs
git commit -m "fix: fail writes before reconnect gate contention"
```

### Task 5: Fix Transport Reset Ownership Semantics

**Files:**
- Modify: `/Users/dohertj2/Desktop/suitelinkclient/src/SuiteLink.Client/Transport/SuiteLinkTcpTransport.cs`
- Modify: `/Users/dohertj2/Desktop/suitelinkclient/src/SuiteLink.Client/Transport/ISuiteLinkReconnectableTransport.cs`
- Modify: `/Users/dohertj2/Desktop/suitelinkclient/tests/SuiteLink.Client.Tests/Transport/SuiteLinkTcpTransportTests.cs`

**Step 1: Write the failing test**

```csharp
[Fact]
public async Task ResetConnectionAsync_LeaveOpenTrue_DoesNotDisposeInjectedStream()
{
    var stream = new TrackingStream();
    await using var transport = new SuiteLinkTcpTransport(stream, leaveOpen: true);

    await transport.ResetConnectionAsync();

    Assert.False(stream.WasDisposed);
}
```

**Step 2: Run test to verify it fails**

Run: `dotnet test /Users/dohertj2/Desktop/suitelinkclient/SuiteLink.Client.slnx --filter ResetConnectionAsync_LeaveOpenTrue_DoesNotDisposeInjectedStream -v minimal`
Expected: FAIL because reset currently disposes caller-owned resources

**Step 3: Write minimal implementation**

Update `ResetConnectionAsync` to respect the same ownership rule as `DisposeAsync`:

- if `leaveOpen` is `true`, detach without disposing injected resources
- if `leaveOpen` is `false`, dispose detached resources

Do not broaden interface scope unnecessarily.
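
One way to keep the two paths aligned is a single release helper, sketched below. `StreamOwnerSketch` is a hypothetical reduction and not the real `SuiteLinkTcpTransport`, which presumably also owns sockets and other state not shown in this plan.

```csharp
using System;
using System.IO;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical reduction of the transport's stream ownership, for illustration only.
public sealed class StreamOwnerSketch : IAsyncDisposable
{
    private readonly bool _leaveOpen;
    private Stream? _stream;

    public StreamOwnerSketch(Stream stream, bool leaveOpen)
    {
        _stream = stream;
        _leaveOpen = leaveOpen;
    }

    public ValueTask ResetConnectionAsync() => ReleaseAsync();

    public ValueTask DisposeAsync() => ReleaseAsync();

    // Single release path, so reset and dispose cannot drift apart on ownership.
    private async ValueTask ReleaseAsync()
    {
        // Detach first; a second call then sees null and does nothing.
        var detached = Interlocked.Exchange(ref _stream, null);
        if (detached is not null && !_leaveOpen)
        {
            await detached.DisposeAsync().ConfigureAwait(false);
        }
    }
}
```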

**Step 4: Run test to verify it passes**

Run: `dotnet test /Users/dohertj2/Desktop/suitelinkclient/SuiteLink.Client.slnx --filter SuiteLinkTcpTransportTests -v minimal`
Expected: PASS

**Step 5: Commit**

```bash
git add /Users/dohertj2/Desktop/suitelinkclient/src/SuiteLink.Client/Transport/SuiteLinkTcpTransport.cs /Users/dohertj2/Desktop/suitelinkclient/src/SuiteLink.Client/Transport/ISuiteLinkReconnectableTransport.cs /Users/dohertj2/Desktop/suitelinkclient/tests/SuiteLink.Client.Tests/Transport/SuiteLinkTcpTransportTests.cs
git commit -m "fix: preserve transport ownership during reconnect reset"
```

### Task 6: Add Update Source Metadata

**Files:**
- Create: `/Users/dohertj2/Desktop/suitelinkclient/src/SuiteLink.Client/SuiteLinkUpdateSource.cs`
- Modify: `/Users/dohertj2/Desktop/suitelinkclient/src/SuiteLink.Client/SuiteLinkTagUpdate.cs`
- Test: `/Users/dohertj2/Desktop/suitelinkclient/tests/SuiteLink.Client.Tests/SuiteLinkValueTests.cs`

**Step 1: Write the failing test**

```csharp
[Fact]
public void TagUpdate_DefaultSource_IsLive()
{
    var update = new SuiteLinkTagUpdate(
        "Pump001.Run",
        1,
        SuiteLinkValue.FromBoolean(true),
        0x00C0,
        1,
        DateTimeOffset.UtcNow);

    Assert.Equal(SuiteLinkUpdateSource.Live, update.Source);
}
```

**Step 2: Run test to verify it fails**

Run: `dotnet test /Users/dohertj2/Desktop/suitelinkclient/SuiteLink.Client.slnx --filter TagUpdate_DefaultSource_IsLive -v minimal`
Expected: FAIL because source metadata does not exist

**Step 3: Write minimal implementation**

Create:

```csharp
public enum SuiteLinkUpdateSource
{
    Live = 0,
    CatchUpReplay = 1
}
```

Add `Source` to `SuiteLinkTagUpdate` with default:

```csharp
SuiteLinkUpdateSource.Live
```
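
If `SuiteLinkTagUpdate` is a positional record, a trailing optional parameter is one way to add `Source` without touching existing six-argument call sites. The sketch below uses a simplified, hypothetical `TagUpdateSketch`: the real type carries a `SuiteLinkValue`, and its exact parameter names and types are not shown in this plan.

```csharp
using System;

public enum SuiteLinkUpdateSource { Live = 0, CatchUpReplay = 1 }

// Hypothetical, simplified shape of the update record, for illustration only.
public sealed record class TagUpdateSketch(
    string ItemName,
    uint TagId,
    object? Value,
    ushort Quality,
    ushort ElapsedMilliseconds,
    DateTimeOffset ReceivedAtUtc,
    // Trailing default keeps existing constructor calls compiling unchanged.
    SuiteLinkUpdateSource Source = SuiteLinkUpdateSource.Live);
```

A catch-up path could then re-stamp an update with `update with { Source = SuiteLinkUpdateSource.CatchUpReplay }`.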

**Step 4: Run test to verify it passes**

Run: `dotnet test /Users/dohertj2/Desktop/suitelinkclient/SuiteLink.Client.slnx --filter SuiteLinkValueTests -v minimal`
Expected: PASS

**Step 5: Commit**

```bash
git add /Users/dohertj2/Desktop/suitelinkclient/src/SuiteLink.Client/SuiteLinkUpdateSource.cs /Users/dohertj2/Desktop/suitelinkclient/src/SuiteLink.Client/SuiteLinkTagUpdate.cs /Users/dohertj2/Desktop/suitelinkclient/tests/SuiteLink.Client.Tests/SuiteLinkValueTests.cs
git commit -m "feat: add update source metadata"
```

### Task 7: Add Best-Effort Catch-Up Refresh Execution

**Files:**
- Modify: `/Users/dohertj2/Desktop/suitelinkclient/src/SuiteLink.Client/SuiteLinkClient.cs`
- Modify: `/Users/dohertj2/Desktop/suitelinkclient/src/SuiteLink.Client/Internal/SubscriptionRegistrationEntry.cs`
- Modify: `/Users/dohertj2/Desktop/suitelinkclient/tests/SuiteLink.Client.Tests/SuiteLinkClientReconnectTests.cs`

**Step 1: Write the failing test**

```csharp
[Fact]
public async Task Reconnect_WithRefreshLatestValue_CanDispatchCatchUpReplay()
{
    SuiteLinkTagUpdate? catchUp = null;
    var client = CreateReconnectReplayClient(
        catchUpPolicy: SuiteLinkCatchUpPolicy.RefreshLatestValue,
        onUpdate: update =>
        {
            if (update.Source == SuiteLinkUpdateSource.CatchUpReplay)
            {
                catchUp = update;
            }
        });

    await client.ConnectAsync(CreateOptionsWithCatchUp());
    // Wait for the background reconnect and catch-up pass before asserting.
    await EventuallyReconnectAsync(client);

    Assert.NotNull(catchUp);
    Assert.Equal(SuiteLinkUpdateSource.CatchUpReplay, catchUp.Source);
}
```

**Step 2: Run test to verify it fails**

Run: `dotnet test /Users/dohertj2/Desktop/suitelinkclient/SuiteLink.Client.slnx --filter Reconnect_WithRefreshLatestValue_CanDispatchCatchUpReplay -v minimal`
Expected: FAIL because reconnect only resumes live dispatch today

**Step 3: Write minimal implementation**

After successful reconnect and durable subscription replay, if `Runtime.CatchUpPolicy == SuiteLinkCatchUpPolicy.RefreshLatestValue`:

- run a sequential refresh pass over durable subscriptions
- obtain one fresh value per item using existing temporary-read machinery or a dedicated internal refresh path
- dispatch synthetic updates with:

```csharp
Source: SuiteLinkUpdateSource.CatchUpReplay
```

Do not fail reconnect if one item refresh fails or times out.
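
The refresh pass could be shaped like the sketch below. All names here are hypothetical: `readLatestAsync` stands in for the temporary-read or internal refresh path, and `dispatchCatchUp` is where real code would stamp `Source: SuiteLinkUpdateSource.CatchUpReplay` before invoking the subscriber callback.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical sketch of the sequential catch-up pass, for illustration only.
public static class CatchUpPassSketch
{
    // Returns how many items were refreshed and dispatched.
    public static async Task<int> RunAsync<TValue>(
        IReadOnlyList<string> durableItemNames,
        Func<string, CancellationToken, Task<TValue>> readLatestAsync,
        Action<string, TValue> dispatchCatchUp,
        CancellationToken cancellationToken)
    {
        var refreshed = 0;
        foreach (var itemName in durableItemNames)
        {
            try
            {
                // One item at a time, so the pass never floods the restored session.
                var value = await readLatestAsync(itemName, cancellationToken).ConfigureAwait(false);
                dispatchCatchUp(itemName, value);
                refreshed++;
            }
            catch (Exception) when (!cancellationToken.IsCancellationRequested)
            {
                // Best effort: skip this item and keep the recovered session intact.
            }
        }

        return refreshed;
    }
}
```

Failures propagate only when the caller's token is cancelled, which is assumed here to mean shutdown.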

**Step 4: Run test to verify it passes**

Run: `dotnet test /Users/dohertj2/Desktop/suitelinkclient/SuiteLink.Client.slnx --filter SuiteLinkClientReconnectTests -v minimal`
Expected: PASS

**Step 5: Commit**

```bash
git add /Users/dohertj2/Desktop/suitelinkclient/src/SuiteLink.Client/SuiteLinkClient.cs /Users/dohertj2/Desktop/suitelinkclient/src/SuiteLink.Client/Internal/SubscriptionRegistrationEntry.cs /Users/dohertj2/Desktop/suitelinkclient/tests/SuiteLink.Client.Tests/SuiteLinkClientReconnectTests.cs
git commit -m "feat: add reconnect catch-up refresh replay"
```

### Task 8: Make Catch-Up Partial Failure Non-Fatal

**Files:**
- Modify: `/Users/dohertj2/Desktop/suitelinkclient/src/SuiteLink.Client/SuiteLinkClient.cs`
- Modify: `/Users/dohertj2/Desktop/suitelinkclient/tests/SuiteLink.Client.Tests/SuiteLinkClientReconnectTests.cs`

**Step 1: Write the failing test**

```csharp
[Fact]
public async Task Reconnect_CatchUpTimeout_DoesNotFailRecoveredSubscriptions()
{
    var client = CreateReconnectReplayClientWithTimedOutRefresh();

    await client.ConnectAsync(CreateOptionsWithCatchUp());

    await Eventually.AssertAsync(() => Assert.True(client.IsConnected));
}
```

**Step 2: Run test to verify it fails**

Run: `dotnet test /Users/dohertj2/Desktop/suitelinkclient/SuiteLink.Client.slnx --filter Reconnect_CatchUpTimeout_DoesNotFailRecoveredSubscriptions -v minimal`
Expected: FAIL if catch-up failure tears down reconnect

**Step 3: Write minimal implementation**

Wrap each refresh item independently:

- apply a per-item timeout from `Runtime.CatchUpTimeout`
- swallow per-item failures, optionally recording an internal debug signal first
- continue to the remaining items

Do not change the recovered `Ready`/`Subscribed` state.
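
A per-item wrapper along these lines would cover the timeout and the swallow in one place. This is a sketch under assumptions: `CatchUpTimeoutSketch` and `TryRefreshAsync` are hypothetical names, and the refresh delegate is assumed to honor the token it receives.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper; the plan fixes the behavior, not this shape.
public static class CatchUpTimeoutSketch
{
    // Returns true when the refresh finished in time, false when it failed or timed out.
    public static async Task<bool> TryRefreshAsync(
        Func<CancellationToken, Task> refreshItemAsync,
        TimeSpan catchUpTimeout,
        CancellationToken shutdownToken)
    {
        // Linked source: cancels on shutdown or when the per-item timeout elapses.
        using var timeoutCts = CancellationTokenSource.CreateLinkedTokenSource(shutdownToken);
        timeoutCts.CancelAfter(catchUpTimeout);
        try
        {
            await refreshItemAsync(timeoutCts.Token).ConfigureAwait(false);
            return true;
        }
        catch (Exception) when (!shutdownToken.IsCancellationRequested)
        {
            // Timeout or per-item failure: report it to the caller, but never rethrow.
            return false;
        }
    }
}
```

The boolean result gives the caller a place to record a debug signal without affecting the recovered state.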

**Step 4: Run test to verify it passes**

Run: `dotnet test /Users/dohertj2/Desktop/suitelinkclient/SuiteLink.Client.slnx --filter SuiteLinkClientReconnectTests -v minimal`
Expected: PASS

**Step 5: Commit**

```bash
git add /Users/dohertj2/Desktop/suitelinkclient/src/SuiteLink.Client/SuiteLinkClient.cs /Users/dohertj2/Desktop/suitelinkclient/tests/SuiteLink.Client.Tests/SuiteLinkClientReconnectTests.cs
git commit -m "feat: tolerate partial catch-up refresh failures"
```

### Task 9: Add Jitter Coverage Without Flaky Tests

**Files:**
- Modify: `/Users/dohertj2/Desktop/suitelinkclient/src/SuiteLink.Client/Internal/SuiteLinkRetryDelayCalculator.cs`
- Modify: `/Users/dohertj2/Desktop/suitelinkclient/tests/SuiteLink.Client.Tests/Internal/SuiteLinkRetryDelayCalculatorTests.cs`

**Step 1: Write the failing test**

```csharp
[Fact]
public void GetDelay_WithJitterEnabled_StaysWithinCap()
{
    var policy = new SuiteLinkRetryPolicy(
        InitialDelay: TimeSpan.FromSeconds(2),
        Multiplier: 2.0,
        MaxDelay: TimeSpan.FromSeconds(10),
        UseJitter: true);

    var delay = SuiteLinkRetryDelayCalculator.GetDelay(policy, 3, () => 0.5);

    Assert.InRange(delay, TimeSpan.Zero, TimeSpan.FromSeconds(10));
}
```

**Step 2: Run test to verify it fails**

Run: `dotnet test /Users/dohertj2/Desktop/suitelinkclient/SuiteLink.Client.slnx --filter SuiteLinkRetryDelayCalculatorTests -v minimal`
Expected: FAIL because jitter injection does not exist yet

**Step 3: Write minimal implementation**

Add an injected random source overload:

```csharp
public static TimeSpan GetDelay(SuiteLinkRetryPolicy policy, int attempt, Func<double>? nextDouble = null)
```

When jitter is enabled:

- compute the bounded base delay
- apply the injected random value, which tests supply deterministically
- keep the final value within `[0, MaxDelay]`
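
The plan does not name a jitter strategy, so the sketch below assumes "full jitter": a uniform draw between zero and the bounded base delay. `JitteredDelaySketch` is a hypothetical name, the record is a trimmed copy of the Task 1 policy so the block compiles on its own, and falling back to `Random.Shared` when no source is injected is also an assumption.

```csharp
using System;

// Trimmed copy of the policy record from Task 1, repeated for self-containment.
public sealed record class SuiteLinkRetryPolicy(
    TimeSpan InitialDelay,
    double Multiplier,
    TimeSpan MaxDelay,
    int? MaxAttempts = null,
    bool UseJitter = true);

public static class JitteredDelaySketch
{
    public static TimeSpan GetDelay(SuiteLinkRetryPolicy policy, int attempt, Func<double>? nextDouble = null)
    {
        if (attempt <= 0)
        {
            return TimeSpan.Zero;
        }

        // Bounded base delay, identical to the non-jittered calculation.
        var rawSeconds = policy.InitialDelay.TotalSeconds * Math.Pow(policy.Multiplier, attempt - 1);
        var baseSeconds = Math.Min(rawSeconds, policy.MaxDelay.TotalSeconds);
        if (!policy.UseJitter)
        {
            return TimeSpan.FromSeconds(baseSeconds);
        }

        // Assumed strategy: scale the base by a sample clamped to [0, 1].
        var sample = nextDouble is null ? Random.Shared.NextDouble() : nextDouble();
        return TimeSpan.FromSeconds(baseSeconds * Math.Clamp(sample, 0.0, 1.0));
    }
}
```

Scaling downward only means the result can never exceed `MaxDelay`, whatever the random source returns.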

**Step 4: Run test to verify it passes**

Run: `dotnet test /Users/dohertj2/Desktop/suitelinkclient/SuiteLink.Client.slnx --filter SuiteLinkRetryDelayCalculatorTests -v minimal`
Expected: PASS

**Step 5: Commit**

```bash
git add /Users/dohertj2/Desktop/suitelinkclient/src/SuiteLink.Client/Internal/SuiteLinkRetryDelayCalculator.cs /Users/dohertj2/Desktop/suitelinkclient/tests/SuiteLink.Client.Tests/Internal/SuiteLinkRetryDelayCalculatorTests.cs
git commit -m "feat: add deterministic jitter coverage for retry policy"
```

### Task 10: Update Documentation And Final Verification

**Files:**
- Modify: `/Users/dohertj2/Desktop/suitelinkclient/README.md`
- Modify: `/Users/dohertj2/Desktop/suitelinkclient/tests/SuiteLink.Client.IntegrationTests/README.md`
- Modify: `/Users/dohertj2/Desktop/suitelinkclient/docs/plans/2026-03-17-catchup-retry-design.md`
- Modify: `/Users/dohertj2/Desktop/suitelinkclient/docs/plans/2026-03-17-catchup-retry-implementation-plan.md`

**Step 1: Write the documentation diff**

Document:

- catch-up mode is latest-value refresh only
- retry policy is configurable and jittered by default
- reconnect success is separate from best-effort catch-up completion
- writes still fail during reconnect

**Step 2: Run targeted verification**

Run: `rg -n "catch-up|retry|reconnect|jitter|refresh latest|reconnecting" /Users/dohertj2/Desktop/suitelinkclient/README.md /Users/dohertj2/Desktop/suitelinkclient/tests/SuiteLink.Client.IntegrationTests/README.md`
Expected: matching lines that show the updated wording

**Step 3: Run full verification**

Run: `dotnet test /Users/dohertj2/Desktop/suitelinkclient/SuiteLink.Client.slnx -v minimal`
Expected: PASS

**Step 4: Commit**

```bash
git add /Users/dohertj2/Desktop/suitelinkclient/README.md /Users/dohertj2/Desktop/suitelinkclient/tests/SuiteLink.Client.IntegrationTests/README.md /Users/dohertj2/Desktop/suitelinkclient/docs/plans/2026-03-17-catchup-retry-design.md /Users/dohertj2/Desktop/suitelinkclient/docs/plans/2026-03-17-catchup-retry-implementation-plan.md
git commit -m "docs: describe catch-up replay and retry policy"
```
225
docs/plans/2026-03-17-runtime-reconnect-design.md
Normal file
@@ -0,0 +1,225 @@
# SuiteLink Runtime Reconnect Design

## Goal

Add a background receive loop and automatic reconnect/recovery to the existing SuiteLink client so subscriptions are restored automatically and update callbacks resume without caller intervention.

## Scope

This design adds:

- a background receive loop owned by `SuiteLinkClient`
- automatic reconnect with bounded retry backoff
- automatic subscription replay after reconnect
- resumed update dispatch after replay

This design does not add:

- write queuing during reconnect
- catch-up replay of missed values
- secure SuiteLink V3/TLS support
- AlarmMgr support

## Runtime Model

The current client uses explicit on-demand inbound processing. The new model shifts normal operation to a managed runtime loop.

There are two categories of state:

- durable desired state
  - configured connection options
  - caller subscription intent
  - callbacks associated with subscribed items
- ephemeral connection state
  - current transport connection
  - current session state
  - current `itemName <-> tagId` mappings

Durable state survives reconnects. Ephemeral state is rebuilt on reconnect.

## Recommended Approach

Implement a supervised background receive loop inside `SuiteLinkClient`.

Behavior:

1. `ConnectAsync` establishes the initial transport/session and starts the receive loop.
2. The receive loop reads frames continuously.
3. Update frames are decoded and dispatched to user callbacks.
4. EOF, transport exceptions, malformed frames, or replay failures trigger recovery.
5. Recovery reconnects with bounded retry delays.
6. After reconnect succeeds, the client replays all current subscriptions and resumes dispatching.

This keeps the public API simple and avoids forcing callers to manually poll `ProcessIncomingAsync`.

## State Model

Expand session/client lifecycle to distinguish pending vs ready vs reconnecting:

- `Disconnected`
- `Connecting`
- `ConnectSent`
- `Ready`
- `Reconnecting`
- `Faulted`
- `Disposed`

Definitions:

- `Connecting`: transport connect + handshake in progress
- `ConnectSent`: startup connect has been sent but the runtime is not yet considered ready
- `Ready`: background receive loop active and subscriptions can be served normally
- `Reconnecting`: recovery loop active after a connection failure

`IsConnected` should reflect `Ready` only.
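
A compact way to hold this lifecycle is an enum behind an atomic field, sketched below. The type names and the `TryTransition` helper are hypothetical. The design lists the states but does not prescribe how transitions are guarded.

```csharp
using System.Threading;

// State names follow the list above; the type names are hypothetical.
public enum RuntimeStateSketch
{
    Disconnected,
    Connecting,
    ConnectSent,
    Ready,
    Reconnecting,
    Faulted,
    Disposed
}

public sealed class RuntimeStateHolderSketch
{
    private int _state = (int)RuntimeStateSketch.Disconnected;

    public RuntimeStateSketch State => (RuntimeStateSketch)Volatile.Read(ref _state);

    // Only Ready counts as connected; Connecting and ConnectSent are still pending.
    public bool IsConnected => State == RuntimeStateSketch.Ready;

    // Returns false when another thread changed the state first.
    public bool TryTransition(RuntimeStateSketch from, RuntimeStateSketch to) =>
        Interlocked.CompareExchange(ref _state, (int)to, (int)from) == (int)from;
}
```

An atomic `Ready` to `Reconnecting` transition also gives a natural guard against starting two recovery flows for the same failure.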

## Recovery Policy

Failure triggers:

- transport read returns `0`
- transport exception while sending or receiving
- malformed or unexpected frame during active runtime
- reconnect replay failure

Recovery behavior:

- stop the current receive loop
- mark the ephemeral session as disconnected/faulted
- start reconnect attempts until success or explicit shutdown

Retry schedule:

- first retry immediately
- then bounded retry delays such as:
  - 1 second
  - 2 seconds
  - 5 seconds
  - 10 seconds
- cap the delay instead of growing without bound
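
The schedule above can be expressed as a lookup that clamps the attempt index, so late attempts keep reusing the last delay. `FixedRetryScheduleSketch` is a hypothetical name, and the values are the example delays from the list rather than a confirmed final schedule.

```csharp
using System;

// Hypothetical helper; delays are the example values from the design text.
public static class FixedRetryScheduleSketch
{
    private static readonly TimeSpan[] Delays =
    {
        TimeSpan.Zero,            // first retry is immediate
        TimeSpan.FromSeconds(1),
        TimeSpan.FromSeconds(2),
        TimeSpan.FromSeconds(5),
        TimeSpan.FromSeconds(10), // cap: reused for every later attempt
    };

    public static TimeSpan GetDelay(int attempt) =>
        Delays[Math.Clamp(attempt, 0, Delays.Length - 1)];
}
```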

Writes during `Reconnecting` are rejected with a clear exception.

## Subscription Replay

The client should maintain a durable subscription registry keyed by `itemName`.

Each entry stores:

- `itemName`
- callback
- requested tag id

During reconnect:

1. reconnect transport
2. send handshake
3. send connect
4. replay every subscribed item via `ADVISE`
5. rebuild live session mappings from fresh ACKs
6. transition to `Ready`

Subscription replay is serialized and must not run concurrently with normal writes or new replay attempts.

## Callback Rules

Callbacks must never run under client locks or gates.

Rules:

- decode frames under internal synchronization
- dispatch callbacks only after releasing gates
- callback exceptions remain contained and do not crash the receive loop
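
These rules amount to "look up under the lock, invoke outside it". The sketch below shows that pattern with a hypothetical `DispatchSketch<TUpdate>` type. The real client presumably resolves callbacks through its session mappings rather than a plain dictionary.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical dispatcher illustrating the callback rules, not the real client code.
public sealed class DispatchSketch<TUpdate>
{
    private readonly object _sync = new();
    private readonly Dictionary<uint, Action<TUpdate>> _callbacksByTagId = new();

    public void Register(uint tagId, Action<TUpdate> callback)
    {
        lock (_sync)
        {
            _callbacksByTagId[tagId] = callback;
        }
    }

    // Returns false when no callback is registered for the tag id.
    public bool Dispatch(uint tagId, TUpdate update)
    {
        Action<TUpdate>? callback;
        lock (_sync)
        {
            // Only the lookup happens under the lock.
            _callbacksByTagId.TryGetValue(tagId, out callback);
        }

        if (callback is null)
        {
            return false;
        }

        try
        {
            // User code runs after the lock is released, so it may call back into the client.
            callback(update);
        }
        catch (Exception)
        {
            // Contained: a faulty callback must not stop the receive loop.
        }

        return true;
    }
}
```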

## Public API Effects

Expected public behavior:

- `ConnectAsync`
  - establishes initial runtime and starts background receive
- `SubscribeAsync`
  - records durable intent
  - advises immediately when ready
  - keeps durable subscription for replay after reconnect
- `ReadAsync`
  - can remain implemented as a temporary subscription
  - should still use the background runtime instead of manual caller polling
- `WriteAsync`
  - allowed only in `Ready`
  - fails during `Reconnecting`
- `DisconnectAsync`
  - stops receive and reconnect tasks
  - tears down transport

`ProcessIncomingAsync` should stop being the primary runtime API. It can be retained only as an internal/test helper if still useful.

## Internal Changes

### `SuiteLinkClient`

Add:

- receive loop task
- reconnect supervisor task or integrated recovery loop
- cancellation tokens for runtime shutdown
- durable subscription registry
- reconnect backoff helper

Responsibilities:

- own runtime lifecycle
- coordinate reconnect attempts
- replay subscriptions safely
- ensure only one receive loop and one reconnect flow are active

### `SuiteLinkSession`

Continue to manage:

- live connection/session state
- current `itemName <-> tagId` mappings
- live dispatch helpers

Do not make it responsible for durable reconnect intent.

### `SubscriptionHandle`

Disposing the handle should continue to remove durable subscription intent and trigger `UNADVISE` when possible.

If disposal happens during reconnect or disconnect, removal of durable intent still succeeds even if the wire unadvise cannot be sent.
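
The ordering this implies is "remove intent first, then attempt the wire call". A minimal sketch follows, assuming a hypothetical `SubscriptionHandleSketch` built from two delegates. The real handle's constructor and members are not shown in this design.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical handle, for illustration only.
public sealed class SubscriptionHandleSketch : IAsyncDisposable
{
    private readonly Action _removeDurableIntent;
    private readonly Func<Task> _sendUnadviseAsync;
    private int _disposed;

    public SubscriptionHandleSketch(Action removeDurableIntent, Func<Task> sendUnadviseAsync)
    {
        _removeDurableIntent = removeDurableIntent;
        _sendUnadviseAsync = sendUnadviseAsync;
    }

    public async ValueTask DisposeAsync()
    {
        // Idempotent: only the first call does any work.
        if (Interlocked.Exchange(ref _disposed, 1) != 0)
        {
            return;
        }

        // Durable intent goes first, so a later reconnect will not replay this item.
        _removeDurableIntent();

        try
        {
            await _sendUnadviseAsync().ConfigureAwait(false);
        }
        catch (Exception)
        {
            // Wire unadvise is best effort while reconnecting or disconnected.
        }
    }
}
```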

## Testing Strategy

### Runtime Loop Tests

Add tests proving:

- updates received by the background loop reach callbacks
- no manual `ProcessIncomingAsync` call is needed in normal operation

### Recovery Tests

Add tests proving:

- EOF triggers reconnect
- reconnect replays handshake/connect/subscriptions
- callback dispatch resumes after reconnect
- writes during reconnect fail predictably

### Lifecycle Tests

Add tests proving:

- `DisconnectAsync` stops background tasks
- `DisposeAsync` stops reconnect attempts
- repeated failures do not start multiple reconnect loops

## Recommended Next Step

Create an implementation plan that breaks this into small tasks:

- durable subscription registry
- background receive loop
- reconnect loop and backoff
- replay logic
- runtime tests
519
docs/plans/2026-03-17-runtime-reconnect-implementation-plan.md
Normal file
@@ -0,0 +1,519 @@
|
||||
# SuiteLink Runtime Reconnect Implementation Plan
|
||||
|
||||
> **For Claude:** REQUIRED SUB-SKILL: Use superpowers:executing-plans to implement this plan task-by-task.
|
||||
|
||||
**Goal:** Add a background receive loop with automatic reconnect and subscription replay so the client continues dispatching updates after transport/session failures.
|
||||
|
||||
**Architecture:** The implementation extends `SuiteLinkClient` with a supervised runtime loop and reconnect flow while keeping durable subscription intent separate from ephemeral session mappings. Recovery rebuilds transport/session state, replays subscriptions, and resumes update dispatch without caller polling.
|
||||
|
||||
**Tech Stack:** .NET 10, C#, xUnit, `SemaphoreSlim`, `CancellationTokenSource`, existing SuiteLink codec/session/transport layers
|
||||
|
||||
---
|
||||
|
||||
### Task 1: Add Durable Subscription Registry
|
||||
|
||||
**Files:**
|
||||
- Create: `/Users/dohertj2/Desktop/suitelinkclient/src/SuiteLink.Client/Internal/SubscriptionRegistrationEntry.cs`
|
||||
- Modify: `/Users/dohertj2/Desktop/suitelinkclient/src/SuiteLink.Client/SuiteLinkClient.cs`
|
||||
- Test: `/Users/dohertj2/Desktop/suitelinkclient/tests/SuiteLink.Client.Tests/SuiteLinkClientSubscriptionRegistryTests.cs`
|
||||
|
||||
**Step 1: Write the failing test**
|
||||
|
||||
```csharp
|
||||
[Fact]
|
||||
public async Task SubscribeAsync_StoresDurableSubscriptionIntent()
|
||||
{
|
||||
var client = TestClientFactory.CreateReadyClient();
|
||||
|
||||
await client.SubscribeAsync("Pump001.Run", _ => { });
|
||||
|
||||
Assert.True(client.HasSubscription("Pump001.Run"));
|
||||
}
|
||||
```
|
||||
|
||||
**Step 2: Run test to verify it fails**
|
||||
|
||||
Run: `dotnet test /Users/dohertj2/Desktop/suitelinkclient/SuiteLink.Client.slnx --filter SuiteLinkClientSubscriptionRegistryTests -v minimal`
|
||||
Expected: FAIL with missing durable registry behavior
|
||||
|
||||
**Step 3: Write minimal implementation**
|
||||
|
||||
Add a durable registry entry model storing:
|
||||
|
||||
- `ItemName`
|
||||
- callback
|
||||
- requested tag id
|
||||
|
||||
Store these entries in `SuiteLinkClient` separately from `SuiteLinkSession`.
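
A minimal sketch of what such a registry might look like, under stated assumptions: the callback is typed as `Action<object>` as a stand-in for the real update callback, the tag id is assumed to be a `uint`, item names are compared ordinally, and `DurableSubscriptionRegistry` is a hypothetical name (the plan only fixes `SubscriptionRegistrationEntry`).

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;

// Assumed shape; the real entry may carry a typed callback and other fields.
public sealed record SubscriptionRegistrationEntry(
    string ItemName,
    Action<object> OnUpdate,
    uint RequestedTagId);

// Hypothetical container owned by the client, independent of any session.
public sealed class DurableSubscriptionRegistry
{
    private readonly ConcurrentDictionary<string, SubscriptionRegistrationEntry> _entries =
        new(StringComparer.Ordinal);

    // Adds or replaces the durable intent for an item.
    public void Add(SubscriptionRegistrationEntry entry) => _entries[entry.ItemName] = entry;

    // Returns true when an entry existed and was removed.
    public bool Remove(string itemName) => _entries.TryRemove(itemName, out _);

    public bool Contains(string itemName) => _entries.ContainsKey(itemName);

    // Copy of the current entries, safe to enumerate during replay.
    public IReadOnlyCollection<SubscriptionRegistrationEntry> Snapshot() =>
        new List<SubscriptionRegistrationEntry>(_entries.Values);
}
```

With something like this in place, `HasSubscription` in the test above could simply delegate to `Contains`.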
|
||||
|
||||
**Step 4: Run test to verify it passes**
|
||||
|
||||
Run: `dotnet test /Users/dohertj2/Desktop/suitelinkclient/SuiteLink.Client.slnx --filter SuiteLinkClientSubscriptionRegistryTests -v minimal`
|
||||
Expected: PASS
|
||||
|
||||
**Step 5: Commit**
|
||||
|
||||
```bash
|
||||
git add /Users/dohertj2/Desktop/suitelinkclient/src/SuiteLink.Client/Internal/SubscriptionRegistrationEntry.cs /Users/dohertj2/Desktop/suitelinkclient/src/SuiteLink.Client/SuiteLinkClient.cs /Users/dohertj2/Desktop/suitelinkclient/tests/SuiteLink.Client.Tests/SuiteLinkClientSubscriptionRegistryTests.cs
|
||||
git commit -m "feat: add durable subscription registry"
|
||||
```
|
||||
|
||||
### Task 2: Make Subscription Handles Remove Durable Intent
|
||||
|
||||
**Files:**
|
||||
- Modify: `/Users/dohertj2/Desktop/suitelinkclient/src/SuiteLink.Client/SuiteLinkClient.cs`
|
||||
- Modify: `/Users/dohertj2/Desktop/suitelinkclient/src/SuiteLink.Client/SubscriptionHandle.cs`
|
||||
- Test: `/Users/dohertj2/Desktop/suitelinkclient/tests/SuiteLink.Client.Tests/SuiteLinkClientSubscriptionRegistryTests.cs`
|
||||
|
||||
**Step 1: Write the failing test**
|
||||
|
||||
```csharp
|
||||
[Fact]
|
||||
public async Task DisposingSubscription_RemovesDurableSubscriptionIntent()
|
||||
{
|
||||
var client = TestClientFactory.CreateReadyClient();
|
||||
var handle = await client.SubscribeAsync("Pump001.Run", _ => { });
|
||||
|
||||
await handle.DisposeAsync();
|
||||
|
||||
Assert.False(client.HasSubscription("Pump001.Run"));
|
||||
}
|
||||
```
|
||||
|
||||
**Step 2: Run test to verify it fails**
|
||||
|
||||
Run: `dotnet test /Users/dohertj2/Desktop/suitelinkclient/SuiteLink.Client.slnx --filter DisposingSubscription_RemovesDurableSubscriptionIntent -v minimal`
|
||||
Expected: FAIL
|
||||
|
||||
**Step 3: Write minimal implementation**
|
||||
|
||||
Ensure handle disposal removes durable registry entries even when wire unadvise cannot be sent.
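
The ordering this step asks for can be sketched as follows. Everything here is hypothetical: `SketchSubscriptionHandle` and its two delegates stand in for client internals, and the choice to swallow `InvalidOperationException` and `IOException` is an assumption about how a failed unadvise surfaces.

```csharp
using System;
using System.IO;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical handle; the delegates stand in for real client internals.
public sealed class SketchSubscriptionHandle : IAsyncDisposable
{
    private readonly string _itemName;
    private readonly Action<string> _removeDurableIntent;
    private readonly Func<string, Task> _sendUnadvise;
    private int _disposed;

    public SketchSubscriptionHandle(
        string itemName,
        Action<string> removeDurableIntent,
        Func<string, Task> sendUnadvise)
    {
        _itemName = itemName;
        _removeDurableIntent = removeDurableIntent;
        _sendUnadvise = sendUnadvise;
    }

    public async ValueTask DisposeAsync()
    {
        // Idempotent: only the first call does any work.
        if (Interlocked.Exchange(ref _disposed, 1) != 0) return;

        // Remove durable intent first so a later reconnect cannot replay it.
        _removeDurableIntent(_itemName);

        try
        {
            await _sendUnadvise(_itemName).ConfigureAwait(false);
        }
        catch (Exception ex) when (ex is InvalidOperationException or IOException)
        {
            // Best effort: the wire is unavailable, and intent is already gone.
        }
    }
}
```

Removing intent before touching the wire is the design point: a failed send can never leave a subscription that gets replayed after the caller has disposed it.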
|
||||
|
||||
**Step 4: Run test to verify it passes**
|
||||
|
||||
Run: `dotnet test /Users/dohertj2/Desktop/suitelinkclient/SuiteLink.Client.slnx --filter SuiteLinkClientSubscriptionRegistryTests -v minimal`
|
||||
Expected: PASS
|
||||
|
||||
**Step 5: Commit**
|
||||
|
||||
```bash
|
||||
git add /Users/dohertj2/Desktop/suitelinkclient/src/SuiteLink.Client/SuiteLinkClient.cs /Users/dohertj2/Desktop/suitelinkclient/src/SuiteLink.Client/SubscriptionHandle.cs /Users/dohertj2/Desktop/suitelinkclient/tests/SuiteLink.Client.Tests/SuiteLinkClientSubscriptionRegistryTests.cs
|
||||
git commit -m "feat: remove durable subscription intent on handle disposal"
|
||||
```
|
||||
|
||||
### Task 3: Add Runtime State For Background Loop
|
||||
|
||||
**Files:**
|
||||
- Modify: `/Users/dohertj2/Desktop/suitelinkclient/src/SuiteLink.Client/Internal/SuiteLinkSessionState.cs`
|
||||
- Modify: `/Users/dohertj2/Desktop/suitelinkclient/src/SuiteLink.Client/SuiteLinkClient.cs`
|
||||
- Test: `/Users/dohertj2/Desktop/suitelinkclient/tests/SuiteLink.Client.Tests/SuiteLinkClientConnectionTests.cs`
|
||||
|
||||
**Step 1: Write the failing test**
|
||||
|
||||
```csharp
|
||||
[Fact]
|
||||
public async Task ConnectAsync_TransitionsToReadyOnlyAfterRuntimeStarts()
|
||||
{
|
||||
var client = TestClientFactory.CreateReadyHandshakeClient();
|
||||
|
||||
await client.ConnectAsync(TestOptions.Create());
|
||||
|
||||
Assert.True(client.IsConnected);
|
||||
}
|
||||
```
|
||||
|
||||
**Step 2: Run test to verify it fails**
|
||||
|
||||
Run: `dotnet test /Users/dohertj2/Desktop/suitelinkclient/SuiteLink.Client.slnx --filter ConnectAsync_TransitionsToReadyOnlyAfterRuntimeStarts -v minimal`
|
||||
Expected: FAIL with missing ready/runtime state
|
||||
|
||||
**Step 3: Write minimal implementation**
|
||||
|
||||
Add:
|
||||
|
||||
- `Ready`
|
||||
- `Reconnecting`
|
||||
|
||||
and transition `ConnectAsync` into `Ready` when the runtime loop has been established.
|
||||
|
||||
**Step 4: Run test to verify it passes**
|
||||
|
||||
Run: `dotnet test /Users/dohertj2/Desktop/suitelinkclient/SuiteLink.Client.slnx --filter SuiteLinkClientConnectionTests -v minimal`
|
||||
Expected: PASS
|
||||
|
||||
**Step 5: Commit**
|
||||
|
||||
```bash
|
||||
git add /Users/dohertj2/Desktop/suitelinkclient/src/SuiteLink.Client/Internal/SuiteLinkSessionState.cs /Users/dohertj2/Desktop/suitelinkclient/src/SuiteLink.Client/SuiteLinkClient.cs /Users/dohertj2/Desktop/suitelinkclient/tests/SuiteLink.Client.Tests/SuiteLinkClientConnectionTests.cs
|
||||
git commit -m "feat: add ready and reconnecting runtime states"
|
||||
```
|
||||
|
||||
### Task 4: Start Background Receive Loop
|
||||
|
||||
**Files:**
|
||||
- Modify: `/Users/dohertj2/Desktop/suitelinkclient/src/SuiteLink.Client/SuiteLinkClient.cs`
|
||||
- Test: `/Users/dohertj2/Desktop/suitelinkclient/tests/SuiteLink.Client.Tests/SuiteLinkClientRuntimeLoopTests.cs`
|
||||
|
||||
**Step 1: Write the failing test**
|
||||
|
||||
```csharp
|
||||
[Fact]
|
||||
public async Task ConnectAsync_StartsBackgroundLoop_AndDispatchesUpdateWithoutManualPolling()
|
||||
{
|
||||
var updateReceived = new TaskCompletionSource<SuiteLinkTagUpdate>();
|
||||
var client = TestClientFactory.CreateClientWithQueuedUpdate(updateReceived);
|
||||
|
||||
await client.ConnectAsync(TestOptions.Create());
|
||||
await client.SubscribeAsync("Pump001.Run", update => updateReceived.TrySetResult(update));
|
||||
|
||||
var update = await updateReceived.Task.WaitAsync(TimeSpan.FromSeconds(1));
|
||||
Assert.True(update.Value.TryGetBoolean(out var value) && value);
|
||||
}
|
||||
```
|
||||
|
||||
**Step 2: Run test to verify it fails**
|
||||
|
||||
Run: `dotnet test /Users/dohertj2/Desktop/suitelinkclient/SuiteLink.Client.slnx --filter SuiteLinkClientRuntimeLoopTests -v minimal`
|
||||
Expected: FAIL because manual processing is still required
|
||||
|
||||
**Step 3: Write minimal implementation**
|
||||
|
||||
Start a long-lived receive loop task after initial connect, and dispatch updates through existing session logic.
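
As a rough illustration of the loop's shape, the sketch below reads chunks until EOF or cancellation and hands each one to a dispatcher. The `receive` delegate is a hypothetical seam for the transport, the 4096-byte buffer is arbitrary, and a real loop would pass bytes through the existing codec and session logic rather than dispatching raw chunks.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class ReceiveLoopSketch
{
    // Reads until EOF or cancellation.
    // Returns true when the loop ended because the peer closed the stream.
    public static async Task<bool> RunAsync(
        Func<Memory<byte>, CancellationToken, ValueTask<int>> receive,
        Action<ReadOnlyMemory<byte>> dispatch,
        CancellationToken cancellationToken)
    {
        var buffer = new byte[4096];

        while (!cancellationToken.IsCancellationRequested)
        {
            int read;
            try
            {
                read = await receive(buffer, cancellationToken).ConfigureAwait(false);
            }
            catch (OperationCanceledException)
            {
                return false; // orderly shutdown, not a disconnect
            }

            if (read == 0)
            {
                return true; // EOF: the caller decides how to recover
            }

            dispatch(buffer.AsMemory(0, read));
        }

        return false;
    }
}
```

The client would start this once after the initial connect, for example with `Task.Run`, and keep the returned task so shutdown can await it.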
|
||||
|
||||
**Step 4: Run test to verify it passes**
|
||||
|
||||
Run: `dotnet test /Users/dohertj2/Desktop/suitelinkclient/SuiteLink.Client.slnx --filter SuiteLinkClientRuntimeLoopTests -v minimal`
|
||||
Expected: PASS
|
||||
|
||||
**Step 5: Commit**
|
||||
|
||||
```bash
|
||||
git add /Users/dohertj2/Desktop/suitelinkclient/src/SuiteLink.Client/SuiteLinkClient.cs /Users/dohertj2/Desktop/suitelinkclient/tests/SuiteLink.Client.Tests/SuiteLinkClientRuntimeLoopTests.cs
|
||||
git commit -m "feat: add suitelink background receive loop"
|
||||
```
|
||||
|
||||
### Task 5: Make ProcessIncomingAsync Internal Or Non-Primary
|
||||
|
||||
**Files:**
|
||||
- Modify: `/Users/dohertj2/Desktop/suitelinkclient/src/SuiteLink.Client/SuiteLinkClient.cs`
|
||||
- Modify: `/Users/dohertj2/Desktop/suitelinkclient/README.md`
|
||||
- Test: `/Users/dohertj2/Desktop/suitelinkclient/tests/SuiteLink.Client.Tests/SuiteLinkClientRuntimeLoopTests.cs`
|
||||
|
||||
**Step 1: Write the failing documentation/runtime check**
|
||||
|
||||
Define the intended runtime contract:
|
||||
|
||||
- normal operation uses background receive
|
||||
- manual polling is not required for normal subscriptions
|
||||
|
||||
**Step 2: Run targeted tests**
|
||||
|
||||
Run: `dotnet test /Users/dohertj2/Desktop/suitelinkclient/SuiteLink.Client.slnx --filter SuiteLinkClientRuntimeLoopTests -v minimal`
|
||||
Expected: PASS after Task 4
|
||||
|
||||
**Step 3: Write minimal implementation**
|
||||
|
||||
Keep `ProcessIncomingAsync` only as an internal/test helper or document it as non-primary API.
|
||||
|
||||
**Step 4: Run test and docs verification**
|
||||
|
||||
Run: `rg -n "background receive|manual polling|ProcessIncomingAsync" /Users/dohertj2/Desktop/suitelinkclient/README.md`
|
||||
Expected: PASS, with matching lines that show the updated wording
|
||||
|
||||
**Step 5: Commit**
|
||||
|
||||
```bash
|
||||
git add /Users/dohertj2/Desktop/suitelinkclient/src/SuiteLink.Client/SuiteLinkClient.cs /Users/dohertj2/Desktop/suitelinkclient/README.md
|
||||
git commit -m "docs: describe background runtime model"
|
||||
```
|
||||
|
||||
### Task 6: Detect EOF And Trigger Reconnect
|
||||
|
||||
**Files:**
|
||||
- Modify: `/Users/dohertj2/Desktop/suitelinkclient/src/SuiteLink.Client/SuiteLinkClient.cs`
|
||||
- Test: `/Users/dohertj2/Desktop/suitelinkclient/tests/SuiteLink.Client.Tests/SuiteLinkClientReconnectTests.cs`
|
||||
|
||||
**Step 1: Write the failing test**
|
||||
|
||||
```csharp
|
||||
[Fact]
|
||||
public async Task ReceiveLoop_Eof_TransitionsToReconnecting()
|
||||
{
|
||||
var client = TestClientFactory.CreateClientThatEofsAfterConnect();
|
||||
|
||||
await client.ConnectAsync(TestOptions.Create());
|
||||
|
||||
await Eventually.AssertAsync(() => Assert.Equal(SuiteLinkSessionState.Reconnecting, client.DebugState));
|
||||
}
|
||||
```
|
||||
|
||||
**Step 2: Run test to verify it fails**
|
||||
|
||||
Run: `dotnet test /Users/dohertj2/Desktop/suitelinkclient/SuiteLink.Client.slnx --filter SuiteLinkClientReconnectTests -v minimal`
|
||||
Expected: FAIL
|
||||
|
||||
**Step 3: Write minimal implementation**
|
||||
|
||||
Treat a `ReceiveAsync` result of `0` bytes (EOF) as a disconnect trigger and start recovery.
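
One way to picture the trigger is a small detector that turns a zero-byte read into a single `Ready` to `Reconnecting` transition. `SketchSessionState` and `DisconnectDetector` are hypothetical names used only for this sketch; the real state lives in `SuiteLinkSessionState`.

```csharp
using System.Threading;

// Hypothetical two-state model, reduced to what this step needs.
public enum SketchSessionState { Ready, Reconnecting }

public sealed class DisconnectDetector
{
    private int _state = (int)SketchSessionState.Ready;

    public SketchSessionState State => (SketchSessionState)Volatile.Read(ref _state);

    // Call with the byte count of each completed receive.
    // Returns true only for the call that moves Ready -> Reconnecting,
    // so the caller starts recovery exactly once.
    public bool OnReceiveCompleted(int bytesRead)
    {
        if (bytesRead != 0)
        {
            return false; // normal data, nothing to do
        }

        var previous = Interlocked.CompareExchange(
            ref _state,
            (int)SketchSessionState.Reconnecting,
            (int)SketchSessionState.Ready);

        return previous == (int)SketchSessionState.Ready;
    }
}
```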
|
||||
|
||||
**Step 4: Run test to verify it passes**
|
||||
|
||||
Run: `dotnet test /Users/dohertj2/Desktop/suitelinkclient/SuiteLink.Client.slnx --filter SuiteLinkClientReconnectTests -v minimal`
|
||||
Expected: PASS
|
||||
|
||||
**Step 5: Commit**
|
||||
|
||||
```bash
|
||||
git add /Users/dohertj2/Desktop/suitelinkclient/src/SuiteLink.Client/SuiteLinkClient.cs /Users/dohertj2/Desktop/suitelinkclient/tests/SuiteLink.Client.Tests/SuiteLinkClientReconnectTests.cs
|
||||
git commit -m "feat: detect disconnects and enter reconnect state"
|
||||
```
|
||||
|
||||
### Task 7: Add Bounded Reconnect Backoff
|
||||
|
||||
**Files:**
|
||||
- Modify: `/Users/dohertj2/Desktop/suitelinkclient/src/SuiteLink.Client/SuiteLinkClient.cs`
|
||||
- Test: `/Users/dohertj2/Desktop/suitelinkclient/tests/SuiteLink.Client.Tests/SuiteLinkClientReconnectTests.cs`
|
||||
|
||||
**Step 1: Write the failing test**
|
||||
|
||||
```csharp
|
||||
[Fact]
|
||||
public async Task Reconnect_UsesBoundedRetrySchedule()
|
||||
{
|
||||
var delays = new List<TimeSpan>();
|
||||
var client = TestClientFactory.CreateReconnectTestClient(delays);
|
||||
|
||||
await client.ConnectAsync(TestOptions.Create());
|
||||
|
||||
Assert.Contains(TimeSpan.FromSeconds(1), delays);
|
||||
}
|
||||
```
|
||||
|
||||
**Step 2: Run test to verify it fails**
|
||||
|
||||
Run: `dotnet test /Users/dohertj2/Desktop/suitelinkclient/SuiteLink.Client.slnx --filter Reconnect_UsesBoundedRetrySchedule -v minimal`
|
||||
Expected: FAIL
|
||||
|
||||
**Step 3: Write minimal implementation**
|
||||
|
||||
Add a small capped delay schedule:
|
||||
|
||||
- 0s
|
||||
- 1s
|
||||
- 2s
|
||||
- 5s
|
||||
- 10s capped
|
||||
|
||||
Inject delay behavior for tests if needed.
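
The schedule above fits in a few lines. In this sketch `ReconnectBackoff` and `DelayFor` are hypothetical names, and the attempt index is assumed to be zero-based.

```csharp
using System;

public static class ReconnectBackoff
{
    // The capped schedule from this task: 0s, 1s, 2s, 5s, then 10s.
    private static readonly TimeSpan[] Schedule =
    {
        TimeSpan.Zero,
        TimeSpan.FromSeconds(1),
        TimeSpan.FromSeconds(2),
        TimeSpan.FromSeconds(5),
        TimeSpan.FromSeconds(10),
    };

    // attempt is zero-based; attempts past the end reuse the final delay.
    public static TimeSpan DelayFor(int attempt)
    {
        if (attempt < 0)
        {
            throw new ArgumentOutOfRangeException(nameof(attempt));
        }

        return Schedule[Math.Min(attempt, Schedule.Length - 1)];
    }
}
```

For test injection, the reconnect loop could accept a `Func<TimeSpan, CancellationToken, Task>` delay delegate that defaults to `Task.Delay`, so tests can record requested delays instead of waiting.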
|
||||
|
||||
**Step 4: Run test to verify it passes**
|
||||
|
||||
Run: `dotnet test /Users/dohertj2/Desktop/suitelinkclient/SuiteLink.Client.slnx --filter SuiteLinkClientReconnectTests -v minimal`
|
||||
Expected: PASS
|
||||
|
||||
**Step 5: Commit**
|
||||
|
||||
```bash
|
||||
git add /Users/dohertj2/Desktop/suitelinkclient/src/SuiteLink.Client/SuiteLinkClient.cs /Users/dohertj2/Desktop/suitelinkclient/tests/SuiteLink.Client.Tests/SuiteLinkClientReconnectTests.cs
|
||||
git commit -m "feat: add bounded reconnect backoff"
|
||||
```
|
||||
|
||||
### Task 8: Replay Subscriptions After Reconnect
|
||||
|
||||
**Files:**
|
||||
- Modify: `/Users/dohertj2/Desktop/suitelinkclient/src/SuiteLink.Client/SuiteLinkClient.cs`
|
||||
- Modify: `/Users/dohertj2/Desktop/suitelinkclient/src/SuiteLink.Client/Internal/SuiteLinkSession.cs`
|
||||
- Test: `/Users/dohertj2/Desktop/suitelinkclient/tests/SuiteLink.Client.Tests/SuiteLinkClientReconnectTests.cs`
|
||||
|
||||
**Step 1: Write the failing test**
|
||||
|
||||
```csharp
|
||||
[Fact]
|
||||
public async Task Reconnect_ReplaysSubscriptions_AndRestoresDispatch()
|
||||
{
|
||||
var callbackCount = 0;
|
||||
var client = TestClientFactory.CreateReconnectReplayClient(() => callbackCount++);
|
||||
|
||||
await client.ConnectAsync(TestOptions.Create());
|
||||
await client.SubscribeAsync("Pump001.Run", _ => callbackCount++);
|
||||
|
||||
await client.WaitForReconnectReadyAsync();
|
||||
Assert.True(callbackCount > 0);
|
||||
}
|
||||
```
|
||||
|
||||
**Step 2: Run test to verify it fails**
|
||||
|
||||
Run: `dotnet test /Users/dohertj2/Desktop/suitelinkclient/SuiteLink.Client.slnx --filter Reconnect_ReplaysSubscriptions_AndRestoresDispatch -v minimal`
|
||||
Expected: FAIL
|
||||
|
||||
**Step 3: Write minimal implementation**
|
||||
|
||||
On successful reconnect:
|
||||
|
||||
- reset live session mappings
|
||||
- replay all durable subscriptions one-by-one
|
||||
- rebuild tag mappings from fresh ACKs
|
||||
- return to `Ready`
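
The replay steps above can be sketched as a loop that rebuilds the live mapping from scratch. `subscribeAsync` is a hypothetical seam that sends one subscription request and returns the tag id from its ACK; the `uint` tag id type is also an assumption.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public static class SubscriptionReplaySketch
{
    // Replays each durable item in order and returns a fresh itemName -> tagId map.
    // Tag ids from the previous session are never reused.
    public static async Task<Dictionary<string, uint>> ReplayAsync(
        IEnumerable<string> durableItemNames,
        Func<string, CancellationToken, Task<uint>> subscribeAsync,
        CancellationToken cancellationToken)
    {
        var liveMappings = new Dictionary<string, uint>(StringComparer.Ordinal);

        foreach (var itemName in durableItemNames)
        {
            // Stop promptly if the client is shutting down mid-replay.
            cancellationToken.ThrowIfCancellationRequested();

            liveMappings[itemName] =
                await subscribeAsync(itemName, cancellationToken).ConfigureAwait(false);
        }

        return liveMappings;
    }
}
```

Only after this returns without throwing would the client swap in the new mappings and move back to `Ready`; a failure partway through would leave the client in `Reconnecting` for another attempt.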
|
||||
|
||||
**Step 4: Run test to verify it passes**
|
||||
|
||||
Run: `dotnet test /Users/dohertj2/Desktop/suitelinkclient/SuiteLink.Client.slnx --filter SuiteLinkClientReconnectTests -v minimal`
|
||||
Expected: PASS
|
||||
|
||||
**Step 5: Commit**
|
||||
|
||||
```bash
|
||||
git add /Users/dohertj2/Desktop/suitelinkclient/src/SuiteLink.Client/SuiteLinkClient.cs /Users/dohertj2/Desktop/suitelinkclient/src/SuiteLink.Client/Internal/SuiteLinkSession.cs /Users/dohertj2/Desktop/suitelinkclient/tests/SuiteLink.Client.Tests/SuiteLinkClientReconnectTests.cs
|
||||
git commit -m "feat: replay subscriptions after reconnect"
|
||||
```
|
||||
|
||||
### Task 9: Reject Writes During Reconnect
|
||||
|
||||
**Files:**
|
||||
- Modify: `/Users/dohertj2/Desktop/suitelinkclient/src/SuiteLink.Client/SuiteLinkClient.cs`
|
||||
- Test: `/Users/dohertj2/Desktop/suitelinkclient/tests/SuiteLink.Client.Tests/SuiteLinkClientWriteTests.cs`
|
||||
|
||||
**Step 1: Write the failing test**
|
||||
|
||||
```csharp
|
||||
[Fact]
|
||||
public async Task WriteAsync_DuringReconnect_ThrowsClearException()
|
||||
{
|
||||
var client = TestClientFactory.CreateReconnectingClient();
|
||||
|
||||
await Assert.ThrowsAsync<InvalidOperationException>(() =>
|
||||
client.WriteAsync("Pump001.Run", SuiteLinkValue.FromBoolean(true)));
|
||||
}
|
||||
```
|
||||
|
||||
**Step 2: Run test to verify it fails**
|
||||
|
||||
Run: `dotnet test /Users/dohertj2/Desktop/suitelinkclient/SuiteLink.Client.slnx --filter WriteAsync_DuringReconnect_ThrowsClearException -v minimal`
|
||||
Expected: FAIL
|
||||
|
||||
**Step 3: Write minimal implementation**
|
||||
|
||||
Guard `WriteAsync` so it succeeds only in `Ready`.
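
A guard of this kind might look like the sketch below. `SketchSessionState` and `WriteGuard` are hypothetical names; the exception type matches the `InvalidOperationException` the test above expects.

```csharp
using System;

// Hypothetical state model for this sketch only.
public enum SketchSessionState { Disconnected, Ready, Reconnecting }

public static class WriteGuard
{
    // Throws unless the client is in Ready, so callers get a clear,
    // immediate failure instead of a write queued against a dead session.
    public static void EnsureReady(SketchSessionState state)
    {
        if (state != SketchSessionState.Ready)
        {
            throw new InvalidOperationException(
                $"Writes require the Ready state; current state is {state}.");
        }
    }
}
```

`WriteAsync` would call `EnsureReady` before looking up the tag id or touching the transport.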
|
||||
|
||||
**Step 4: Run test to verify it passes**
|
||||
|
||||
Run: `dotnet test /Users/dohertj2/Desktop/suitelinkclient/SuiteLink.Client.slnx --filter SuiteLinkClientWriteTests -v minimal`
|
||||
Expected: PASS
|
||||
|
||||
**Step 5: Commit**
|
||||
|
||||
```bash
|
||||
git add /Users/dohertj2/Desktop/suitelinkclient/src/SuiteLink.Client/SuiteLinkClient.cs /Users/dohertj2/Desktop/suitelinkclient/tests/SuiteLink.Client.Tests/SuiteLinkClientWriteTests.cs
|
||||
git commit -m "feat: reject writes while reconnecting"
|
||||
```
|
||||
|
||||
### Task 10: Stop Runtime Cleanly On Disconnect
|
||||
|
||||
**Files:**
|
||||
- Modify: `/Users/dohertj2/Desktop/suitelinkclient/src/SuiteLink.Client/SuiteLinkClient.cs`
|
||||
- Test: `/Users/dohertj2/Desktop/suitelinkclient/tests/SuiteLink.Client.Tests/SuiteLinkClientConnectionTests.cs`
|
||||
|
||||
**Step 1: Write the failing test**
|
||||
|
||||
```csharp
|
||||
[Fact]
|
||||
public async Task DisconnectAsync_StopsReceiveAndReconnectLoops()
|
||||
{
|
||||
var client = TestClientFactory.CreateRunningClient();
|
||||
|
||||
await client.ConnectAsync(TestOptions.Create());
|
||||
await client.DisconnectAsync();
|
||||
|
||||
Assert.False(client.IsConnected);
|
||||
}
|
||||
```
|
||||
|
||||
**Step 2: Run test to verify it fails**
|
||||
|
||||
Run: `dotnet test /Users/dohertj2/Desktop/suitelinkclient/SuiteLink.Client.slnx --filter DisconnectAsync_StopsReceiveAndReconnectLoops -v minimal`
|
||||
Expected: FAIL
|
||||
|
||||
**Step 3: Write minimal implementation**
|
||||
|
||||
Cancel runtime loop tokens and stop reconnect attempts on disconnect/dispose.
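
A minimal sketch of the cancel-then-await pattern, assuming a single loop task owned by the client. `RuntimeLoopOwner` is a hypothetical name, and the sketch leaves out disposing the `CancellationTokenSource` for brevity.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical owner of the background loop task and its cancellation source.
public sealed class RuntimeLoopOwner
{
    private readonly CancellationTokenSource _cts = new();
    private Task _loop = Task.CompletedTask;

    public bool IsStopped => _loop.IsCompleted;

    // Starts the loop on the thread pool with the shared shutdown token.
    public void Start(Func<CancellationToken, Task> loop) =>
        _loop = Task.Run(() => loop(_cts.Token));

    // Signals cancellation, then waits for the loop to observe it.
    public async Task StopAsync()
    {
        _cts.Cancel();

        try
        {
            await _loop.ConfigureAwait(false);
        }
        catch (OperationCanceledException)
        {
            // Expected when the loop exits through its token.
        }
    }
}
```

Awaiting the loop after cancelling is what lets `DisconnectAsync` return only once no background work can still touch the transport.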
|
||||
|
||||
**Step 4: Run test to verify it passes**
|
||||
|
||||
Run: `dotnet test /Users/dohertj2/Desktop/suitelinkclient/SuiteLink.Client.slnx --filter SuiteLinkClientConnectionTests -v minimal`
|
||||
Expected: PASS
|
||||
|
||||
**Step 5: Commit**
|
||||
|
||||
```bash
|
||||
git add /Users/dohertj2/Desktop/suitelinkclient/src/SuiteLink.Client/SuiteLinkClient.cs /Users/dohertj2/Desktop/suitelinkclient/tests/SuiteLink.Client.Tests/SuiteLinkClientConnectionTests.cs
|
||||
git commit -m "feat: stop runtime loops on disconnect"
|
||||
```
|
||||
|
||||
### Task 11: Update README And Integration Docs
|
||||
|
||||
**Files:**
|
||||
- Modify: `/Users/dohertj2/Desktop/suitelinkclient/README.md`
|
||||
- Modify: `/Users/dohertj2/Desktop/suitelinkclient/tests/SuiteLink.Client.IntegrationTests/README.md`
|
||||
|
||||
**Step 1: Write the failing documentation check**
|
||||
|
||||
Define required README terms:
|
||||
|
||||
- background receive loop
|
||||
- automatic reconnect
|
||||
- subscription replay
|
||||
- writes rejected during reconnect
|
||||
|
||||
**Step 2: Run documentation review**
|
||||
|
||||
Run: `rg -n "background receive|automatic reconnect|subscription replay|reconnecting" /Users/dohertj2/Desktop/suitelinkclient/README.md /Users/dohertj2/Desktop/suitelinkclient/tests/SuiteLink.Client.IntegrationTests/README.md`
|
||||
Expected: FAIL until docs are updated
|
||||
|
||||
**Step 3: Write minimal implementation**
|
||||
|
||||
Update docs to describe the runtime model and recovery behavior honestly.
|
||||
|
||||
**Step 4: Run documentation review**
|
||||
|
||||
Run: `rg -n "background receive|automatic reconnect|subscription replay|reconnecting" /Users/dohertj2/Desktop/suitelinkclient/README.md /Users/dohertj2/Desktop/suitelinkclient/tests/SuiteLink.Client.IntegrationTests/README.md`
|
||||
Expected: PASS
|
||||
|
||||
**Step 5: Commit**
|
||||
|
||||
```bash
|
||||
git add /Users/dohertj2/Desktop/suitelinkclient/README.md /Users/dohertj2/Desktop/suitelinkclient/tests/SuiteLink.Client.IntegrationTests/README.md
|
||||
git commit -m "docs: describe runtime reconnect behavior"
|
||||
```
|
||||
|
||||
### Task 12: Full Verification Pass
|
||||
|
||||
**Files:**
|
||||
- Modify: `/Users/dohertj2/Desktop/suitelinkclient/docs/plans/2026-03-17-runtime-reconnect-design.md`
|
||||
- Modify: `/Users/dohertj2/Desktop/suitelinkclient/docs/plans/2026-03-17-runtime-reconnect-implementation-plan.md`
|
||||
|
||||
**Step 1: Run full test suite**
|
||||
|
||||
Run: `dotnet test /Users/dohertj2/Desktop/suitelinkclient/SuiteLink.Client.slnx -v minimal`
|
||||
Expected: PASS with integration harness still conditional by default
|
||||
|
||||
**Step 2: Run release build**
|
||||
|
||||
Run: `dotnet build /Users/dohertj2/Desktop/suitelinkclient/SuiteLink.Client.slnx -c Release`
|
||||
Expected: PASS
|
||||
|
||||
**Step 3: Run reconnect-focused tests**
|
||||
|
||||
Run: `dotnet test /Users/dohertj2/Desktop/suitelinkclient/SuiteLink.Client.slnx --filter Reconnect -v minimal`
|
||||
Expected: PASS
|
||||
|
||||
**Step 4: Update plan notes if implementation deviated**
|
||||
|
||||
Add short notes to the design/plan docs if final runtime behavior differs from original assumptions.
|
||||
|
||||
**Step 5: Commit**
|
||||
|
||||
```bash
|
||||
git add /Users/dohertj2/Desktop/suitelinkclient/docs/plans/2026-03-17-runtime-reconnect-design.md /Users/dohertj2/Desktop/suitelinkclient/docs/plans/2026-03-17-runtime-reconnect-implementation-plan.md
|
||||
git commit -m "docs: finalize reconnect implementation verification"
|
||||
```
|
||||