# Catch-Up Replay And Advanced Retry Implementation Plan

> **For Codex:** REQUIRED SUB-SKILL: Use `executeplan` to implement this plan task-by-task.

**Goal:** Add best-effort latest-value catch-up after reconnect and replace the fixed reconnect delay schedule with a production-grade retry policy, while also fixing the current reconnect quality issues.

**Architecture:** Extend the existing reconnect runtime with a small runtime-options layer, a retry-policy calculator, and a post-reconnect catch-up refresh phase. Keep reconnect success defined as restored live subscriptions, and treat catch-up as a best-effort follow-on phase that emits synthetic updates marked separately from live traffic.

**Tech Stack:** .NET 10, C#, xUnit, existing SuiteLink protocol/client/runtime/transport layers

---
### Task 1: Add Runtime Option Types
**Files:**
- Create: `/Users/dohertj2/Desktop/suitelinkclient/src/SuiteLink.Client/SuiteLinkRuntimeOptions.cs`
- Create: `/Users/dohertj2/Desktop/suitelinkclient/src/SuiteLink.Client/SuiteLinkRetryPolicy.cs`
- Create: `/Users/dohertj2/Desktop/suitelinkclient/src/SuiteLink.Client/SuiteLinkCatchUpPolicy.cs`
- Modify: `/Users/dohertj2/Desktop/suitelinkclient/src/SuiteLink.Client/SuiteLinkConnectionOptions.cs`
- Test: `/Users/dohertj2/Desktop/suitelinkclient/tests/SuiteLink.Client.Tests/SuiteLinkConnectionOptionsTests.cs`
**Step 1: Write the failing test**
```csharp
[Fact]
public void ConnectionOptions_DefaultsRuntimeOptions()
{
    var options = new SuiteLinkConnectionOptions(
        host: "127.0.0.1",
        application: "App",
        topic: "Topic",
        clientName: "Client",
        clientNode: "Node",
        userName: "User",
        serverNode: "Server");

    Assert.NotNull(options.Runtime);
    Assert.Equal(SuiteLinkCatchUpPolicy.None, options.Runtime.CatchUpPolicy);
    Assert.NotNull(options.Runtime.RetryPolicy);
}
```
**Step 2: Run test to verify it fails**
Run: `dotnet test /Users/dohertj2/Desktop/suitelinkclient/SuiteLink.Client.slnx --filter ConnectionOptions_DefaultsRuntimeOptions -v minimal`
Expected: FAIL because runtime options do not exist yet
**Step 3: Write minimal implementation**
Create:
```csharp
public enum SuiteLinkCatchUpPolicy
{
    None = 0,
    RefreshLatestValue = 1
}
```
```csharp
public sealed record class SuiteLinkRetryPolicy(
    TimeSpan InitialDelay,
    double Multiplier,
    TimeSpan MaxDelay,
    int? MaxAttempts = null,
    bool UseJitter = true)
{
    public static SuiteLinkRetryPolicy Default { get; } =
        new(TimeSpan.FromSeconds(1), 2.0, TimeSpan.FromSeconds(30));
}
```
```csharp
public sealed record class SuiteLinkRuntimeOptions(
    SuiteLinkRetryPolicy RetryPolicy,
    SuiteLinkCatchUpPolicy CatchUpPolicy,
    TimeSpan CatchUpTimeout)
{
    public static SuiteLinkRuntimeOptions Default { get; } =
        new(SuiteLinkRetryPolicy.Default, SuiteLinkCatchUpPolicy.None, TimeSpan.FromSeconds(2));
}
```
Update `SuiteLinkConnectionOptions` to expose:
```csharp
// init-only so that `with { Runtime = ... }` (as used by the Task 3 test) can set it
public SuiteLinkRuntimeOptions Runtime { get; init; }
```
and default it to `SuiteLinkRuntimeOptions.Default`.
**Step 4: Run test to verify it passes**
Run: `dotnet test /Users/dohertj2/Desktop/suitelinkclient/SuiteLink.Client.slnx --filter SuiteLinkConnectionOptionsTests -v minimal`
Expected: PASS
**Step 5: Commit**
```bash
git add /Users/dohertj2/Desktop/suitelinkclient/src/SuiteLink.Client/SuiteLinkRuntimeOptions.cs /Users/dohertj2/Desktop/suitelinkclient/src/SuiteLink.Client/SuiteLinkRetryPolicy.cs /Users/dohertj2/Desktop/suitelinkclient/src/SuiteLink.Client/SuiteLinkCatchUpPolicy.cs /Users/dohertj2/Desktop/suitelinkclient/src/SuiteLink.Client/SuiteLinkConnectionOptions.cs /Users/dohertj2/Desktop/suitelinkclient/tests/SuiteLink.Client.Tests/SuiteLinkConnectionOptionsTests.cs
git commit -m "feat: add runtime reconnect option types"
```
### Task 2: Add Retry Policy Delay Calculator
**Files:**
- Create: `/Users/dohertj2/Desktop/suitelinkclient/src/SuiteLink.Client/Internal/SuiteLinkRetryDelayCalculator.cs`
- Test: `/Users/dohertj2/Desktop/suitelinkclient/tests/SuiteLink.Client.Tests/Internal/SuiteLinkRetryDelayCalculatorTests.cs`
**Step 1: Write the failing test**
```csharp
[Fact]
public void GetDelay_UsesImmediateThenExponentialCap()
{
    var policy = new SuiteLinkRetryPolicy(
        InitialDelay: TimeSpan.FromSeconds(1),
        Multiplier: 2.0,
        MaxDelay: TimeSpan.FromSeconds(30),
        UseJitter: false);

    Assert.Equal(TimeSpan.Zero, SuiteLinkRetryDelayCalculator.GetDelay(policy, 0));
    Assert.Equal(TimeSpan.FromSeconds(1), SuiteLinkRetryDelayCalculator.GetDelay(policy, 1));
    Assert.Equal(TimeSpan.FromSeconds(2), SuiteLinkRetryDelayCalculator.GetDelay(policy, 2));
    Assert.Equal(TimeSpan.FromSeconds(4), SuiteLinkRetryDelayCalculator.GetDelay(policy, 3));
}
```
**Step 2: Run test to verify it fails**
Run: `dotnet test /Users/dohertj2/Desktop/suitelinkclient/SuiteLink.Client.slnx --filter SuiteLinkRetryDelayCalculatorTests -v minimal`
Expected: FAIL because calculator does not exist yet
**Step 3: Write minimal implementation**
Create:
```csharp
internal static class SuiteLinkRetryDelayCalculator
{
    public static TimeSpan GetDelay(SuiteLinkRetryPolicy policy, int attempt)
    {
        // Attempt 0 is the immediate first retry.
        if (attempt == 0)
        {
            return TimeSpan.Zero;
        }

        // Exponential growth from InitialDelay, capped at MaxDelay.
        var rawSeconds = policy.InitialDelay.TotalSeconds * Math.Pow(policy.Multiplier, attempt - 1);
        var bounded = TimeSpan.FromSeconds(Math.Min(rawSeconds, policy.MaxDelay.TotalSeconds));
        return bounded;
    }
}
```
Do not implement jitter yet. `UseJitter` stays an unused policy flag until a test requires it.
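
To make the expected schedule concrete, the following is a minimal sketch of the capped exponential formula, `min(InitialDelay * Multiplier^(attempt - 1), MaxDelay)`. `BackoffPolicy` and `BackoffSchedule` are hypothetical stand-ins for the plan's types so the block compiles on its own, and the `attempt <= 0` guard is an assumption that goes slightly beyond the `attempt == 0` check above.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for SuiteLinkRetryPolicy, reduced to the fields the formula needs.
public sealed record BackoffPolicy(TimeSpan InitialDelay, double Multiplier, TimeSpan MaxDelay);

public static class BackoffSchedule
{
    // Attempt 0 retries immediately; attempt n >= 1 waits
    // min(InitialDelay * Multiplier^(n - 1), MaxDelay).
    public static TimeSpan GetDelay(BackoffPolicy policy, int attempt)
    {
        if (attempt <= 0)
        {
            return TimeSpan.Zero; // assumption: negative attempts are treated like attempt 0
        }

        var rawSeconds = policy.InitialDelay.TotalSeconds * Math.Pow(policy.Multiplier, attempt - 1);
        return TimeSpan.FromSeconds(Math.Min(rawSeconds, policy.MaxDelay.TotalSeconds));
    }

    // Returns the delays for attempts 0 .. count - 1, in order.
    public static IReadOnlyList<TimeSpan> Take(BackoffPolicy policy, int count)
    {
        var delays = new List<TimeSpan>(count);
        for (var attempt = 0; attempt < count; attempt++)
        {
            delays.Add(GetDelay(policy, attempt));
        }

        return delays;
    }
}
```

With the default values from Task 1 (1 s initial delay, multiplier 2.0, 30 s cap), `Take(policy, 8)` yields 0, 1, 2, 4, 8, 16, 30, 30 seconds: the seventh entry would be 32 s uncapped, so the cap applies from there on.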
**Step 4: Run test to verify it passes**
Run: `dotnet test /Users/dohertj2/Desktop/suitelinkclient/SuiteLink.Client.slnx --filter SuiteLinkRetryDelayCalculatorTests -v minimal`
Expected: PASS
**Step 5: Commit**
```bash
git add /Users/dohertj2/Desktop/suitelinkclient/src/SuiteLink.Client/Internal/SuiteLinkRetryDelayCalculator.cs /Users/dohertj2/Desktop/suitelinkclient/tests/SuiteLink.Client.Tests/Internal/SuiteLinkRetryDelayCalculatorTests.cs
git commit -m "feat: add reconnect retry delay calculator"
```
### Task 3: Wire Retry Policy Into Reconnect Runtime
**Files:**
- Modify: `/Users/dohertj2/Desktop/suitelinkclient/src/SuiteLink.Client/SuiteLinkClient.cs`
- Modify: `/Users/dohertj2/Desktop/suitelinkclient/tests/SuiteLink.Client.Tests/SuiteLinkClientReconnectTests.cs`
**Step 1: Write the failing test**
```csharp
[Fact]
public async Task Reconnect_UsesConfiguredRetryPolicy()
{
    var observed = new List<TimeSpan>();
    var options = CreateOptions() with
    {
        Runtime = new SuiteLinkRuntimeOptions(
            new SuiteLinkRetryPolicy(TimeSpan.FromSeconds(3), 3.0, TimeSpan.FromSeconds(20), UseJitter: false),
            SuiteLinkCatchUpPolicy.None,
            TimeSpan.FromSeconds(2))
    };
    var client = CreateReconnectClient(delayAsync: (delay, _) =>
    {
        observed.Add(delay);
        return Task.CompletedTask;
    });

    await client.ConnectAsync(options);
    await EventuallyReconnectAsync(client);

    Assert.Contains(TimeSpan.FromSeconds(3), observed);
}
```
**Step 2: Run test to verify it fails**
Run: `dotnet test /Users/dohertj2/Desktop/suitelinkclient/SuiteLink.Client.slnx --filter Reconnect_UsesConfiguredRetryPolicy -v minimal`
Expected: FAIL because reconnect still uses a fixed schedule
**Step 3: Write minimal implementation**
In `SuiteLinkClient`:
- remove direct use of `ReconnectDelaySchedule`
- read retry policy from `_connectionOptions!.Runtime.RetryPolicy`
- use `SuiteLinkRetryDelayCalculator.GetDelay(policy, attempt)`
Keep the current injected `_delayAsync` test seam.
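
One possible shape for the resulting loop is sketched below. It is not the existing `SuiteLinkClient` code: `ReconnectLoopSketch`, `tryConnectAsync`, and `getDelay` are hypothetical names, and the handling of `MaxAttempts` (give up and return `false`) is an assumption, since the plan does not say what happens when attempts run out. The delay function and the delay seam are both injected, mirroring the `_delayAsync` test seam.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class ReconnectLoopSketch
{
    // Hypothetical loop: every collaborator is injected so the loop can be
    // exercised without real sockets or timers.
    public static async Task<bool> RunAsync(
        Func<CancellationToken, Task<bool>> tryConnectAsync,
        Func<int, TimeSpan> getDelay,
        Func<TimeSpan, CancellationToken, Task> delayAsync,
        int? maxAttempts,
        CancellationToken cancellationToken = default)
    {
        // A null maxAttempts means "retry until cancelled".
        for (var attempt = 0; maxAttempts is null || attempt < maxAttempts; attempt++)
        {
            cancellationToken.ThrowIfCancellationRequested();

            // Attempt 0 has a zero delay, so the first retry is immediate.
            var delay = getDelay(attempt);
            if (delay > TimeSpan.Zero)
            {
                await delayAsync(delay, cancellationToken).ConfigureAwait(false);
            }

            if (await tryConnectAsync(cancellationToken).ConfigureAwait(false))
            {
                return true;
            }
        }

        // Assumption: exhausting MaxAttempts reports failure instead of throwing.
        return false;
    }
}
```

In the real client, `getDelay` would presumably be `attempt => SuiteLinkRetryDelayCalculator.GetDelay(policy, attempt)` and `delayAsync` would be the existing `_delayAsync` field.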
**Step 4: Run test to verify it passes**
Run: `dotnet test /Users/dohertj2/Desktop/suitelinkclient/SuiteLink.Client.slnx --filter SuiteLinkClientReconnectTests -v minimal`
Expected: PASS
**Step 5: Commit**
```bash
git add /Users/dohertj2/Desktop/suitelinkclient/src/SuiteLink.Client/SuiteLinkClient.cs /Users/dohertj2/Desktop/suitelinkclient/tests/SuiteLink.Client.Tests/SuiteLinkClientReconnectTests.cs
git commit -m "feat: apply retry policy to reconnect runtime"
```
### Task 4: Fix Fast-Fail Writes During Reconnect
**Files:**
- Modify: `/Users/dohertj2/Desktop/suitelinkclient/src/SuiteLink.Client/SuiteLinkClient.cs`
- Modify: `/Users/dohertj2/Desktop/suitelinkclient/tests/SuiteLink.Client.Tests/SuiteLinkClientWriteTests.cs`
**Step 1: Write the failing test**
```csharp
[Fact]
public async Task WriteAsync_DuringReconnect_ThrowsBeforeWaitingOnOperationGate()
{
    var client = CreateClientWithBlockedOperationGateAndReconnectState();

    var ex = await Assert.ThrowsAsync<InvalidOperationException>(
        () => client.WriteAsync("Pump001.Run", SuiteLinkValue.FromBoolean(true)));

    Assert.Contains("reconnecting", ex.Message, StringComparison.OrdinalIgnoreCase);
}
```
**Step 2: Run test to verify it fails**
Run: `dotnet test /Users/dohertj2/Desktop/suitelinkclient/SuiteLink.Client.slnx --filter WriteAsync_DuringReconnect_ThrowsBeforeWaitingOnOperationGate -v minimal`
Expected: FAIL because `WriteAsync` currently waits on `_operationGate` first
**Step 3: Write minimal implementation**
Move the reconnect state check ahead of:
```csharp
await _operationGate.WaitAsync(...)
```
while keeping disposed-state checks intact.
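
The intended ordering can be sketched as follows. This is a heavily reduced, hypothetical class (`WriteOrderingSketch`, `MarkReconnecting`, `HoldGateAsync`, and the `send` delegate are all invented for illustration), not the real `SuiteLinkClient`. The second reconnect check inside the gate is an assumption added to cover a state change while the caller was waiting.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical reduced client that models only the order of the checks.
public sealed class WriteOrderingSketch
{
    private readonly SemaphoreSlim _operationGate = new(1, 1);
    private volatile bool _disposed;
    private volatile bool _reconnecting;

    public void MarkReconnecting(bool value) => _reconnecting = value;
    public void MarkDisposed() => _disposed = true;

    // Test hook: occupies the gate the way a long-running operation would.
    public Task HoldGateAsync() => _operationGate.WaitAsync();

    public async Task WriteAsync(Func<Task> send, CancellationToken cancellationToken = default)
    {
        // 1. Disposed-state check stays first.
        if (_disposed)
        {
            throw new ObjectDisposedException(nameof(WriteOrderingSketch));
        }

        // 2. Reconnect check runs before the gate, so a busy gate cannot delay the failure.
        ThrowIfReconnecting();

        // 3. Only now wait for exclusive access.
        await _operationGate.WaitAsync(cancellationToken).ConfigureAwait(false);
        try
        {
            // Assumption: re-check, because reconnect may have started while waiting.
            ThrowIfReconnecting();
            await send().ConfigureAwait(false);
        }
        finally
        {
            _operationGate.Release();
        }
    }

    private void ThrowIfReconnecting()
    {
        if (_reconnecting)
        {
            throw new InvalidOperationException("Client is reconnecting; writes are rejected until the session is restored.");
        }
    }
}
```

Because both early checks run before the first `await`, the returned task is already faulted when `WriteAsync` returns, even while another caller holds the gate.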
**Step 4: Run test to verify it passes**
Run: `dotnet test /Users/dohertj2/Desktop/suitelinkclient/SuiteLink.Client.slnx --filter SuiteLinkClientWriteTests -v minimal`
Expected: PASS
**Step 5: Commit**
```bash
git add /Users/dohertj2/Desktop/suitelinkclient/src/SuiteLink.Client/SuiteLinkClient.cs /Users/dohertj2/Desktop/suitelinkclient/tests/SuiteLink.Client.Tests/SuiteLinkClientWriteTests.cs
git commit -m "fix: fail writes before reconnect gate contention"
```
### Task 5: Fix Transport Reset Ownership Semantics
**Files:**
- Modify: `/Users/dohertj2/Desktop/suitelinkclient/src/SuiteLink.Client/Transport/SuiteLinkTcpTransport.cs`
- Modify: `/Users/dohertj2/Desktop/suitelinkclient/src/SuiteLink.Client/Transport/ISuiteLinkReconnectableTransport.cs`
- Modify: `/Users/dohertj2/Desktop/suitelinkclient/tests/SuiteLink.Client.Tests/Transport/SuiteLinkTcpTransportTests.cs`
**Step 1: Write the failing test**
```csharp
[Fact]
public async Task ResetConnectionAsync_LeaveOpenTrue_DoesNotDisposeInjectedStream()
{
    var stream = new TrackingStream();
    await using var transport = new SuiteLinkTcpTransport(stream, leaveOpen: true);

    await transport.ResetConnectionAsync();

    Assert.False(stream.WasDisposed);
}
```
**Step 2: Run test to verify it fails**
Run: `dotnet test /Users/dohertj2/Desktop/suitelinkclient/SuiteLink.Client.slnx --filter ResetConnectionAsync_LeaveOpenTrue_DoesNotDisposeInjectedStream -v minimal`
Expected: FAIL because reset currently disposes caller-owned resources
**Step 3: Write minimal implementation**
Update `ResetConnectionAsync` to respect the same ownership rule as `DisposeAsync`:
- if `leaveOpen` is `true`, detach without disposing injected resources
- if `leaveOpen` is `false`, dispose detached resources
Do not broaden interface scope unnecessarily.
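
A minimal sketch of the ownership rule, assuming reset and dispose can share a single release path. `OwnedStreamTransportSketch` is a hypothetical reduced type; the real `SuiteLinkTcpTransport` presumably also manages sockets and connection state that are not modeled here.

```csharp
using System;
using System.IO;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical reduced transport: only the leaveOpen ownership rule is modeled.
public sealed class OwnedStreamTransportSketch : IAsyncDisposable
{
    private readonly bool _leaveOpen;
    private Stream? _stream;

    public OwnedStreamTransportSketch(Stream stream, bool leaveOpen)
    {
        _stream = stream ?? throw new ArgumentNullException(nameof(stream));
        _leaveOpen = leaveOpen;
    }

    public bool IsAttached => _stream is not null;

    // Reset and dispose follow the same rule, so they share one helper.
    public ValueTask ResetConnectionAsync() => ReleaseAsync();

    public ValueTask DisposeAsync() => ReleaseAsync();

    private async ValueTask ReleaseAsync()
    {
        // Always detach, so the transport stops using the stream either way.
        var detached = Interlocked.Exchange(ref _stream, null);
        if (detached is null || _leaveOpen)
        {
            return; // nothing attached, or the caller owns the stream
        }

        await detached.DisposeAsync().ConfigureAwait(false);
    }
}
```

Routing both methods through one helper keeps the two paths from drifting apart again, which appears to be the defect this task addresses.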
**Step 4: Run test to verify it passes**
Run: `dotnet test /Users/dohertj2/Desktop/suitelinkclient/SuiteLink.Client.slnx --filter SuiteLinkTcpTransportTests -v minimal`
Expected: PASS
**Step 5: Commit**
```bash
git add /Users/dohertj2/Desktop/suitelinkclient/src/SuiteLink.Client/Transport/SuiteLinkTcpTransport.cs /Users/dohertj2/Desktop/suitelinkclient/src/SuiteLink.Client/Transport/ISuiteLinkReconnectableTransport.cs /Users/dohertj2/Desktop/suitelinkclient/tests/SuiteLink.Client.Tests/Transport/SuiteLinkTcpTransportTests.cs
git commit -m "fix: preserve transport ownership during reconnect reset"
```
### Task 6: Add Update Source Metadata
**Files:**
- Create: `/Users/dohertj2/Desktop/suitelinkclient/src/SuiteLink.Client/SuiteLinkUpdateSource.cs`
- Modify: `/Users/dohertj2/Desktop/suitelinkclient/src/SuiteLink.Client/SuiteLinkTagUpdate.cs`
- Test: `/Users/dohertj2/Desktop/suitelinkclient/tests/SuiteLink.Client.Tests/SuiteLinkValueTests.cs`
**Step 1: Write the failing test**
```csharp
[Fact]
public void TagUpdate_DefaultSource_IsLive()
{
    var update = new SuiteLinkTagUpdate(
        "Pump001.Run",
        1,
        SuiteLinkValue.FromBoolean(true),
        0x00C0,
        1,
        DateTimeOffset.UtcNow);

    Assert.Equal(SuiteLinkUpdateSource.Live, update.Source);
}
```
**Step 2: Run test to verify it fails**
Run: `dotnet test /Users/dohertj2/Desktop/suitelinkclient/SuiteLink.Client.slnx --filter TagUpdate_DefaultSource_IsLive -v minimal`
Expected: FAIL because source metadata does not exist
**Step 3: Write minimal implementation**
Create:
```csharp
public enum SuiteLinkUpdateSource
{
    Live = 0,
    CatchUpReplay = 1
}
```
Add `Source` to `SuiteLinkTagUpdate` with default:
```csharp
SuiteLinkUpdateSource.Live
```
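
One way to add the member without touching existing call sites is a trailing optional parameter, sketched below. `TagUpdateSketch` and its three leading fields are hypothetical; the real `SuiteLinkTagUpdate` takes six positional arguments whose names are not shown in this plan, and it may not be a positional record at all.

```csharp
using System;

public enum UpdateSourceSketch
{
    Live = 0,
    CatchUpReplay = 1
}

// Hypothetical reduced update record. The trailing optional parameter means
// constructor calls that omit Source still compile and default to Live.
public sealed record TagUpdateSketch(
    string ItemName,
    object Value,
    DateTimeOffset ReceivedAtUtc,
    UpdateSourceSketch Source = UpdateSourceSketch.Live);
```

A catch-up path could then derive its synthetic update from a freshly read value with `update with { Source = UpdateSourceSketch.CatchUpReplay }`, while live dispatch keeps using the existing constructor call unchanged.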
**Step 4: Run test to verify it passes**
Run: `dotnet test /Users/dohertj2/Desktop/suitelinkclient/SuiteLink.Client.slnx --filter SuiteLinkValueTests -v minimal`
Expected: PASS
**Step 5: Commit**
```bash
git add /Users/dohertj2/Desktop/suitelinkclient/src/SuiteLink.Client/SuiteLinkUpdateSource.cs /Users/dohertj2/Desktop/suitelinkclient/src/SuiteLink.Client/SuiteLinkTagUpdate.cs /Users/dohertj2/Desktop/suitelinkclient/tests/SuiteLink.Client.Tests/SuiteLinkValueTests.cs
git commit -m "feat: add update source metadata"
```
### Task 7: Add Best-Effort Catch-Up Refresh Execution
**Files:**
- Modify: `/Users/dohertj2/Desktop/suitelinkclient/src/SuiteLink.Client/SuiteLinkClient.cs`
- Modify: `/Users/dohertj2/Desktop/suitelinkclient/src/SuiteLink.Client/Internal/SubscriptionRegistrationEntry.cs`
- Modify: `/Users/dohertj2/Desktop/suitelinkclient/tests/SuiteLink.Client.Tests/SuiteLinkClientReconnectTests.cs`
**Step 1: Write the failing test**
```csharp
[Fact]
public async Task Reconnect_WithRefreshLatestValue_CanDispatchCatchUpReplay()
{
    SuiteLinkTagUpdate? catchUp = null;
    var client = CreateReconnectReplayClient(
        catchUpPolicy: SuiteLinkCatchUpPolicy.RefreshLatestValue,
        onUpdate: update =>
        {
            if (update.Source == SuiteLinkUpdateSource.CatchUpReplay)
            {
                catchUp = update;
            }
        });

    await client.ConnectAsync(CreateOptionsWithCatchUp());

    Assert.NotNull(catchUp);
    Assert.Equal(SuiteLinkUpdateSource.CatchUpReplay, catchUp.Source);
}
```
**Step 2: Run test to verify it fails**
Run: `dotnet test /Users/dohertj2/Desktop/suitelinkclient/SuiteLink.Client.slnx --filter Reconnect_WithRefreshLatestValue_CanDispatchCatchUpReplay -v minimal`
Expected: FAIL because reconnect only resumes live dispatch today
**Step 3: Write minimal implementation**
After a successful reconnect and durable subscription replay, if `Runtime.CatchUpPolicy == SuiteLinkCatchUpPolicy.RefreshLatestValue`:
- run a sequential refresh pass over durable subscriptions
- obtain one fresh value per item using existing temporary-read machinery or a dedicated internal refresh path
- dispatch synthetic updates with:
```csharp
Source: SuiteLinkUpdateSource.CatchUpReplay
```
Do not fail reconnect if one item refresh fails or times out.
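
The refresh pass described above might look roughly like this. Everything here is a hypothetical stand-in: `readLatestAsync` represents whichever temporary-read or internal refresh path is chosen, and `CatchUpUpdateSketch` replaces `SuiteLinkTagUpdate` so the block compiles on its own.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public enum CatchUpSourceSketch
{
    Live = 0,
    CatchUpReplay = 1
}

public sealed record CatchUpUpdateSketch(string ItemName, object Value, CatchUpSourceSketch Source);

public static class CatchUpPassSketch
{
    // Runs after live subscriptions are restored; returns how many items were refreshed.
    public static async Task<int> RunAsync(
        IEnumerable<string> durableItems,
        Func<string, CancellationToken, Task<object>> readLatestAsync,
        Action<CatchUpUpdateSketch> dispatch,
        CancellationToken cancellationToken = default)
    {
        var dispatched = 0;

        // Sequential on purpose: one item at a time, in registration order.
        foreach (var item in durableItems)
        {
            try
            {
                var value = await readLatestAsync(item, cancellationToken).ConfigureAwait(false);

                // Synthetic update, tagged so consumers can tell it apart from live traffic.
                dispatch(new CatchUpUpdateSketch(item, value, CatchUpSourceSketch.CatchUpReplay));
                dispatched++;
            }
            catch (Exception) when (!cancellationToken.IsCancellationRequested)
            {
                // Best effort: a failed item is skipped and the pass continues.
            }
        }

        return dispatched;
    }
}
```

Note that this sketch also swallows exceptions thrown by the `dispatch` callback itself; whether the real client should do the same is a design decision the plan leaves open.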
**Step 4: Run test to verify it passes**
Run: `dotnet test /Users/dohertj2/Desktop/suitelinkclient/SuiteLink.Client.slnx --filter SuiteLinkClientReconnectTests -v minimal`
Expected: PASS
**Step 5: Commit**
```bash
git add /Users/dohertj2/Desktop/suitelinkclient/src/SuiteLink.Client/SuiteLinkClient.cs /Users/dohertj2/Desktop/suitelinkclient/src/SuiteLink.Client/Internal/SubscriptionRegistrationEntry.cs /Users/dohertj2/Desktop/suitelinkclient/tests/SuiteLink.Client.Tests/SuiteLinkClientReconnectTests.cs
git commit -m "feat: add reconnect catch-up refresh replay"
```
### Task 8: Make Catch-Up Partial Failure Non-Fatal
**Files:**
- Modify: `/Users/dohertj2/Desktop/suitelinkclient/src/SuiteLink.Client/SuiteLinkClient.cs`
- Modify: `/Users/dohertj2/Desktop/suitelinkclient/tests/SuiteLink.Client.Tests/SuiteLinkClientReconnectTests.cs`
**Step 1: Write the failing test**
```csharp
[Fact]
public async Task Reconnect_CatchUpTimeout_DoesNotFailRecoveredSubscriptions()
{
    var client = CreateReconnectReplayClientWithTimedOutRefresh();

    await client.ConnectAsync(CreateOptionsWithCatchUp());

    await Eventually.AssertAsync(() => Assert.True(client.IsConnected));
}
```
**Step 2: Run test to verify it fails**
Run: `dotnet test /Users/dohertj2/Desktop/suitelinkclient/SuiteLink.Client.slnx --filter Reconnect_CatchUpTimeout_DoesNotFailRecoveredSubscriptions -v minimal`
Expected: FAIL if catch-up failure tears down reconnect
**Step 3: Write minimal implementation**
Wrap each refresh item independently:
- timeout per item from `Runtime.CatchUpTimeout`
- swallow each per-item failure, optionally recording an internal debug signal first
- continue to remaining items
Do not change the recovered `Ready`/`Subscribed` state.
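
A per-item timeout wrapper could be built on a linked `CancellationTokenSource`, as in the sketch below. `CatchUpTimeoutSketch.TryRefreshAsync` and its parameter names are hypothetical, and the approach assumes the underlying refresh honors the token it is given; a refresh that ignores cancellation would need a different mechanism.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class CatchUpTimeoutSketch
{
    // Returns (true, value) on success and (false, default) on a per-item
    // timeout or failure. Cancellation of the outer token still propagates.
    public static async Task<(bool Succeeded, T Value)> TryRefreshAsync<T>(
        Func<CancellationToken, Task<T>> refreshAsync,
        TimeSpan timeout,
        CancellationToken outerToken = default)
    {
        // Linked source: cancels on the per-item timeout OR when the outer token fires.
        using var itemCts = CancellationTokenSource.CreateLinkedTokenSource(outerToken);
        itemCts.CancelAfter(timeout);

        try
        {
            var value = await refreshAsync(itemCts.Token).ConfigureAwait(false);
            return (true, value);
        }
        catch (OperationCanceledException) when (!outerToken.IsCancellationRequested)
        {
            // Only the per-item timeout fired: swallow and report failure.
            return (false, default(T)!);
        }
        catch (Exception ex) when (ex is not OperationCanceledException)
        {
            // Any other per-item failure is also non-fatal.
            return (false, default(T)!);
        }
    }
}
```

The caller would pass `Runtime.CatchUpTimeout` as `timeout` and simply move on to the next item when `Succeeded` is `false`, leaving the recovered connection state untouched.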
**Step 4: Run test to verify it passes**
Run: `dotnet test /Users/dohertj2/Desktop/suitelinkclient/SuiteLink.Client.slnx --filter SuiteLinkClientReconnectTests -v minimal`
Expected: PASS
**Step 5: Commit**
```bash
git add /Users/dohertj2/Desktop/suitelinkclient/src/SuiteLink.Client/SuiteLinkClient.cs /Users/dohertj2/Desktop/suitelinkclient/tests/SuiteLink.Client.Tests/SuiteLinkClientReconnectTests.cs
git commit -m "feat: tolerate partial catch-up refresh failures"
```
### Task 9: Add Jitter Coverage Without Flaky Tests
**Files:**
- Modify: `/Users/dohertj2/Desktop/suitelinkclient/src/SuiteLink.Client/Internal/SuiteLinkRetryDelayCalculator.cs`
- Modify: `/Users/dohertj2/Desktop/suitelinkclient/tests/SuiteLink.Client.Tests/Internal/SuiteLinkRetryDelayCalculatorTests.cs`
**Step 1: Write the failing test**
```csharp
[Fact]
public void GetDelay_WithJitterEnabled_StaysWithinCap()
{
    var policy = new SuiteLinkRetryPolicy(
        InitialDelay: TimeSpan.FromSeconds(2),
        Multiplier: 2.0,
        MaxDelay: TimeSpan.FromSeconds(10),
        UseJitter: true);

    var delay = SuiteLinkRetryDelayCalculator.GetDelay(policy, 3, () => 0.5);

    Assert.InRange(delay, TimeSpan.Zero, TimeSpan.FromSeconds(10));
}
```
**Step 2: Run test to verify it fails**
Run: `dotnet test /Users/dohertj2/Desktop/suitelinkclient/SuiteLink.Client.slnx --filter SuiteLinkRetryDelayCalculatorTests -v minimal`
Expected: FAIL because jitter injection does not exist yet
**Step 3: Write minimal implementation**
Extend `GetDelay` with an optional injected random source:
```csharp
public static TimeSpan GetDelay(SuiteLinkRetryPolicy policy, int attempt, Func<double>? nextDouble = null)
```
When jitter is enabled:
- compute bounded base delay
- apply the injected random value (deterministic in tests, a real random source otherwise)
- keep final value within `[0, MaxDelay]`
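
The plan does not fix a jitter formula, so the sketch below assumes "full jitter": the bounded base delay is scaled by a factor in `[0, 1]`, which keeps the result within `[0, MaxDelay]` by construction. `JitterPolicySketch` and `JitterDelaySketch` are hypothetical stand-ins for the plan's types.

```csharp
using System;

// Hypothetical stand-in for SuiteLinkRetryPolicy.
public sealed record JitterPolicySketch(
    TimeSpan InitialDelay,
    double Multiplier,
    TimeSpan MaxDelay,
    bool UseJitter = true);

public static class JitterDelaySketch
{
    public static TimeSpan GetDelay(JitterPolicySketch policy, int attempt, Func<double>? nextDouble = null)
    {
        if (attempt <= 0)
        {
            return TimeSpan.Zero;
        }

        // Bounded base delay: exponential growth capped at MaxDelay.
        var rawSeconds = policy.InitialDelay.TotalSeconds * Math.Pow(policy.Multiplier, attempt - 1);
        var baseSeconds = Math.Min(rawSeconds, policy.MaxDelay.TotalSeconds);

        if (!policy.UseJitter)
        {
            return TimeSpan.FromSeconds(baseSeconds);
        }

        // Tests inject a fixed sample; production falls back to the shared generator.
        var sample = nextDouble is null ? Random.Shared.NextDouble() : nextDouble();

        // Assumption: full jitter. Clamping guards against an injected source
        // that returns values outside [0, 1].
        var factor = Math.Clamp(sample, 0.0, 1.0);
        return TimeSpan.FromSeconds(baseSeconds * factor);
    }
}
```

With the policy from the test above (2 s initial delay, multiplier 2.0, 10 s cap), attempt 3 has a base delay of 8 s, so an injected sample of 0.5 gives 4 s. Full jitter can produce delays near zero; a variant that keeps a minimum floor is a reasonable alternative if very short retries are undesirable.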
**Step 4: Run test to verify it passes**
Run: `dotnet test /Users/dohertj2/Desktop/suitelinkclient/SuiteLink.Client.slnx --filter SuiteLinkRetryDelayCalculatorTests -v minimal`
Expected: PASS
**Step 5: Commit**
```bash
git add /Users/dohertj2/Desktop/suitelinkclient/src/SuiteLink.Client/Internal/SuiteLinkRetryDelayCalculator.cs /Users/dohertj2/Desktop/suitelinkclient/tests/SuiteLink.Client.Tests/Internal/SuiteLinkRetryDelayCalculatorTests.cs
git commit -m "feat: add deterministic jitter coverage for retry policy"
```
### Task 10: Update Documentation And Final Verification
**Files:**
- Modify: `/Users/dohertj2/Desktop/suitelinkclient/README.md`
- Modify: `/Users/dohertj2/Desktop/suitelinkclient/tests/SuiteLink.Client.IntegrationTests/README.md`
- Modify: `/Users/dohertj2/Desktop/suitelinkclient/docs/plans/2026-03-17-catchup-retry-design.md`
- Modify: `/Users/dohertj2/Desktop/suitelinkclient/docs/plans/2026-03-17-catchup-retry-implementation-plan.md`
**Step 1: Write the documentation diff**
Document:
- catch-up mode is latest-value refresh only
- retry policy is configurable and jittered by default
- reconnect success is separate from best-effort catch-up completion
- writes still fail during reconnect
**Step 2: Run targeted verification**
Run: `rg -n "catch-up|retry|reconnect|jitter|refresh latest|reconnecting" /Users/dohertj2/Desktop/suitelinkclient/README.md /Users/dohertj2/Desktop/suitelinkclient/tests/SuiteLink.Client.IntegrationTests/README.md`
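Expected: matches in both files, each reflecting the updated wording (`rg` prints matching lines and exits non-zero when nothing matches)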
**Step 3: Run full verification**
Run: `dotnet test /Users/dohertj2/Desktop/suitelinkclient/SuiteLink.Client.slnx -v minimal`
Expected: PASS
**Step 4: Commit**
```bash
git add /Users/dohertj2/Desktop/suitelinkclient/README.md /Users/dohertj2/Desktop/suitelinkclient/tests/SuiteLink.Client.IntegrationTests/README.md /Users/dohertj2/Desktop/suitelinkclient/docs/plans/2026-03-17-catchup-retry-design.md /Users/dohertj2/Desktop/suitelinkclient/docs/plans/2026-03-17-catchup-retry-implementation-plan.md
git commit -m "docs: describe catch-up replay and retry policy"
```