Files
lmxopcua/src/ZB.MOM.WW.OtOpcUa.Core/Resilience/CapabilityInvoker.cs
Joseph Doherty ae8f226e45 Phase 6.1 Stream E.3 partial — in-flight counter feeds CurrentBulkheadDepth
Closes the observer half of #162 that was flagged as "persisted as 0 today"
in PR #105. The Admin /hosts column refresh + FleetStatusHub SignalR push
+ red-badge visual still belong to the visual-compliance pass.

Core.Resilience:
- DriverResilienceStatusTracker gains RecordCallStart + RecordCallComplete
  + CurrentInFlight field on the snapshot record. Concurrent-safe via the
  same ConcurrentDictionary.AddOrUpdate pattern as the other recorder methods.
  Clamps to zero on over-decrement so a stray Complete-without-Start can't
  drive the counter negative.
- CapabilityInvoker gains an optional statusTracker ctor parameter. When
  wired, every ExecuteAsync / ExecuteAsync(void) wraps the pipeline call
  in try / finally that records start/complete — so the counter advances
  cleanly whether the call succeeds, cancels, or throws. Null tracker keeps
  the pre-Phase-6.1 Stream E.3 behaviour exactly.

Server.Hosting:
- ResilienceStatusPublisherHostedService persists CurrentInFlight as the
  DriverInstanceResilienceStatus.CurrentBulkheadDepth column (was 0 before
  this PR). One-line fix on both the insert + update branches.

The in-flight counter is a pragmatic proxy for Polly's internal bulkhead
depth — a future PR wiring Polly telemetry would replace it with the real
value. The shape of the column + the publisher + the Admin /hosts query
doesn't change, so the follow-up is invisible to consumers.

Tests (8 new InFlightCounterTests, all pass):
- Start+Complete nets to zero.
- Nested starts sum; Complete decrements.
- Complete-without-Start clamps to zero.
- Different hosts track independently.
- Concurrent starts (500 parallel) don't lose count.
- CapabilityInvoker observed-mid-call depth == 1 during a pending call.
- CapabilityInvoker exception path still decrements (try/finally).
- CapabilityInvoker without tracker doesn't throw.

Full solution dotnet test: 1243 passing (was 1235, +8). Pre-existing
Client.CLI Subscribe flake unchanged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 15:02:34 -04:00

141 lines
6.4 KiB
C#

using Polly;
using ZB.MOM.WW.OtOpcUa.Core.Abstractions;
using ZB.MOM.WW.OtOpcUa.Core.Observability;
namespace ZB.MOM.WW.OtOpcUa.Core.Resilience;
/// <summary>
/// Executes driver-capability calls through a shared Polly pipeline. One invoker per
/// <c>(DriverInstance, IDriver)</c> pair; the underlying <see cref="DriverResiliencePipelineBuilder"/>
/// is process-singleton so all invokers share its cache.
/// </summary>
/// <remarks>
/// Per <c>docs/v2/plan.md</c> decisions #143-144 and Phase 6.1 Stream A.3. The server's dispatch
/// layer routes every capability call (<c>IReadable.ReadAsync</c>, <c>IWritable.WriteAsync</c>,
/// <c>ITagDiscovery.DiscoverAsync</c>, <c>ISubscribable.SubscribeAsync/UnsubscribeAsync</c>,
/// <c>IHostConnectivityProbe</c> probe loop, <c>IAlarmSource.SubscribeAlarmsAsync/AcknowledgeAsync</c>,
/// and all four <c>IHistoryProvider</c> reads) through this invoker.
/// </remarks>
public sealed class CapabilityInvoker
{
private readonly DriverResiliencePipelineBuilder _builder;
private readonly string _driverInstanceId;
private readonly string _driverType;
private readonly Func<DriverResilienceOptions> _optionsAccessor;
private readonly DriverResilienceStatusTracker? _statusTracker;
/// <summary>
/// Construct an invoker for one driver instance.
/// </summary>
/// <param name="builder">Shared, process-singleton pipeline builder.</param>
/// <param name="driverInstanceId">The <c>DriverInstance.Id</c> column value.</param>
/// <param name="optionsAccessor">
/// Snapshot accessor for the current resilience options. Invoked per call so Admin-edit +
/// pipeline-invalidate can take effect without restarting the invoker.
/// </param>
/// <param name="driverType">Driver type name for structured-log enrichment (e.g. <c>"Modbus"</c>).</param>
/// <param name="statusTracker">Optional resilience-status tracker. When wired, every capability call records start/complete so Admin <c>/hosts</c> can surface <see cref="ResilienceStatusSnapshot.CurrentInFlight"/> as the bulkhead-depth proxy.</param>
public CapabilityInvoker(
DriverResiliencePipelineBuilder builder,
string driverInstanceId,
Func<DriverResilienceOptions> optionsAccessor,
string driverType = "Unknown",
DriverResilienceStatusTracker? statusTracker = null)
{
ArgumentNullException.ThrowIfNull(builder);
ArgumentNullException.ThrowIfNull(optionsAccessor);
_builder = builder;
_driverInstanceId = driverInstanceId;
_driverType = driverType;
_optionsAccessor = optionsAccessor;
_statusTracker = statusTracker;
}
/// <summary>Execute a capability call returning a value, honoring the per-capability pipeline.</summary>
/// <typeparam name="TResult">Return type of the underlying driver call.</typeparam>
public async ValueTask<TResult> ExecuteAsync<TResult>(
DriverCapability capability,
string hostName,
Func<CancellationToken, ValueTask<TResult>> callSite,
CancellationToken cancellationToken)
{
ArgumentNullException.ThrowIfNull(callSite);
var pipeline = ResolvePipeline(capability, hostName);
_statusTracker?.RecordCallStart(_driverInstanceId, hostName);
try
{
using (LogContextEnricher.Push(_driverInstanceId, _driverType, capability, LogContextEnricher.NewCorrelationId()))
{
return await pipeline.ExecuteAsync(callSite, cancellationToken).ConfigureAwait(false);
}
}
finally
{
_statusTracker?.RecordCallComplete(_driverInstanceId, hostName);
}
}
/// <summary>Execute a void-returning capability call, honoring the per-capability pipeline.</summary>
public async ValueTask ExecuteAsync(
DriverCapability capability,
string hostName,
Func<CancellationToken, ValueTask> callSite,
CancellationToken cancellationToken)
{
ArgumentNullException.ThrowIfNull(callSite);
var pipeline = ResolvePipeline(capability, hostName);
_statusTracker?.RecordCallStart(_driverInstanceId, hostName);
try
{
using (LogContextEnricher.Push(_driverInstanceId, _driverType, capability, LogContextEnricher.NewCorrelationId()))
{
await pipeline.ExecuteAsync(callSite, cancellationToken).ConfigureAwait(false);
}
}
finally
{
_statusTracker?.RecordCallComplete(_driverInstanceId, hostName);
}
}
/// <summary>
/// Execute a <see cref="DriverCapability.Write"/> call honoring <see cref="WriteIdempotentAttribute"/>
/// semantics — if <paramref name="isIdempotent"/> is <c>false</c>, retries are disabled regardless
/// of the tag-level configuration (the pipeline for a non-idempotent write never retries per
/// decisions #44-45). If <c>true</c>, the call runs through the capability's pipeline which may
/// retry when the tier configuration permits.
/// </summary>
public async ValueTask<TResult> ExecuteWriteAsync<TResult>(
string hostName,
bool isIdempotent,
Func<CancellationToken, ValueTask<TResult>> callSite,
CancellationToken cancellationToken)
{
ArgumentNullException.ThrowIfNull(callSite);
if (!isIdempotent)
{
var noRetryOptions = _optionsAccessor() with
{
CapabilityPolicies = new Dictionary<DriverCapability, CapabilityPolicy>
{
[DriverCapability.Write] = _optionsAccessor().Resolve(DriverCapability.Write) with { RetryCount = 0 },
},
};
var pipeline = _builder.GetOrCreate(_driverInstanceId, $"{hostName}::non-idempotent", DriverCapability.Write, noRetryOptions);
using (LogContextEnricher.Push(_driverInstanceId, _driverType, DriverCapability.Write, LogContextEnricher.NewCorrelationId()))
{
return await pipeline.ExecuteAsync(callSite, cancellationToken).ConfigureAwait(false);
}
}
return await ExecuteAsync(DriverCapability.Write, hostName, callSite, cancellationToken).ConfigureAwait(false);
}
private ResiliencePipeline ResolvePipeline(DriverCapability capability, string hostName) =>
_builder.GetOrCreate(_driverInstanceId, hostName, capability, _optionsAccessor());
}