Phase 6.1 Stream A follow-up — DriverInstance.ResilienceConfig JSON column + parser + OtOpcUaServer wire-in

Closes the Phase 6.1 Stream A.2 "per-instance overrides bound from
DriverInstance.ResilienceConfig JSON column" work flagged as a follow-up
when Stream A.1 shipped in PR #78. Every driver can now override its Polly
pipeline policy per instance instead of inheriting pure tier defaults.

Configuration:
- DriverInstance entity gains a nullable `ResilienceConfig` string column
  (nvarchar(max)) + SQL check constraint `CK_DriverInstance_ResilienceConfig_IsJson`
  that enforces ISJSON when not null. Null = use tier defaults (decision
  #143 / unchanged from pre-Phase-6.1).
- EF migration `20260419161008_AddDriverInstanceResilienceConfig`.
- SchemaComplianceTests expected-constraint list gains the new CK name.

Core.Resilience.DriverResilienceOptionsParser:
- Pure-function parser. ParseOrDefaults(tier, json, out diag) returns the
  effective DriverResilienceOptions — tier defaults with per-capability /
  bulkhead overrides layered on top when the JSON payload supplies them.
  Partial policies (e.g. Read { retryCount: 10 }) fill missing fields from
  the tier default for that capability.
- Malformed JSON falls back to pure tier defaults + surfaces a human-readable
  diagnostic via the out parameter. Callers log the diag but don't fail
  startup — a misconfigured ResilienceConfig must not brick a working
  driver.
- Property names + capability keys are case-insensitive; unrecognised
  capability names are logged-and-skipped; unrecognised shape-level keys
  are ignored so future shapes land without a migration.

Server wire-in:
- OtOpcUaServer gains two optional ctor params: `tierLookup` (driverType →
  DriverTier) + `resilienceConfigLookup` (driverInstanceId → JSON string).
  CreateMasterNodeManager now resolves tier + JSON for each driver, parses
  via DriverResilienceOptionsParser, logs the diagnostic if any, and
  constructs CapabilityInvoker with the merged options instead of pure
  Tier A defaults.
- OpcUaApplicationHost threads both lookups through. Default null keeps
  existing tests constructing without either Func unchanged (falls back
  to Tier A + tier defaults exactly as before).

Tests (13 new DriverResilienceOptionsParserTests):
- null / whitespace / empty-object JSON returns pure tier defaults.
- Malformed JSON falls back + surfaces diagnostic.
- Read override merged into tier defaults; other capabilities untouched.
- Partial policy fills missing fields from tier default.
- Bulkhead overrides honored.
- Unknown capability skipped + surfaced in diagnostic.
- Property names + capability keys are case-insensitive.
- Every tier × every capability × empty-JSON round-trips tier defaults
  exactly (theory).

Full solution dotnet test: 1215 passing (was 1202, +13). Pre-existing
Client.CLI Subscribe flake unchanged.

Production wiring (Program.cs) example:
  Func<string, DriverTier> tierLookup = type => type switch
  {
      "Galaxy" => DriverTier.C,
      "Modbus" or "S7" => DriverTier.B,
      "OpcUaClient" => DriverTier.A,
      _ => DriverTier.A,
  };
  Func<string, string?> cfgLookup = id =>
      db.DriverInstances.AsNoTracking().FirstOrDefault(x => x.DriverInstanceId == id)?.ResilienceConfig;
  var host = new OpcUaApplicationHost(..., tierLookup: tierLookup, resilienceConfigLookup: cfgLookup);

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
Joseph Doherty
2026-04-19 12:21:42 -04:00
parent eac457fa7c
commit 7b50118b68
10 changed files with 1719 additions and 7 deletions

View File

@@ -0,0 +1,116 @@
using System.Text.Json;
using ZB.MOM.WW.OtOpcUa.Core.Abstractions;
namespace ZB.MOM.WW.OtOpcUa.Core.Resilience;
/// <summary>
/// Parses the <c>DriverInstance.ResilienceConfig</c> JSON column into a
/// <see cref="DriverResilienceOptions"/> instance layered on top of the tier defaults.
/// Every key in the JSON is optional; missing keys fall back to the tier defaults from
/// <see cref="DriverResilienceOptions.GetTierDefaults(DriverTier)"/>.
/// </summary>
/// <remarks>
/// <para>Example JSON shape per Phase 6.1 Stream A.2:</para>
/// <code>
/// {
/// "bulkheadMaxConcurrent": 16,
/// "bulkheadMaxQueue": 64,
/// "capabilityPolicies": {
/// "Read": { "timeoutSeconds": 5, "retryCount": 5, "breakerFailureThreshold": 3 },
/// "Write": { "timeoutSeconds": 5, "retryCount": 0, "breakerFailureThreshold": 5 }
/// }
/// }
/// </code>
///
/// <para>Unrecognised keys + values are ignored so future shapes land without a migration.
/// Per-capability overrides are layered on top of tier defaults — a partial policy (only
/// some of TimeoutSeconds/RetryCount/BreakerFailureThreshold) fills in the other fields
/// from the tier default for that capability.</para>
///
/// <para>Parser failures (malformed JSON, type mismatches) fall back to pure tier defaults
/// + surface through an out-parameter diagnostic. Callers may log the diagnostic but should
/// NOT fail driver startup — a misconfigured ResilienceConfig should never brick a
/// working driver.</para>
/// </remarks>
public static class DriverResilienceOptionsParser
{
private static readonly JsonSerializerOptions JsonOpts = new()
{
PropertyNameCaseInsensitive = true,
AllowTrailingCommas = true,
ReadCommentHandling = JsonCommentHandling.Skip,
};
/// <summary>
/// Parse the JSON payload layered on <paramref name="tier"/>'s defaults. Returns the
/// effective options; <paramref name="parseDiagnostic"/> is null on success, or a
/// human-readable error message when the JSON was malformed (options still returned
/// = tier defaults).
/// </summary>
public static DriverResilienceOptions ParseOrDefaults(
DriverTier tier,
string? resilienceConfigJson,
out string? parseDiagnostic)
{
parseDiagnostic = null;
var baseDefaults = DriverResilienceOptions.GetTierDefaults(tier);
var baseOptions = new DriverResilienceOptions { Tier = tier, CapabilityPolicies = baseDefaults };
if (string.IsNullOrWhiteSpace(resilienceConfigJson))
return baseOptions;
ResilienceConfigShape? shape;
try
{
shape = JsonSerializer.Deserialize<ResilienceConfigShape>(resilienceConfigJson, JsonOpts);
}
catch (JsonException ex)
{
parseDiagnostic = $"ResilienceConfig JSON malformed; falling back to tier {tier} defaults. Detail: {ex.Message}";
return baseOptions;
}
if (shape is null) return baseOptions;
var merged = new Dictionary<DriverCapability, CapabilityPolicy>(baseDefaults);
if (shape.CapabilityPolicies is not null)
{
foreach (var (capName, overridePolicy) in shape.CapabilityPolicies)
{
if (!Enum.TryParse<DriverCapability>(capName, ignoreCase: true, out var capability))
{
parseDiagnostic ??= $"Unknown capability '{capName}' in ResilienceConfig; skipped.";
continue;
}
var basePolicy = merged[capability];
merged[capability] = new CapabilityPolicy(
TimeoutSeconds: overridePolicy.TimeoutSeconds ?? basePolicy.TimeoutSeconds,
RetryCount: overridePolicy.RetryCount ?? basePolicy.RetryCount,
BreakerFailureThreshold: overridePolicy.BreakerFailureThreshold ?? basePolicy.BreakerFailureThreshold);
}
}
return new DriverResilienceOptions
{
Tier = tier,
CapabilityPolicies = merged,
BulkheadMaxConcurrent = shape.BulkheadMaxConcurrent ?? baseOptions.BulkheadMaxConcurrent,
BulkheadMaxQueue = shape.BulkheadMaxQueue ?? baseOptions.BulkheadMaxQueue,
};
}
private sealed class ResilienceConfigShape
{
public int? BulkheadMaxConcurrent { get; set; }
public int? BulkheadMaxQueue { get; set; }
public Dictionary<string, CapabilityPolicyShape>? CapabilityPolicies { get; set; }
}
private sealed class CapabilityPolicyShape
{
public int? TimeoutSeconds { get; set; }
public int? RetryCount { get; set; }
public int? BreakerFailureThreshold { get; set; }
}
}