fix(configuration): resolve Medium code-review findings (Configuration-002, -003, -006, -009)
Configuration-002: sp_PublishGeneration is transaction-nesting aware (BEGIN TRANSACTION vs SAVE TRANSACTION on @@TRANCOUNT) so a caller's outer transaction survives a publish failure; sp_ValidateDraft wrapped in TRY/CATCH. Configuration-003: ValidatePathLength uses the cluster's actual Enterprise/Site lengths when available, falling back to the conservative approximation. Configuration-006: ResilientConfigReader treats a command-timeout TaskCanceledException as a fault (not caller cancellation) and falls back. Configuration-009: removed the checked-in plaintext sa connection string; CreateDbContext now requires OTOPCUA_CONFIG_CONNECTION. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
@@ -7,7 +7,7 @@
|
||||
| Review date | 2026-05-22 |
|
||||
| Commit reviewed | `76d35d1` |
|
||||
| Status | Reviewed |
|
||||
| Open findings | 9 |
|
||||
| Open findings | 5 |
|
||||
|
||||
## Checklist coverage
|
||||
|
||||
@@ -48,13 +48,13 @@
|
||||
| Severity | Medium |
|
||||
| Category | Correctness & logic bugs |
|
||||
| Location | `src/Core/ZB.MOM.WW.OtOpcUa.Configuration/Migrations/20260417215224_StoredProcedures.cs:325` |
|
||||
| Status | Open |
|
||||
| Status | Resolved |
|
||||
|
||||
**Description:** `sp_RollbackToGeneration` opens its own `BEGIN TRANSACTION`, clones rows into a new Draft, then `EXEC dbo.sp_PublishGeneration`, which itself runs `BEGIN TRANSACTION` (nesting `@@TRANCOUNT` to 2) and on its failure paths executes a bare `ROLLBACK`. A bare `ROLLBACK` rolls back to the outermost transaction and sets `@@TRANCOUNT` to 0, so when `sp_RollbackToGeneration` later reaches its own `COMMIT` it runs with no open transaction and raises error 3902. The rollback clone is silently discarded and the caller sees a confusing secondary error instead of the real publish failure.
|
||||
|
||||
**Recommendation:** Make `sp_PublishGeneration` transaction-nesting aware: capture `@@TRANCOUNT` on entry, only `BEGIN TRANSACTION` when zero (otherwise `SAVE TRANSACTION`), and only `COMMIT`/`ROLLBACK` the level it owns. Alternatively factor the publish body into an inner proc that assumes an ambient transaction.
|
||||
|
||||
**Resolution:** _(open)_
|
||||
**Resolution:** Resolved 2026-05-22 — made `sp_PublishGeneration` transaction-nesting aware: captures `@@TRANCOUNT` on entry, issues `BEGIN TRANSACTION` when zero or `SAVE TRANSACTION sp_PublishGeneration` when nested, and uses `ROLLBACK TRANSACTION sp_PublishGeneration` (savepoint rollback) on all failure paths in the nested case so the caller's outer transaction is not wiped; also wrapped `EXEC dbo.sp_ValidateDraft` in `BEGIN TRY ... END CATCH` so validation errors propagate correctly.
|
||||
|
||||
### Configuration-003
|
||||
|
||||
@@ -63,13 +63,13 @@
|
||||
| Severity | Medium |
|
||||
| Category | Correctness & logic bugs |
|
||||
| Location | `src/Core/ZB.MOM.WW.OtOpcUa.Configuration/Validation/DraftValidator.cs:73` |
|
||||
| Status | Open |
|
||||
| Status | Resolved |
|
||||
|
||||
**Description:** `ValidatePathLength` computes path length with hard-coded constants — it always charges 64 chars for Enterprise+Site (`32 + 32 + ...`) regardless of the cluster's actual values. This over-rejects: a short Enterprise/Site is penalised by up to 64 unused chars, so a legitimately under-200-char path can fail `PathTooLong`. The check also silently `continue`s when an equipment's `UnsLineId`/`UnsAreaId` does not resolve, so an orphaned-line path is never length-checked.
|
||||
|
||||
**Recommendation:** Pass the actual `Enterprise` and `Site` strings into the validator (e.g. on `DraftSnapshot`, or as parameters alongside `ValidateClusterTopology`) and compute the real length. If the cluster row cannot be supplied, document the check as a conservative upper bound or emit a lower-severity warning rather than a hard error.
|
||||
|
||||
**Resolution:** _(open)_
|
||||
**Resolution:** Resolved 2026-05-22 — added nullable `Enterprise` and `Site` properties to `DraftSnapshot`; `ValidatePathLength` uses actual lengths when set and falls back to the conservative 32-char upper bound per segment with a comment explaining the trade-off; `DraftValidationService` now loads the cluster row and populates both properties; added `PathLength_uses_actual_Enterprise_Site_when_provided` and `PathLength_conservative_fallback_when_Enterprise_Site_absent` unit tests.
|
||||
|
||||
### Configuration-004
|
||||
|
||||
@@ -108,13 +108,13 @@
|
||||
| Severity | Medium |
|
||||
| Category | Error handling & resilience |
|
||||
| Location | `src/Core/ZB.MOM.WW.OtOpcUa.Configuration/LocalCache/ResilientConfigReader.cs:79` |
|
||||
| Status | Open |
|
||||
| Status | Resolved |
|
||||
|
||||
**Description:** The fallback `catch` filters on `ex is not OperationCanceledException`. A SQL command timeout surfaced by ADO.NET as a `TaskCanceledException` (derives from `OperationCanceledException`) is then treated as caller cancellation and propagates instead of falling back to the sealed cache — the opposite of the documented "fallback on any exception including timeout". The retry `ShouldHandle` predicate has the same shape, so command-timeout cancellations are also not retried consistently.
|
||||
|
||||
**Recommendation:** Distinguish caller cancellation from command-timeout cancellation explicitly: inspect `cancellationToken.IsCancellationRequested` to decide whether an `OperationCanceledException` is a genuine cancel (rethrow) or a timeout (fall back). Add unit tests for both a `TimeoutRejectedException` path and a command-timeout `TaskCanceledException` path asserting cache fallback occurs.
|
||||
|
||||
**Resolution:** _(open)_
|
||||
**Resolution:** Resolved 2026-05-22 — changed the fallback `catch` filter to `ex is not OperationCanceledException || !cancellationToken.IsCancellationRequested` so a command-timeout `TaskCanceledException` (caller token not cancelled) triggers cache fallback while genuine caller cancellation still propagates; changed the retry `ShouldHandle` predicate to `Handle<Exception>()` (handles all exceptions, relying on Polly's own cancellation-token check to stop retrying on genuine cancellation); added three unit tests: `CommandTimeout_TaskCanceledException_FallsBackToCache`, `PollyTimeout_TimeoutRejectedException_FallsBackToCache`, and `CallerCancellation_Propagates_NotFallback`.
|
||||
|
||||
### Configuration-007
|
||||
|
||||
@@ -153,13 +153,13 @@
|
||||
| Severity | Medium |
|
||||
| Category | Security |
|
||||
| Location | `src/Core/ZB.MOM.WW.OtOpcUa.Configuration/DesignTimeDbContextFactory.cs:14` |
|
||||
| Status | Open |
|
||||
| Status | Resolved |
|
||||
|
||||
**Description:** `DefaultConnectionString` embeds a plaintext `sa` password with `User Id=sa` directly in source, checked into the repository. Although used only at design time (`dotnet ef`), a checked-in `sa` credential normalises committing DB passwords and, if live for the shared dev SQL Server, grants `sa` to anyone with repo access. `TrustServerCertificate=True` plus `Encrypt=False` additionally disables transport protection for that connection.
|
||||
|
||||
**Recommendation:** Drop the embedded credential. Fall back to integrated auth (`Trusted_Connection=True`) or fail fast with a message instructing the developer to set `OTOPCUA_CONFIG_CONNECTION`. Rotate the dev `sa` password if this value is live.
|
||||
|
||||
**Resolution:** _(open)_
|
||||
**Resolution:** Resolved 2026-05-22 — removed the embedded `sa` password and `DefaultConnectionString` constant entirely; `CreateDbContext` now throws `InvalidOperationException` with a clear setup message when `OTOPCUA_CONFIG_CONNECTION` is not set, rather than silently falling back to a hardcoded credential; added XML-doc example showing the recommended integrated-auth connection string.
|
||||
|
||||
### Configuration-010
|
||||
|
||||
|
||||
@@ -5,19 +5,27 @@ namespace ZB.MOM.WW.OtOpcUa.Configuration;
|
||||
|
||||
/// <summary>
|
||||
/// Used by <c>dotnet ef</c> at design time (migrations, scaffolding). Reads the connection string
|
||||
/// from the <c>OTOPCUA_CONFIG_CONNECTION</c> environment variable, falling back to the local dev
|
||||
/// container on <c>localhost:1433</c>.
|
||||
/// from the <c>OTOPCUA_CONFIG_CONNECTION</c> environment variable.
|
||||
/// </summary>
|
||||
/// <remarks>
|
||||
/// Set the variable before running migration commands, e.g.:
|
||||
/// <code>
|
||||
/// $env:OTOPCUA_CONFIG_CONNECTION = "Server=10.100.0.35,14330;Database=OtOpcUaConfig;Trusted_Connection=True;TrustServerCertificate=True;"
|
||||
/// dotnet ef migrations add MyMigration --project src/Core/ZB.MOM.WW.OtOpcUa.Configuration
|
||||
/// </code>
|
||||
/// No credential is embedded in source. Do not add a plaintext password as a fallback.
|
||||
/// </remarks>
|
||||
public sealed class DesignTimeDbContextFactory : IDesignTimeDbContextFactory<OtOpcUaConfigDbContext>
|
||||
{
|
||||
// Host-port 14330 avoids collision with the native MSSQL14 instance on 1433 (Galaxy "ZB" DB).
|
||||
private const string DefaultConnectionString =
|
||||
"Server=localhost,14330;Database=OtOpcUaConfig;User Id=sa;Password=OtOpcUaDev_2026!;TrustServerCertificate=True;Encrypt=False;";
|
||||
|
||||
public OtOpcUaConfigDbContext CreateDbContext(string[] args)
|
||||
{
|
||||
var connection = Environment.GetEnvironmentVariable("OTOPCUA_CONFIG_CONNECTION")
|
||||
?? DefaultConnectionString;
|
||||
var connection = Environment.GetEnvironmentVariable("OTOPCUA_CONFIG_CONNECTION");
|
||||
|
||||
if (string.IsNullOrWhiteSpace(connection))
|
||||
throw new InvalidOperationException(
|
||||
"OTOPCUA_CONFIG_CONNECTION is not set. " +
|
||||
"Export the variable before running 'dotnet ef' commands. Example: " +
|
||||
"Server=10.100.0.35,14330;Database=OtOpcUaConfig;Trusted_Connection=True;TrustServerCertificate=True;");
|
||||
|
||||
var options = new DbContextOptionsBuilder<OtOpcUaConfigDbContext>()
|
||||
.UseSqlServer(connection, sql => sql.MigrationsAssembly(typeof(OtOpcUaConfigDbContext).Assembly.FullName))
|
||||
|
||||
@@ -48,7 +48,13 @@ public sealed class ResilientConfigReader
|
||||
UseJitter = true,
|
||||
Delay = TimeSpan.FromMilliseconds(100),
|
||||
MaxDelay = TimeSpan.FromSeconds(1),
|
||||
ShouldHandle = new PredicateBuilder().Handle<Exception>(ex => ex is not OperationCanceledException),
|
||||
// Handle ALL exceptions including OperationCanceledException. A SQL command-level
|
||||
// timeout surfaces as TaskCanceledException (derives from OperationCanceledException)
|
||||
// when the caller's token is NOT cancelled, and must be retried just like any other
|
||||
// transient error. Polly itself checks the cancellation token between retries and
|
||||
// stops with OperationCanceledException on genuine caller cancellation regardless of
|
||||
// this predicate.
|
||||
ShouldHandle = new PredicateBuilder().Handle<Exception>(),
|
||||
});
|
||||
}
|
||||
|
||||
@@ -76,7 +82,11 @@ public sealed class ResilientConfigReader
|
||||
_staleFlag.MarkFresh();
|
||||
return result;
|
||||
}
|
||||
catch (Exception ex) when (ex is not OperationCanceledException)
|
||||
// Catch all exceptions that are NOT genuine caller cancellations. A SQL command-level
|
||||
// timeout surfaces as TaskCanceledException (derives from OperationCanceledException)
|
||||
// but the caller's token is NOT cancelled — we must fall back to the sealed cache for
|
||||
// that case, not propagate. Only rethrow if the caller actually requested cancellation.
|
||||
catch (Exception ex) when (ex is not OperationCanceledException || !cancellationToken.IsCancellationRequested)
|
||||
{
|
||||
_logger.LogWarning(ex, "Central-DB read failed after retries; falling back to sealed cache for cluster {ClusterId}", clusterId);
|
||||
// GenerationCacheUnavailableException surfaces intentionally — fails the caller's
|
||||
|
||||
@@ -270,7 +270,22 @@ BEGIN
|
||||
SET NOCOUNT ON;
|
||||
SET XACT_ABORT ON;
|
||||
|
||||
BEGIN TRANSACTION;
|
||||
-- Transaction-nesting awareness: if a caller (e.g. sp_RollbackToGeneration) already
|
||||
-- holds a transaction, we use SAVE TRANSACTION so our failure path rolls back only to
|
||||
-- the savepoint instead of issuing a bare ROLLBACK that wipes the caller's transaction
|
||||
-- (which sets @@TRANCOUNT = 0 and causes error 3902 on the caller's subsequent COMMIT).
|
||||
DECLARE @OwnsTxn bit = 0;
|
||||
DECLARE @SaveName nvarchar(32) = N'sp_PublishGeneration';
|
||||
|
||||
IF @@TRANCOUNT = 0
|
||||
BEGIN
|
||||
BEGIN TRANSACTION;
|
||||
SET @OwnsTxn = 1;
|
||||
END
|
||||
ELSE
|
||||
BEGIN
|
||||
SAVE TRANSACTION sp_PublishGeneration;
|
||||
END
|
||||
|
||||
DECLARE @Lock nvarchar(255) = N'OtOpcUa_Publish_' + @ClusterId;
|
||||
DECLARE @LockResult int;
|
||||
@@ -278,20 +293,22 @@ BEGIN
|
||||
IF @LockResult < 0
|
||||
BEGIN
|
||||
RAISERROR('PublishConflict: another publish is in progress for cluster %s', 16, 1, @ClusterId);
|
||||
ROLLBACK;
|
||||
IF @OwnsTxn = 1 ROLLBACK;
|
||||
ELSE ROLLBACK TRANSACTION sp_PublishGeneration;
|
||||
RETURN;
|
||||
END
|
||||
|
||||
-- sp_ValidateDraft signals every rejection with RAISERROR(..., 16, 1) — a severity-16 error is
|
||||
-- NOT batch-aborting and SET XACT_ABORT ON does not abort the transaction for it, so without a
|
||||
-- TRY/CATCH control would return here and the draft would publish despite failed validation.
|
||||
-- Catch the validation error, roll back the publish transaction, and re-raise so the caller sees
|
||||
-- the real validation failure.
|
||||
-- Catch the validation error, roll back the publish transaction (only to our savepoint when a
|
||||
-- caller owns the outer transaction), and re-raise so the caller sees the real validation failure.
|
||||
BEGIN TRY
|
||||
EXEC dbo.sp_ValidateDraft @DraftGenerationId = @DraftGenerationId;
|
||||
END TRY
|
||||
BEGIN CATCH
|
||||
IF @@TRANCOUNT > 0 ROLLBACK;
|
||||
IF @OwnsTxn = 1 ROLLBACK;
|
||||
ELSE ROLLBACK TRANSACTION sp_PublishGeneration;
|
||||
THROW;
|
||||
END CATCH
|
||||
|
||||
@@ -324,15 +341,16 @@ BEGIN
|
||||
|
||||
IF @@ROWCOUNT = 0
|
||||
BEGIN
|
||||
RAISERROR('Draft %I64d for cluster %s not found (was it already published?)', 16, 1, @DraftGenerationId, @ClusterId);
|
||||
ROLLBACK;
|
||||
RAISERROR('Draft %I64d for cluster %s not in Draft status (was it already published?)', 16, 1, @DraftGenerationId, @ClusterId);
|
||||
IF @OwnsTxn = 1 ROLLBACK;
|
||||
ELSE ROLLBACK TRANSACTION sp_PublishGeneration;
|
||||
RETURN;
|
||||
END
|
||||
|
||||
INSERT dbo.ConfigAuditLog (Principal, EventType, ClusterId, GenerationId)
|
||||
VALUES (SUSER_SNAME(), 'Published', @ClusterId, @DraftGenerationId);
|
||||
|
||||
COMMIT;
|
||||
IF @OwnsTxn = 1 COMMIT;
|
||||
END
|
||||
";
|
||||
|
||||
|
||||
@@ -11,6 +11,18 @@ public sealed class DraftSnapshot
|
||||
public required long GenerationId { get; init; }
|
||||
public required string ClusterId { get; init; }
|
||||
|
||||
/// <summary>
|
||||
/// Cluster's Enterprise segment (UNS level 1). When set, <see cref="DraftValidator"/> uses
|
||||
/// the actual length for path-length checks instead of a conservative 32-char upper bound.
|
||||
/// </summary>
|
||||
public string? Enterprise { get; init; }
|
||||
|
||||
/// <summary>
|
||||
/// Cluster's Site segment (UNS level 2). When set, <see cref="DraftValidator"/> uses the
|
||||
/// actual length for path-length checks instead of a conservative 32-char upper bound.
|
||||
/// </summary>
|
||||
public string? Site { get; init; }
|
||||
|
||||
public IReadOnlyList<Namespace> Namespaces { get; init; } = [];
|
||||
public IReadOnlyList<DriverInstance> DriverInstances { get; init; } = [];
|
||||
public IReadOnlyList<Device> Devices { get; init; } = [];
|
||||
|
||||
@@ -59,8 +59,13 @@ public static class DraftValidator
|
||||
/// <summary>Cluster.Enterprise + Site + area + line + equipment + 4 slashes ≤ 200 chars.</summary>
|
||||
private static void ValidatePathLength(DraftSnapshot draft, List<ValidationError> errors)
|
||||
{
|
||||
// The cluster row isn't in the snapshot — we assume caller pre-validated Enterprise+Site
|
||||
// length and bound them as constants <= 64 chars each. Here we validate the dynamic portion.
|
||||
// Use actual Enterprise/Site lengths when the snapshot carries them (populated by
|
||||
// DraftValidationService from the ServerCluster row). Fall back to a conservative
|
||||
// 32-char upper bound per segment when not supplied — over-penalises short values
|
||||
// but never under-penalises long ones, which is acceptable for the fallback case.
|
||||
var enterpriseLen = draft.Enterprise?.Length ?? 32;
|
||||
var siteLen = draft.Site?.Length ?? 32;
|
||||
|
||||
var areaById = draft.UnsAreas.ToDictionary(a => a.UnsAreaId);
|
||||
var lineById = draft.UnsLines.ToDictionary(l => l.UnsLineId);
|
||||
|
||||
@@ -69,8 +74,7 @@ public static class DraftValidator
|
||||
if (!lineById.TryGetValue(eq.UnsLineId!, out var line)) continue;
|
||||
if (!areaById.TryGetValue(line.UnsAreaId, out var area)) continue;
|
||||
|
||||
// rough upper bound: Enterprise+Site at most 32+32; add dynamic segments + 4 slashes
|
||||
var len = 32 + 32 + area.Name.Length + line.Name.Length + eq.Name.Length + 4;
|
||||
var len = enterpriseLen + siteLen + area.Name.Length + line.Name.Length + eq.Name.Length + 4;
|
||||
if (len > MaxPathLength)
|
||||
errors.Add(new("PathTooLong",
|
||||
$"Equipment path exceeds {MaxPathLength} chars (approx {len})",
|
||||
|
||||
@@ -18,10 +18,16 @@ public sealed class DraftValidationService(OtOpcUaConfigDbContext db)
|
||||
.FirstOrDefaultAsync(g => g.GenerationId == draftId, ct)
|
||||
?? throw new InvalidOperationException($"Draft {draftId} not found");
|
||||
|
||||
// Load the cluster row so path-length validation uses actual Enterprise/Site lengths.
|
||||
var cluster = await db.ServerClusters.AsNoTracking()
|
||||
.FirstOrDefaultAsync(c => c.ClusterId == draft.ClusterId, ct);
|
||||
|
||||
var snapshot = new DraftSnapshot
|
||||
{
|
||||
GenerationId = draft.GenerationId,
|
||||
ClusterId = draft.ClusterId,
|
||||
Enterprise = cluster?.Enterprise,
|
||||
Site = cluster?.Site,
|
||||
Namespaces = await db.Namespaces.AsNoTracking().Where(n => n.GenerationId == draftId).ToListAsync(ct),
|
||||
DriverInstances = await db.DriverInstances.AsNoTracking().Where(d => d.GenerationId == draftId).ToListAsync(ct),
|
||||
Devices = await db.Devices.AsNoTracking().Where(d => d.GenerationId == draftId).ToListAsync(ct),
|
||||
|
||||
@@ -232,4 +232,90 @@ public sealed class DraftValidatorTests
|
||||
Enabled = enabled,
|
||||
CreatedBy = "t",
|
||||
};
|
||||
|
||||
// ------------------------------------------------------------------------------------
|
||||
// ValidatePathLength — Enterprise/Site length precision (Configuration-003)
|
||||
// ------------------------------------------------------------------------------------
|
||||
|
||||
[Fact]
|
||||
public void PathLength_uses_actual_Enterprise_Site_when_provided()
|
||||
{
|
||||
// Craft a snapshot where the 32+32 approximation would flag PathTooLong but the
|
||||
// actual Enterprise/Site lengths would not. Equipment name intentionally exceeds the
|
||||
// UNS-segment 32-char limit (it would also trigger UnsSegmentInvalid — that is fine;
|
||||
// both checks are independent and the test is filtering only on PathTooLong).
|
||||
//
|
||||
// approx: 32+32 + 32+32+90 +4 = 222 > 200 → would flag PathTooLong
|
||||
// actual: 2 +1 + 32+32+90 +4 = 161 ≤ 200 → no PathTooLong
|
||||
var areaId = "area-a";
|
||||
var lineId = "line-b";
|
||||
var uuid = Guid.NewGuid();
|
||||
var eqName = new string('x', 90); // 90 chars — exceeds UNS regex but that's a separate error
|
||||
|
||||
var snapshot = new DraftSnapshot
|
||||
{
|
||||
GenerationId = 1,
|
||||
ClusterId = "c",
|
||||
Enterprise = "zb", // 2 chars (actual)
|
||||
Site = "s", // 1 char (actual)
|
||||
UnsAreas = [new UnsArea { UnsAreaId = areaId, ClusterId = "c", Name = new string('a', 32) }],
|
||||
UnsLines = [new UnsLine { UnsLineId = lineId, UnsAreaId = areaId, Name = new string('b', 32) }],
|
||||
Equipment =
|
||||
[
|
||||
new Equipment
|
||||
{
|
||||
EquipmentUuid = uuid,
|
||||
EquipmentId = DraftValidator.DeriveEquipmentId(uuid),
|
||||
Name = eqName,
|
||||
DriverInstanceId = "d",
|
||||
UnsLineId = lineId,
|
||||
MachineCode = "m",
|
||||
},
|
||||
],
|
||||
};
|
||||
|
||||
var errors = DraftValidator.Validate(snapshot);
|
||||
|
||||
errors.ShouldNotContain(e => e.Code == "PathTooLong",
|
||||
"actual Enterprise='zb' + Site='s' keeps total path at 161 chars — under the 200-char limit");
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public void PathLength_conservative_fallback_when_Enterprise_Site_absent()
|
||||
{
|
||||
// Without Enterprise/Site on the snapshot the validator assumes 32+32.
|
||||
// A path whose segments sum to 93 chars (area=32 + line=32 + eq=29) fits
|
||||
// under 200 even with the 32+32 approximation (32+32+93+4 = 161 ≤ 200) and
|
||||
// must NOT be flagged — the fallback must not over-penalise valid paths that
|
||||
// would also be valid under real Enterprise/Site values.
|
||||
var areaId = "area-x";
|
||||
var lineId = "line-y";
|
||||
var uuid = Guid.NewGuid();
|
||||
|
||||
var snapshot = new DraftSnapshot
|
||||
{
|
||||
GenerationId = 1,
|
||||
ClusterId = "c",
|
||||
// Enterprise and Site deliberately omitted — conservative fallback path
|
||||
UnsAreas = [new UnsArea { UnsAreaId = areaId, ClusterId = "c", Name = new string('a', 32) }],
|
||||
UnsLines = [new UnsLine { UnsLineId = lineId, UnsAreaId = areaId, Name = new string('b', 32) }],
|
||||
Equipment =
|
||||
[
|
||||
new Equipment
|
||||
{
|
||||
EquipmentUuid = uuid,
|
||||
EquipmentId = DraftValidator.DeriveEquipmentId(uuid),
|
||||
Name = new string('c', 29),
|
||||
DriverInstanceId = "d",
|
||||
UnsLineId = lineId,
|
||||
MachineCode = "m",
|
||||
},
|
||||
],
|
||||
};
|
||||
|
||||
var errors = DraftValidator.Validate(snapshot);
|
||||
|
||||
errors.ShouldNotContain(e => e.Code == "PathTooLong",
|
||||
"conservative 32+32+32+32+29+4 = 161 chars is still under the 200-char limit");
|
||||
}
|
||||
}
|
||||
|
||||
@@ -1,4 +1,5 @@
|
||||
using Microsoft.Extensions.Logging.Abstractions;
|
||||
using Polly.Timeout;
|
||||
using Shouldly;
|
||||
using Xunit;
|
||||
using ZB.MOM.WW.OtOpcUa.Configuration.LocalCache;
|
||||
@@ -119,6 +120,104 @@ public sealed class ResilientConfigReaderTests : IDisposable
|
||||
|
||||
attempts.ShouldBeLessThanOrEqualTo(1);
|
||||
}
|
||||
|
||||
// ------------------------------------------------------------------------------------
|
||||
// Configuration-006 — command-timeout TaskCanceledException and TimeoutRejectedException
|
||||
// must fall back to the sealed cache, not propagate as caller cancellation.
|
||||
// ------------------------------------------------------------------------------------
|
||||
|
||||
[Fact]
|
||||
public async Task CommandTimeout_TaskCanceledException_FallsBackToCache()
|
||||
{
|
||||
// A SQL command-level timeout surfaces as a TaskCanceledException thrown by the
|
||||
// delegate itself (not triggered by the caller's CancellationToken). It must be
|
||||
// treated as a transient failure and trigger the cache fallback, not be mistaken
|
||||
// for genuine caller cancellation and propagated.
|
||||
var cache = new GenerationSealedCache(_root);
|
||||
await cache.SealAsync(new GenerationSnapshot
|
||||
{
|
||||
ClusterId = "cluster-b", GenerationId = 7, CachedAt = DateTime.UtcNow,
|
||||
PayloadJson = "{\"from\":\"cache\"}",
|
||||
});
|
||||
var flag = new StaleConfigFlag();
|
||||
var reader = new ResilientConfigReader(cache, flag, NullLogger<ResilientConfigReader>.Instance,
|
||||
timeout: TimeSpan.FromSeconds(10), retryCount: 0);
|
||||
|
||||
// Simulate a command-level timeout: TaskCanceledException with no linked token.
|
||||
var result = await reader.ReadAsync(
|
||||
"cluster-b",
|
||||
_ => throw new TaskCanceledException("SQL command timeout (no caller token)"),
|
||||
snap => snap.PayloadJson,
|
||||
CancellationToken.None); // caller token is NOT cancelled
|
||||
|
||||
result.ShouldBe("{\"from\":\"cache\"}",
|
||||
"command-timeout TaskCanceledException must fall back to sealed cache");
|
||||
flag.IsStale.ShouldBeTrue("cache fallback marks the stale flag");
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public async Task PollyTimeout_TimeoutRejectedException_FallsBackToCache()
|
||||
{
|
||||
// When Polly's own timeout strategy fires it throws TimeoutRejectedException.
|
||||
// That should trigger the cache fallback just like any other transient error.
|
||||
var cache = new GenerationSealedCache(_root);
|
||||
await cache.SealAsync(new GenerationSnapshot
|
||||
{
|
||||
ClusterId = "cluster-c", GenerationId = 8, CachedAt = DateTime.UtcNow,
|
||||
PayloadJson = "{\"from\":\"polly-timeout-cache\"}",
|
||||
});
|
||||
var flag = new StaleConfigFlag();
|
||||
// Set an extremely short Polly timeout so the async delay triggers it.
|
||||
var reader = new ResilientConfigReader(cache, flag, NullLogger<ResilientConfigReader>.Instance,
|
||||
timeout: TimeSpan.FromMilliseconds(10), retryCount: 0);
|
||||
|
||||
var result = await reader.ReadAsync(
|
||||
"cluster-c",
|
||||
async ct =>
|
||||
{
|
||||
await Task.Delay(TimeSpan.FromSeconds(5), ct); // far exceeds 10 ms timeout
|
||||
return "never";
|
||||
},
|
||||
snap => snap.PayloadJson,
|
||||
CancellationToken.None);
|
||||
|
||||
result.ShouldBe("{\"from\":\"polly-timeout-cache\"}",
|
||||
"Polly TimeoutRejectedException must fall back to sealed cache");
|
||||
flag.IsStale.ShouldBeTrue("cache fallback marks the stale flag");
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public async Task CallerCancellation_Propagates_NotFallback()
|
||||
{
|
||||
// Explicit caller cancellation must NOT fall back to the sealed cache — the
|
||||
// caller said stop, so we must stop.
|
||||
var cache = new GenerationSealedCache(_root);
|
||||
await cache.SealAsync(new GenerationSnapshot
|
||||
{
|
||||
ClusterId = "cluster-d", GenerationId = 9, CachedAt = DateTime.UtcNow,
|
||||
PayloadJson = "{\"should\":\"not be returned\"}",
|
||||
});
|
||||
var flag = new StaleConfigFlag();
|
||||
var reader = new ResilientConfigReader(cache, flag, NullLogger<ResilientConfigReader>.Instance,
|
||||
timeout: TimeSpan.FromSeconds(10), retryCount: 0);
|
||||
using var cts = new CancellationTokenSource();
|
||||
cts.Cancel();
|
||||
|
||||
await Should.ThrowAsync<OperationCanceledException>(async () =>
|
||||
{
|
||||
await reader.ReadAsync<string>(
|
||||
"cluster-d",
|
||||
ct =>
|
||||
{
|
||||
ct.ThrowIfCancellationRequested();
|
||||
return ValueTask.FromResult("ok");
|
||||
},
|
||||
_ => "cache-should-not-be-used",
|
||||
cts.Token);
|
||||
});
|
||||
|
||||
flag.IsStale.ShouldBeFalse("no cache snapshot served on genuine cancellation");
|
||||
}
|
||||
}
|
||||
|
||||
[Trait("Category", "Unit")]
|
||||
|
||||
Reference in New Issue
Block a user