review(Configuration): fix LiteDB global BsonMapper cross-instance race (High)

Re-review at 7286d320. Configuration-012 (High): LiteDbConfigCache/GenerationSealedCache built
LiteDatabase on the process-wide BsonMapper.Global whose lazy member resolution races across
concurrently-constructed DBs (NotSupportedException/duplicate-key under contention; also caused
intermittent suite flakiness). Fix: per-cache fresh BsonMapper + pre-registered entity + TDD.
-013 (dead ValidateClusterTopology, ControlPlane) / -014 (collation case-sensitivity, needs
migration) deferred. No migration touched.
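
A minimal sketch of the shape of the fix described above, assuming LiteDB 5.x. It is not the code from this commit: `SnapshotDoc`, `PrivateMapperSketch` and `Open` are hypothetical names, and the real cached entity is not shown in this diff. The idea is that each database gets its own `BsonMapper` rather than `BsonMapper.Global`, and the entity is registered before the database is used, so member resolution is intended to happen up front on the constructing thread.

```csharp
using LiteDB;

// Hypothetical entity for illustration; the real cached document type is not shown here.
public sealed class SnapshotDoc
{
    public int Id { get; set; }
    public string ClusterId { get; set; } = "";
    public long GenerationId { get; set; }
}

public static class PrivateMapperSketch
{
    // Opens a database that never touches the process-wide BsonMapper.Global.
    public static LiteDatabase Open(string path)
    {
        // Fresh mapper per database: no mapping state is shared across instances.
        var mapper = new BsonMapper();

        // Register the entity eagerly so its mapping is configured here,
        // before any concurrent Put/Get reaches the mapper.
        mapper.Entity<SnapshotDoc>().Id(x => x.Id);

        return new LiteDatabase(new ConnectionString { Filename = path }, mapper);
    }
}
```

Under these assumptions, two caches opened concurrently each own a mapper, so the per-instance `_writeGate` only has to serialise access within one cache.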
Joseph Doherty
2026-06-19 11:06:56 -04:00
parent 145b06bec9
commit c3d148e396
4 changed files with 142 additions and 6 deletions
@@ -129,6 +129,51 @@ public sealed class LiteDbConfigCacheTests : IDisposable
            $"PutAsync must upsert atomically — found {gen42Count} rows for (c-1, gen=42) after 64 concurrent puts");
    }

    // ------------------------------------------------------------------------------------
    // Configuration-012 — the per-instance _writeGate (Configuration-005) does not protect
    // against LiteDB's process-wide BsonMapper.Global lazy-init race. Many cache INSTANCES
    // constructed + driven concurrently corrupt the shared global mapper, surfacing as
    // "Member ClusterId not found on BsonMapper" or a bogus "duplicate key _id = 0". A private
    // per-database mapper with the entity pre-registered fixes it.
    // ------------------------------------------------------------------------------------
    /// <summary>Verifies that many cache instances constructed and driven concurrently do not
    /// corrupt LiteDB's shared global BsonMapper — each Put/Get round-trips its own payload and
    /// no insert throws a member-not-found or duplicate-_id exception.</summary>
    [Fact]
    public async Task Concurrent_cache_instances_do_not_race_the_shared_bson_mapper()
    {
        var paths = new List<string>();
        try
        {
            var outer = Enumerable.Range(0, 24).Select(i => Task.Run(async () =>
            {
                var path = Path.Combine(Path.GetTempPath(), $"otopcua-cache-mapperrace-{Guid.NewGuid():N}.db");
                lock (paths) paths.Add(path);
                using var cache = new LiteDbConfigCache(path);
                // Pre-seed a sentinel, then hammer one (cluster, gen) from many threads.
                await cache.PutAsync(Snapshot($"c-{i}", 99));
                var inner = Enumerable.Range(0, 16)
                    .Select(_ => Task.Run(() => cache.PutAsync(Snapshot($"c-{i}", 42))))
                    .ToArray();
                await Task.WhenAll(inner);
                var got = await cache.GetMostRecentAsync($"c-{i}");
                got.ShouldNotBeNull();
                got!.GenerationId.ShouldBe(99); // 99 > 42, latest by GenerationId
            })).ToArray();
            // The unfixed code throws LiteException / NotSupportedException out of these tasks under
            // the global-mapper race; the fixed code completes cleanly.
            await Task.WhenAll(outer);
        }
        finally
        {
            foreach (var p in paths)
                if (File.Exists(p)) File.Delete(p);
        }
    }

    /// <summary>Verifies that a corrupted cache file surfaces as LocalConfigCacheCorruptException.</summary>
    [Fact]
    public void Corrupt_file_surfaces_as_LocalConfigCacheCorruptException()