diff --git a/docs/plans/2026-06-16-native-typed-json-design.md b/docs/plans/2026-06-16-native-typed-json-design.md new file mode 100644 index 00000000..07573fc3 --- /dev/null +++ b/docs/plans/2026-06-16-native-typed-json-design.md @@ -0,0 +1,64 @@ +# Native-Typed JSON for List Attribute Values — Design + +**Date:** 2026-06-16 +**Status:** Approved (brainstorming) — ready for implementation plan +**Branch:** `feature/native-typed-json` + +## Problem + +The multi-value (List) attribute feature (shipped 2026-06-16, branch `feature/multivalue-attribute`) stores List values via `AttributeValueCodec` as a JSON **array of strings** — e.g. an `Int32` list is `["10","20","30"]` and a `Boolean` list is `["True","False"]`. This is internally consistent and round-trips, but it is not "native-typed" JSON: numbers and booleans are quoted, and `DateTime` uses a US-invariant format rather than ISO-8601. We want the canonical form to be native-typed (`[10,20,30]`, `[true,false]`, ISO dates), while existing persisted data is normalized to the new form (no dual-format data left behind). + +## Decisions + +| Decision | Choice | +|---|---| +| Encode form | Native-typed JSON: numbers/bools unquoted, strings quoted, `DateTime` as ISO-8601 string | +| Decode | **Read both** old (array-of-strings) and new (native) forms — backward compatible | +| Existing data | **Migrate** to native form across MS SQL + site SQLite + on bundle import (Approach B, thorough) | +| MS SQL mechanism | Idempotent C# **startup normalizer** (not T-SQL — type-aware JSON re-emission is fragile in SQL) | +| Site SQLite mechanism | **Active** normalization in the InstanceActor override-load path (it already has the element type) | +| Bundles | Normalize **on import** (already-exported files are external/unreachable) | + +**Reality note:** the List feature shipped this session and was not deployed to the docker cluster, so there is almost certainly **zero** old-form List data in any store yet. The migration is a safety net guaranteeing no dual-format data ever lingers, not a fix for existing broken data. + +## Architecture + +### 1. Codec (`AttributeValueCodec`, Commons) — foundation + +- **`Encode`** — only the list branch changes. Instead of mapping each element to an invariant string then serializing, serialize the typed CLR collection directly (`JsonSerializer.Serialize(enumerable, …)`) so `System.Text.Json` emits each element in its native JSON kind. STJ numbers/bools are culture-invariant by spec; `DateTime` serializes as round-trippable ISO-8601. Scalars (the `string` / `IFormattable` branches) are untouched. +- **`Decode`** — read both forms. Deserialize to `JsonElement[]` (instead of `string?[]`); for each element feed `ParseScalar` either `GetString()` (JSON string element) or `GetRawText()` (number/bool element). So `[10,20]` and `["10","20"]` both decode to `List{10,20}`; ISO and old US-invariant `DateTime` strings both parse via the existing `DateTime.Parse(…, RoundtripKind)`. A JSON `null` element still throws `FormatException` ("elements may not be null"), unchanged. +- The read-both Decode is also what makes the migration idempotent: re-encoding an already-native value yields identical bytes. + +### 2. MS SQL — idempotent central startup normalizer + +A normalization step invoked once after `dbContext.Database.MigrateAsync(...)` in `MigrationHelper.ApplyOrValidateMigrationsAsync` (active central node only). For each List row: + +- **`TemplateAttributes`** where `DataType = 'List'`: read `Value` + the row's own `ElementDataType`; compute `Encode(Decode(value, List, elementType))`; if it differs from the stored string, `UPDATE`. +- **`InstanceAttributeOverrides`** for List attributes: these rows may have a null `ElementDataType` (it is currently informational — see follow-up #93/M3), so resolve the element type via the owning instance's template attribute (instance → `TemplateId` → `TemplateAttribute` by name → `ElementDataType`). Then re-encode as above. + +Idempotent (native→native is a no-op `UPDATE`-skip), so the step is safe to leave in permanently and cheap on every subsequent startup (a scan, no writes). Per-row failures (malformed JSON, unresolved element type) are logged and skipped — normalization NEVER aborts startup (mirrors the audit/best-effort principle). The scan is bounded to List rows only. + +### 3. Site SQLite — active normalization on override load + +Site static-override values (`SiteStorageService`) are keyed by `(instance, canonicalName)` and carry no element type — the element type lives in the instance's flattened config. The natural normalization point is therefore the **InstanceActor override-load path** (`HandleOverridesLoaded`, added in MV-7), which already decodes both forms using the `ResolvedAttribute`'s `ElementDataType`. Extend it so that when a List override's stored string is in old form (i.e. `Encode(decoded)` differs from the stored string), it re-persists the native form via `SiteStorageService.SetStaticOverrideAsync`. This actively normalizes on every instance load (site startup / failover), reuses the existing decode + element type, and is idempotent. + +### 4. Bundles — normalize on import + +Already-exported `.bundle` files are external artifacts we cannot reach to rewrite; import already reads both forms via the codec. To ensure imported List values land in native form in the DB, the importer re-encodes List attribute `Value`s through the codec when writing (and the MS SQL normalizer is a backstop on next startup). No file rewriting. + +## Error handling + +- Decode of a genuinely malformed value still throws `FormatException`; the normalizers catch it per-row, log, and skip (no startup abort, no actor crash). +- The codec change is additive on the wire (`gRPC` `string value` field unchanged; `List` is a new type with no external wire consumer relying on the old quoted form). + +## Testing + +- **Codec:** native-form encode per element type (`[10,20,30]`, `[true,false]`, ISO `DateTime`, strings stay `["a","b"]`); old-form backward-compat decode (`["10","20"]` → `List`); round-trip for every element type; malformed still throws; culture-invariance preserved. +- **MS SQL normalizer:** old-form row → rewritten to native; native row → untouched (idempotency); malformed row → skipped + logged, other rows still processed; override row element-type resolved via template attribute. +- **Site SQLite / InstanceActor:** an old-form List override on load → re-persisted native (assert `SetStaticOverrideAsync` called with native form); a native override → not re-persisted (idempotent); scalar overrides unaffected. +- **Bundle import:** importing an old-form bundle lands native-form Values in the DB. + +## Out of scope / follow-ups + +- Rewriting already-exported bundle files (unreachable). +- This pairs naturally with follow-up **#93/M3** (populate `InstanceAttributeOverride.ElementDataType` on write); if done, the override normalizer could read the column directly instead of joining to the template attribute. Not required here.