Native-Typed JSON for List Attribute Values — Design
Date: 2026-06-16
Status: Implemented (NJ-1 … NJ-6, branch feature/native-typed-json) — full solution builds 0/0; feature-targeted tests green across Commons, TemplateEngine, ConfigurationDatabase, SiteRuntime, and Transport. Follow-up #93/M3 (populate InstanceAttributeOverride.ElementDataType on write) was folded into NJ-2 so the central normalizer can read the override element type directly.
Branch: feature/native-typed-json
Problem
The multi-value (List) attribute feature (shipped 2026-06-16, branch feature/multivalue-attribute) stores List values via AttributeValueCodec as a JSON array of strings — e.g. an Int32 list is ["10","20","30"] and a Boolean list is ["True","False"]. This is internally consistent and round-trips, but it is not "native-typed" JSON: numbers and booleans are quoted, and DateTime uses a US-invariant format rather than ISO-8601. We want the canonical form to be native-typed ([10,20,30], [true,false], ISO dates), while existing persisted data is normalized to the new form (no dual-format data left behind).
Decisions
| Decision | Choice |
|---|---|
| Encode form | Native-typed JSON: numbers/bools unquoted, strings quoted, DateTime as ISO-8601 string |
| Decode | Read both old (array-of-strings) and new (native) forms — backward compatible |
| Existing data | Migrate to native form across MS SQL + site SQLite + on bundle import (Approach B, thorough) |
| MS SQL mechanism | Idempotent C# startup normalizer (not T-SQL — type-aware JSON re-emission is fragile in SQL) |
| Site SQLite mechanism | Active normalization in the InstanceActor override-load path (it already has the element type) |
| Bundles | Normalize on import (already-exported files are external/unreachable) |
Reality note: the List feature shipped this session and was not deployed to the docker cluster, so there is almost certainly zero old-form List data in any store yet. The migration is a safety net guaranteeing no dual-format data ever lingers, not a fix for existing broken data.
Architecture
1. Codec (AttributeValueCodec, Commons) — foundation
- Encode: only the list branch changes. Instead of mapping each element to an invariant string and then serializing, serialize the typed CLR collection directly (JsonSerializer.Serialize<object>(enumerable, …)) so System.Text.Json emits each element in its native JSON kind. STJ numbers/bools are culture-invariant by spec; DateTime serializes as round-trippable ISO-8601. Scalars (the string/IFormattable branches) are untouched.
- Decode: read both forms. Deserialize to JsonElement[] (instead of string?[]); for each element, feed ParseScalar either GetString() (JSON string element) or GetRawText() (number/bool element). So [10,20] and ["10","20"] both decode to List<int>{10,20}; ISO and old US-invariant DateTime strings both parse via the existing DateTime.Parse(…, RoundtripKind). A JSON null element still throws FormatException ("elements may not be null"), unchanged.
- The read-both Decode is also what makes the migration idempotent: re-encoding an already-native value yields identical bytes.
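The encode/decode rules above can be modelled in a few lines. The sketch below is a Python model of the described behaviour, not the C# AttributeValueCodec itself: the element-type names, the function names (encode_list, decode_list, parse_scalar), and the exact old US-invariant DateTime pattern are assumptions made for illustration.

```python
import json
from datetime import datetime

# Assumed pattern for the old US-invariant DateTime text (illustrative only).
OLD_DATETIME_FORMAT = "%m/%d/%Y %H:%M:%S"


def parse_scalar(text, element_type):
    """Parse one element from its text form (hypothetical type names)."""
    if element_type == "Int32":
        return int(text)
    if element_type == "Boolean":
        # Old form stored "True"/"False"; native raw text is "true"/"false".
        lowered = text.lower()
        if lowered not in ("true", "false"):
            raise ValueError(f"not a boolean: {text!r}")
        return lowered == "true"
    if element_type == "DateTime":
        try:
            return datetime.fromisoformat(text)  # new ISO-8601 form
        except ValueError:
            return datetime.strptime(text, OLD_DATETIME_FORMAT)  # old form
    if element_type == "String":
        return text
    raise ValueError(f"unsupported element type: {element_type}")


def decode_list(value, element_type):
    """Read both the old array-of-strings form and the native form."""
    items = json.loads(value)
    if not isinstance(items, list):
        raise ValueError("value is not a JSON array")
    result = []
    for item in items:
        if item is None:
            raise ValueError("elements may not be null")
        # String element: use its content (GetString analogue).
        # Number/bool element: use its raw JSON text (GetRawText analogue).
        text = item if isinstance(item, str) else json.dumps(item)
        result.append(parse_scalar(text, element_type))
    return result


def encode_list(items):
    """Emit native-typed JSON: numbers/bools unquoted, DateTime as ISO string."""
    native = [x.isoformat() if isinstance(x, datetime) else x for x in items]
    return json.dumps(native, separators=(",", ":"))
```

With this model, encode_list(decode_list(value, element_type)) returns the same string for an old-form value and its native equivalent, which is the property the normalizers in the following sections rely on.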
2. MS SQL — idempotent central startup normalizer
A normalization step is invoked once after dbContext.Database.MigrateAsync(...) in MigrationHelper.ApplyOrValidateMigrationsAsync (active central node only). For each List row:
- TemplateAttributes where DataType = 'List': read Value + the row's own ElementDataType; compute Encode(Decode(value, List, elementType)); if it differs from the stored string, UPDATE.
- InstanceAttributeOverrides for List attributes: these rows may have a null ElementDataType (it is currently informational; see follow-up #93/M3), so resolve the element type via the owning instance's template attribute (instance → TemplateId → TemplateAttribute by name → ElementDataType). Then re-encode as above.
Idempotent (native→native is a no-op UPDATE-skip), so the step is safe to leave in permanently and cheap on every subsequent startup (a scan, no writes). Per-row failures (malformed JSON, unresolved element type) are logged and skipped — normalization NEVER aborts startup (mirrors the audit/best-effort principle). The scan is bounded to List rows only.
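The control flow of the normalizer (re-encode, write only on difference, log and skip per-row failures) can be sketched as follows. This is an in-memory Python model under stated assumptions, not the C# startup step: rows are plain dicts, the field names (element_type, template_key) and function names are hypothetical, and reencode stands in for Encode(Decode(...)) for Int32 and Boolean lists only.

```python
import json
import logging

log = logging.getLogger("list_normalizer")


def reencode(value, element_type):
    """Stand-in for Encode(Decode(value, List, elementType)); Int32/Boolean only."""
    parsers = {
        "Int32": int,
        "Boolean": lambda s: {"true": True, "false": False}[s.lower()],
    }
    parse = parsers[element_type]  # KeyError for unsupported element types
    items = json.loads(value)
    if not isinstance(items, list):
        raise ValueError("value is not a JSON array")
    parsed = [parse(i if isinstance(i, str) else json.dumps(i)) for i in items]
    return json.dumps(parsed, separators=(",", ":"))


def normalize_list_rows(rows, template_element_types):
    """Rewrite old-form List values in place; return the number of rows updated.

    rows: dicts with "id", "value", "element_type" (None for override rows that
    lack it) and "template_key" (hypothetical lookup key into
    template_element_types, modelling instance -> template attribute).
    """
    updated = 0
    for row in rows:
        try:
            element_type = row["element_type"] or template_element_types[row["template_key"]]
            native = reencode(row["value"], element_type)
        except (ValueError, KeyError, TypeError) as exc:
            # Malformed JSON or unresolved element type: log, skip, keep going.
            log.warning("skipping row %s: %r", row["id"], exc)
            continue
        if native != row["value"]:
            row["value"] = native  # the real step would issue an UPDATE here
            updated += 1
    return updated
```

Running normalize_list_rows a second time over the same rows performs no writes, which models why the step can stay in the startup path permanently.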
3. Site SQLite — active normalization on override load
Site static-override values (SiteStorageService) are keyed by (instance, canonicalName) and carry no element type — the element type lives in the instance's flattened config. The natural normalization point is therefore the InstanceActor override-load path (HandleOverridesLoaded, added in MV-7), which already decodes both forms using the ResolvedAttribute's ElementDataType. Extend it so that when a List override's stored string is in old form (i.e. Encode(decoded) differs from the stored string), it re-persists the native form via SiteStorageService.SetStaticOverrideAsync. This actively normalizes on every instance load (site startup / failover), reuses the existing decode + element type, and is idempotent.
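A minimal sketch of the load-time behaviour, assuming a synchronous in-memory stand-in for SiteStorageService (the real SetStaticOverrideAsync is asynchronous). FakeSiteStorage, handle_overrides_loaded, the resolved_attributes shape, and the Int32-only codec helpers are all hypothetical names used for illustration.

```python
import json


class FakeSiteStorage:
    """In-memory stand-in keyed by (instance, canonical_name)."""

    def __init__(self, overrides):
        self.overrides = dict(overrides)
        self.writes = []  # records every re-persist call

    def set_static_override(self, instance, name, value):
        self.overrides[(instance, name)] = value
        self.writes.append((instance, name, value))


def decode_int_list(value):
    """Read-both decode, modelled for Int32 elements only."""
    items = json.loads(value)
    return [int(i if isinstance(i, str) else json.dumps(i)) for i in items]


def encode_list(items):
    return json.dumps(items, separators=(",", ":"))


def handle_overrides_loaded(instance, storage, resolved_attributes):
    """Apply stored overrides; re-persist List overrides found in old form."""
    applied = {}
    for (owner, name), stored in list(storage.overrides.items()):
        if owner != instance:
            continue
        attribute = resolved_attributes.get(name)
        if attribute is None or attribute["data_type"] != "List":
            applied[name] = stored  # scalar overrides are left untouched
            continue
        try:
            decoded = decode_int_list(stored)
        except (ValueError, TypeError):
            continue  # malformed value: skipped, the actor keeps running
        applied[name] = decoded
        native = encode_list(decoded)
        if native != stored:
            storage.set_static_override(instance, name, native)
    return applied
```

Because the write happens only when the re-encoded string differs from the stored one, a second load of the same instance records no further writes.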
4. Bundles — normalize on import
Already-exported .bundle files are external artifacts we cannot reach to rewrite; import already reads both forms via the codec. To ensure imported List values land in native form in the DB, the importer re-encodes List attribute Values through the codec when writing (and the MS SQL normalizer is a backstop on next startup). No file rewriting.
Error handling
- Decode of a genuinely malformed value still throws FormatException; the normalizers catch it per-row, log, and skip (no startup abort, no actor crash).
- The codec change is additive on the wire (gRPC string value field unchanged; List is a new type with no external wire consumer relying on the old quoted form).
Testing
- Codec: native-form encode per element type ([10,20,30], [true,false], ISO DateTime, strings stay ["a","b"]); old-form backward-compat decode (["10","20"] → List<int>); round-trip for every element type; malformed still throws; culture-invariance preserved.
- MS SQL normalizer: old-form row → rewritten to native; native row → untouched (idempotency); malformed row → skipped + logged, other rows still processed; override row element-type resolved via template attribute.
- Site SQLite / InstanceActor: an old-form List override on load → re-persisted native (assert SetStaticOverrideAsync called with native form); a native override → not re-persisted (idempotent); scalar overrides unaffected.
- Bundle import: importing an old-form bundle lands native-form Values in the DB.
Out of scope / follow-ups
- Deployed-config snapshot is a fourth, un-normalized List-value store (latent gap, I-1). DeployedConfigSnapshot.ConfigurationJson + RevisionHash freeze the flattened config at deploy time. The staleness/diff path (DeploymentService.GetDeploymentComparisonAsync → DiffService.AttributesEqual ordinal compare + RevisionHashService SHA over the raw Value) compares that frozen blob against a freshly-flattened (now native-form) config. If a List attribute was ever deployed in old-form, the snapshot stays old-form → a spurious "Changed" diff + false staleness flag until redeployed. This cannot fire against current data (no List attributes were ever deployed; see the Reality note), so it is recorded as a known latent gap, not fixed. If hardening is wanted before List attributes are deployed at scale: route the deserialized snapshot's List values through Decode → Encode in GetDeploymentComparisonAsync before the diff/hash (symmetric with the other normalizers).
- CLI template attribute help still illustrates --value with a quoted string-list example; add a native-form numeric example (e.g. [10,20]) so users don't hand-author quoted numbers that get silently re-normalized. Doc-only; the quoted form still decodes.
- Rewriting already-exported bundle files (unreachable).
- This pairs naturally with follow-up #93/M3 (populate InstanceAttributeOverride.ElementDataType on write); if done, the override normalizer could read the column directly instead of joining to the template attribute. Not required here.
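The deployed-config snapshot gap in the first item can be illustrated with a small model. This sketch assumes a SHA-256 over the raw value and an ordinal string compare, and uses hypothetical names (revision_hash, to_native_int_list, is_changed); the real RevisionHashService and DiffService may differ in detail.

```python
import hashlib
import json


def revision_hash(raw_value):
    """Hash over the raw stored string (SHA-256 assumed for illustration)."""
    return hashlib.sha256(raw_value.encode("utf-8")).hexdigest()


def to_native_int_list(raw_value):
    """Decode -> Encode for an Int32 list, accepting old and native forms."""
    items = json.loads(raw_value)
    parsed = [int(i if isinstance(i, str) else json.dumps(i)) for i in items]
    return json.dumps(parsed, separators=(",", ":"))


def is_changed(snapshot_value, fresh_value, normalize=None):
    """Ordinal compare plus hash compare, optionally after normalization."""
    if normalize is not None:
        snapshot_value = normalize(snapshot_value)
        fresh_value = normalize(fresh_value)
    return (snapshot_value != fresh_value
            or revision_hash(snapshot_value) != revision_hash(fresh_value))


# Same logical list, frozen in old form vs freshly flattened in native form.
old_snapshot = '["10","20"]'
fresh_config = "[10,20]"
spurious = is_changed(old_snapshot, fresh_config)                        # True
hardened = is_changed(old_snapshot, fresh_config, to_native_int_list)    # False
```

Normalizing both sides before the compare removes the spurious difference while a genuine change in the list contents is still reported.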