Apply Codex review findings across all 17 components

Template Engine: add composed member addressing (path-qualified canonical names),
override granularity per entity type, semantic validation (call targets, arg types),
graph acyclicity enforcement, revision hashes for flattened configs.

Deployment Manager: add deployment ID + idempotency, per-instance operation lock
covering all mutating commands, state transition matrix, site-side apply atomicity
(all-or-nothing), artifact version compatibility policy.

Site Runtime: add script trust model (forbidden APIs, execution timeout, constrained
compilation), concurrency/serialization rules (Instance Actor serializes mutations),
site-wide stream backpressure (per-subscriber buffering, fire-and-forget publish).

Communication: add application-level correlation IDs for protocol safety beyond
Akka.NET transport guarantees.

External System Gateway: add 408/429 as transient errors, CachedCall idempotency
note, dedicated dispatcher for blocking I/O isolation.

Health Monitoring: add monotonic sequence numbers to prevent stale report overwrites.

Security: require LDAPS/StartTLS for LDAP connections.

Central UI: add failover behavior (SignalR reconnect, JWT survives, shared Data
Protection keys, load balancer readiness).

Cluster Infrastructure: add down-if-alone=on for safe singleton ownership.

Site Event Logging: clarify active-node-only logging (no replication), add 1GB
storage cap with oldest-first purge.

Host: add readiness gating (health check endpoint, no traffic until operational).

Commons: add message contract versioning policy (additive-only evolution).

Configuration Database: add optimistic concurrency on deployment status records.
This commit is contained in:
Joseph Doherty
2026-03-16 09:06:12 -04:00
parent 70e5ae33d5
commit 34694adba2
13 changed files with 152 additions and 10 deletions

View File

@@ -94,9 +94,15 @@ Scripts choose between two call modes per invocation, mirroring the dual-mode da
- Each external system definition specifies a **timeout** that applies to all method calls on that system.
- Error classification by HTTP response:
- **Transient failures** (connection refused, timeout, HTTP 5xx): Behavior depends on call mode — `CachedCall` buffers for retry; `Call` returns error to script.
- **Permanent failures** (HTTP 4xx): Always returned to the calling script regardless of call mode. Logged to Site Event Logging.
- **Transient failures** (connection refused, timeout, HTTP 408, 429, 5xx): Behavior depends on call mode — `CachedCall` buffers for retry; `Call` returns error to script.
- **Permanent failures** (HTTP 4xx except 408/429): Always returned to the calling script regardless of call mode. Logged to Site Event Logging.
- This classification ensures the S&F buffer is not polluted with requests that will never succeed.
- **Idempotency note**: `CachedCall` retries may result in duplicate delivery if the external system received the original request but the response was lost. Callers should use `CachedCall` only for operations that are idempotent or where duplicate delivery is acceptable.
## Blocking I/O Isolation
- External system HTTP calls and database operations are blocking I/O. Script Execution Actors (which are short-lived, per-invocation actors) execute these calls, ensuring that blocking does not starve the parent Script Actor or Instance Actor.
- The Akka.NET actor system should configure a **dedicated dispatcher** for Script Execution Actors to isolate blocking I/O from the default dispatcher used by coordination actors.
## Database Connection Management