Merge M2: stillpending.md Tier-2 correctness & behavioral gaps (#7,#8,#9,#10,#13,#15,#17,#18,#20,#21,#22,#23,#24,#25,#26,#27,#28,#29,#30,#31,#32)
20 tasks (M2.0-M2.19), each through its classification-driven review chain. Full-solution build green (0 warnings, TreatWarningsAsErrors). Per-task targeted suites all passed. Known pre-existing: 2 partition-purge E2E failures (follow-up #52).
@@ -36,28 +36,28 @@ public class ApiMethod
    public int Id { get; set; }
    public string Name { get; set; } // route segment
    public string Script { get; set; } // Roslyn C# script body
    public string? ParameterDefinitions { get; set; } // JSON: List<ParameterDefinition>
    public string? ReturnDefinition { get; set; } // JSON: List<ReturnFieldDefinition>
    public string? ParameterDefinitions { get; set; } // JSON Schema (object) describing parameters
    public string? ReturnDefinition { get; set; } // JSON Schema describing the return value
    public int TimeoutSeconds { get; set; }
}
```
`ParameterDefinitions` and `ReturnDefinition` are stored as JSON strings to keep the schema simple; both are deserialized on every request by `ParameterValidator` and `ReturnValueValidator`.

`ParameterDefinitions` and `ReturnDefinition` are stored as JSON Schema strings (canonical form: `{"type":"object","properties":{…},"required":[…]}`, arrays via `"items"`); both are parsed on every request by `ParameterValidator` and `ReturnValueValidator` into a shared recursive `InboundApiSchema` (Commons). The legacy flat-array form (`[{name,type,required,itemType?}]`) is still accepted on read.

### Extended type system

Parameter and return field definitions share the same six-type vocabulary:

Parameter and return definitions share the same six-type vocabulary (JSON Schema type tokens in the second column):

| Type | JSON shape | C# value after coercion |
|-----------|------------------|-------------------------------|
| `Boolean` | `true` / `false` | `bool` |
| `Integer` | number (whole) | `long` |
| `Float` | number | `double` |
| `String` | string | `string` |
| `Object` | JSON object | `Dictionary<string, object?>` |
| `List` | JSON array | `List<object?>` |

| Type | JSON Schema token | JSON shape | C# value after coercion |
|-----------|-------------------|------------------|-------------------------------|
| `Boolean` | `boolean` | `true` / `false` | `bool` |
| `Integer` | `integer` | number (whole) | `long` |
| `Float` | `number` | number | `double` |
| `String` | `string` | string | `string` |
| `Object` | `object` | JSON object | `Dictionary<string, object?>` |
| `List` | `array` | JSON array | `List<object?>` |

`Object` and `List` are validated for JSON shape only — field-level or element-level type constraints are the script's responsibility. Template attributes use only the four primitive types; the extended types apply here and in the External System Gateway.

`Object` and `List` are validated **recursively**: a declared object validates each field against its declared (nested) type and rejects undeclared fields; a list validates every element against the declared `items` type. Scalars are checked at any depth and errors are path-qualified (e.g. `order.items[2].quantity`). A bare `{"type":"object"}` / `{"type":"array"}` (no `properties` / `items`) stays shape-only. Template attributes use only the four primitive types; the extended types apply here and in the External System Gateway.

## Architecture
@@ -0,0 +1,203 @@
# M2 — Correctness & Behavioral Gaps (Tier 2) Implementation Plan
> **For Claude:** REQUIRED SUB-SKILL: superpowers-extended-cc:subagent-driven-development. Execute task-by-task on branch `feature/stillpending-m2`, in-place (NOT a worktree — docker tooling builds from the repo path; implementers run **serially** to avoid racing the shared git index). Honor each task's `Classification` for the review chain.

**Goal:** Close the Tier-2 correctness/behavioral divergences from `stillpending.md` — make narrow/inert behaviors match the spec, and where the spec was the divergence, update it in the same slice.

**Architecture:** Touches the central Config DB (EF migrations), Site Runtime actors, the DCL alarm pipeline, the template validation/flattening pipeline, the deployment diff, Host startup validation, the Security cookie-auth pipeline, and Site Event Logging. Each item is independently shippable.

**Tech Stack:** C#/.NET 10, EF Core 10 (MS SQL central + SQLite site), Akka.NET 1.5, OPC UA (`OPCFoundation.NetStandard.Opc.Ua.Client`), ASP.NET Core cookie auth, xUnit/FluentAssertions/NSubstitute.

**Build/verify:** `dotnet build ZB.MOM.WW.ScadaBridge.slnx` (TreatWarningsAsErrors ON). Redeploy: `bash docker/deploy.sh`. Test user `--username multi-role --password password`.

---
## Scope decisions (recorded; per "use recommendations")
- **#19 (script started/completed events)** — already shipped in M1.8 (`e74c3ae`). **Excluded.**
- **#16 (Transport stale-instance enumeration)** — genuine Tier-2 gap but NOT in the approved M2 list, and the fix needs a non-trivial shared-script-hash staleness compute across instances. **Deferred to the Transport milestone (M8).** Tracked, not dropped.
- **#17 (MachineDataDb)** — a deliberate prior decision ("Host-008") removed this validation with a regression test asserting absence *passes*. The approved design doc says to add the option + startup validation, and both REQ-HOST-3/4 and the shipped docker `appsettings.Central.json` carry the key. **Resolution: implement per design doc (add option + central startup validation, no DbContext since nothing consumes it), reverting the Host-008 regression test and noting the reversal in the commit.**
- **#31 (StateTransitionValidator delete-from-NotDeployed)** — the audit claimed a "deliberate per code comment"; investigation found **no such comment**. **Reconcile by intent (git blame); default = align code to the spec matrix (remove `NotDeployed` from `CanDelete`) unless blame shows deliberate orphan-cleanup intent, in which case update the doc matrix.**
- **#8 (conditionFilter) semantics** — the filter is currently an undefined nullable string. **Define it as a comma-separated, case-insensitive list of alarm/condition *type names*; null/blank = mirror all.** Authoritative enforcement is **client-side in `DataConnectionActor` routing** (uniform across OPC UA + MxGateway, since MxGateway has no server-side filter); OPC UA additionally gets a server-side `WhereClause` as a bandwidth optimization where the type maps cleanly. Implementer confirms the discriminator field on `NativeAlarmTransition`.
- **#15 (LDAP re-query)** — highest risk; passwordless group re-query depends on a shared-lib capability that may not exist. **Spike first**, then ship the always-achievable layers (idle-timeout enforcement + DB role-mapping refresh on stored group claims) and the LDAP group re-query only if the lib supports a service-account search; document any residual limitation.
---
## Execution order & dependencies
Risk-first, migration-safe ordering. M2.0 (`#32`) runs first because it unblocks DB-backed verification. The two migration-touching tasks (M2.0 and M2.5) are serialized so the snapshot stays clean.

| # | Task | Class | Migration? |
|---|------|-------|-----------|
| M2.0 | #32 EF model/snapshot drift | high-risk | snapshot only |
| M2.1 | #22 native-alarm capability validation wired into deploy pipeline | standard | no |
| M2.2 | #10 connection-level diff surfaced | standard | no |
| M2.3 | #7 `Database.CachedWrite` transient/permanent classification | high-risk | no |
| M2.4 | #8 alarm `conditionFilter` applied | high-risk | no |
| M2.5 | #9 per-script execution timeout | standard | **yes** (new column) |
| M2.6 | #13 nested `Object`/`List` validation | standard | no |
| M2.7 | #20 + #21 return-type + argument-type compatibility | standard | no |
| M2.8 | #23 binding-completeness Error + "name exists at site" | standard | no |
| M2.9 | #17 MachineDataDb fail-fast | small | no |
| M2.10 | #18 CI grep-guard (data-layer scan test) | small | no |
| M2.11 | #24 debug snapshot unknown-instance → error | small | no |
| M2.12 | #25 recursion-limit → site event log | small | no |
| M2.13 | #27 OPC UA / MxGateway transition field population | small | no |
| M2.14 | #28 readiness "required singletons running" probe | standard | no |
| M2.15 | #29 site active-node purge-gate DI registration | small | no |
| M2.16 | #30 `FailedWriteCount` consumed by Health Monitoring | small | no |
| M2.17 | #31 StateTransitionValidator reconcile | small | no |
| M2.18 | #26 debug-stream ordering + replay/dedup | high-risk | no |
| M2.19 | #15 LDAP periodic re-query (spike + impl) | high-risk | no |

---
## Tasks
### M2.0 — #32: EF model/snapshot drift (PendingModelChangesWarning)
**Classification:** high-risk · **Files:** `src/ZB.MOM.WW.ScadaBridge.ConfigurationDatabase/Configurations/AuditLogEntityTypeConfiguration.cs:68-69`, `src/ZB.MOM.WW.ScadaBridge.ConfigurationDatabase/Migrations/ScadaBridgeDbContextModelSnapshot.cs`, possibly a new empty migration.
**Root cause:** `OccurredAtUtc` has `.HasConversion(UtcConverter)` in config; the model snapshot omits the converter annotation → EF throws `PendingModelChangesWarning` in `MsSqlMigrationFixture.MigrateAsync` (~57 AuditLog MSSQL tests fail in fixture ctor).
**Fix:** Run `dotnet ef migrations has-pending-model-changes` (or `migrations add`) against `ScadaBridgeDbContext` to surface the FULL drift (there may be more than `OccurredAtUtc`). Prefer the EF-canonical path: `dotnet ef migrations add ResyncAuditLogModelSnapshot` — **verify the generated migration's `Up`/`Down` are empty (no DDL)**; a value-converter-only change produces no DDL but realigns the snapshot. If non-empty/unexpected DDL appears, stop and report. Auto-apply is dev-only per CLAUDE.md, so an empty migration is harmless to prod.
**Tests:** Re-run `dotnet test tests/ZB.MOM.WW.ScadaBridge.AuditLog.Tests` (requires MSSQL via `cd infra && docker compose up -d`); the ~57 fixture-ctor failures must clear. If MSSQL is unreachable in this environment, confirm the build + the snapshot diff is empty-DDL and note the test gating.
**DoD:** No `PendingModelChangesWarning`; AuditLog MSSQL suite green (or gated-with-note if no DB). Adversarial: confirm no real schema change was smuggled in.
### M2.1 — #22: native-alarm-source capability validation wired into deploy pipeline
**Classification:** standard · **Files:** `src/.../DeploymentManager/FlatteningPipeline.cs:93,115`, `SemanticValidator.cs:30-33,239-245`, validation service call site.
**Gap (M1-era regression):** `FlatteningPipeline` loads `dataConnections` but never extracts the alarm-capable subset, so `SemanticValidator.Validate(...)` is always called with `alarmCapableConnectionNames = null` → native-alarm-source capability check never runs; a source can reference a non-alarm-capable connection and deploy.
**Fix:** In `FlatteningPipeline`, compute the alarm-capable connection-name set from the loaded connections (filter by the protocol/capability that maps to `IAlarmSubscribableConnection` — OPC UA + MxGateway), pass it into the validator. Confirm the capability predicate (protocol enum / adapter capability) is the same one DCL uses to decide `IAlarmSubscribableConnection`.
**Tests:** `tests/.../TemplateEngine.Tests` SemanticValidator/flattening — add: native-alarm source on a non-alarm-capable connection → validation Error; on a capable one → passes.
**DoD:** Deploy gate rejects native-alarm sources bound to non-capable connections.
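
The capable-set computation in the fix could look roughly like the sketch below. `ProtocolSketch`, `ConnectionSketch` and `AlarmCapableNames` are hypothetical names for illustration only, and the ordinal (case-sensitive) name comparison is an assumption; the real predicate should be whichever one DCL already uses to decide `IAlarmSubscribableConnection`.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins for the real protocol enum and connection type.
public enum ProtocolSketch { OpcUa, MxGateway, Other }

public sealed record ConnectionSketch(string Name, ProtocolSketch Protocol);

public static class AlarmCapabilitySketch
{
    // Returns the names of connections whose protocol supports alarm subscription.
    // Per the plan, that is OPC UA and MxGateway.
    public static IReadOnlySet<string> AlarmCapableNames(IEnumerable<ConnectionSketch> connections) =>
        connections
            .Where(c => c.Protocol is ProtocolSketch.OpcUa or ProtocolSketch.MxGateway)
            .Select(c => c.Name)
            .ToHashSet(StringComparer.Ordinal); // assumption: names compare case-sensitively
}
```

The resulting set is what would be passed as `alarmCapableConnectionNames` instead of `null`.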
### M2.2 — #10: connection-level diff surfaced in deployment diff
**Classification:** standard · **Files:** `src/.../Commons/Types/Flattening/ConfigurationDiff.cs:7-24`, `src/.../TemplateEngine/Flattening/DiffService.cs:19-54,174-204`, Central UI diff render (`CentralUI/Components/Shared/DiffDialog.razor` caller / deployment preview page).
**Gap:** `ComputeConnectionsDiff` exists **with tests** but is dead (no callers); `ConfigurationDiff` has no `ConnectionChanges` slot; `HasChanges` ignores connections.
**Fix:** Add `ConnectionChanges` slot (`IReadOnlyList<DiffEntry<ConnectionConfig>>` — the element type already exists) to `ConfigurationDiff`; include it in `HasChanges`. Call `ComputeConnectionsDiff` from `ComputeDiff` and populate the slot. Surface in the deployment-diff UI alongside attribute/alarm/script changes (connection name + old/new protocol + endpoint config). Wire the existing `ComputeConnectionsDiff` tests' expectations through `ComputeDiff` too.
**Tests:** `tests/.../TemplateEngine.Tests/Flattening/DiffServiceTests.cs` — add a `ComputeDiff` integration assertion that `ConnectionChanges` populates and `HasChanges` is true when only a connection differs.
**DoD:** Standalone connection endpoint/protocol/failover drift appears in the deployment diff.
### M2.3 — #7: `Database.CachedWrite` classifies transient vs permanent SQL errors
**Classification:** high-risk · **Files:** `src/ZB.MOM.WW.ScadaBridge.ExternalSystemGateway/DatabaseGateway.cs:78-204`, new `SqlErrorClassifier.cs` + `PermanentDatabaseException`, reference `ExternalSystemClient.cs:80-162` + `ErrorClassifier.cs`.
**Gap:** `CachedWriteAsync` buffers ALL writes without an immediate attempt; `DeliverBufferedAsync` throws on any `SqlException` → S&F retries permanent errors forever; the script never gets a synchronous `Failed`. The API path (`ExternalSystemClient`) does it right.
**Fix (mirror API path):** Add `SqlErrorClassifier.IsTransient(SqlException)` — transient = connection/timeout/deadlock/throttle error numbers (e.g. `-2, 64, 53, 233, 1205, 40197, 40501, 40613, 49918-49920`); permanent = constraint/syntax/permission/etc. Create `PermanentDatabaseException` (parallel to `PermanentExternalSystemException`). In `CachedWriteAsync`: attempt immediately; on success done; on permanent → return `Failed` synchronously (set the tracking row terminal `Failed`) and do NOT buffer; on transient → buffer to S&F. In `DeliverBufferedAsync`: classify on `SqlException`, return `false` (park) for permanent, rethrow for transient (S&F retries). Keep behavior unified with `TrackedOperationId`/`OperationTrackingStore` and the `Pending → Retrying → Delivered/Parked/Failed/Discarded` lifecycle.
**Tests:** `tests/.../ExternalSystemGateway.Tests/DatabaseGatewayTests.cs` — transient SQL (deadlock 1205, timeout -2) → buffers/retries; permanent SQL (constraint 2627, syntax 102, permission 229) → synchronous `Failed`, not buffered; `DeliverBuffered` parks on permanent. Adversarial: ambiguous error numbers default to the safer classification (document which).
**DoD:** Permanent SQL errors fail fast to the script as `Failed`; only transient errors buffer.
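
A minimal sketch of the classification step, assuming it operates on the raw SQL Server error number so the example stays free of a `SqlException` dependency. The class name is hypothetical, the number list is only the set of examples given in the fix, and treating every unlisted number as permanent is an assumption (the plan leaves the default for ambiguous numbers to be documented by the implementer).

```csharp
using System.Collections.Generic;

// Hypothetical sketch: classifies by SQL Server error number only.
public static class SqlErrorClassifierSketch
{
    // Example transient numbers from the plan: connection, timeout,
    // deadlock, and throttling errors.
    private static readonly HashSet<int> TransientNumbers = new()
    {
        -2, 64, 53, 233, 1205, 40197, 40501, 40613, 49918, 49919, 49920,
    };

    // Assumption: anything not listed is treated as permanent, so an
    // unknown error fails fast instead of being retried indefinitely.
    public static bool IsTransient(int errorNumber) =>
        TransientNumbers.Contains(errorNumber);
}
```

In the real classifier the same check would presumably be applied to the error number(s) carried by the caught `SqlException`.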
### M2.4 — #8: alarm `conditionFilter` applied (OPC UA WhereClause + client-side routing)
**Classification:** high-risk · **Files:** `src/.../DataConnectionLayer/Actors/DataConnectionActor.cs:1482,1540-1554`, `Adapters/RealOpcUaClient.cs:242,295-310`, `Adapters/MxGatewayDataConnection.cs:154-167`, `IAlarmSubscribableConnection.cs`.
**Decision (semantics):** filter = comma-separated, case-insensitive list of alarm/condition **type names**; null/blank = mirror all. **Authoritative gate = client-side in `DataConnectionActor.HandleAlarmTransitionReceived`** (after source-ref match, drop transitions whose type name isn't in the source's filter set). Store the per-source filter set correctly (the current `_alarmSourceFilter[...]` keying is wrong — key by source reference). OPC UA additionally builds a server-side `WhereClause` in `RealOpcUaClient` as an optimization where the condition type maps cleanly; MxGateway relies solely on the client-side gate.
**Fix:** (1) Parse the filter string into a normalized set at subscribe time, keyed by source ref. (2) In routing, consult the set and skip non-matching transitions. (3) In `RealOpcUaClient.BuildAlarmEventFilter`, attach a `WhereClause` (ContentFilter on the condition/event type) built from the filter when present. Confirm `NativeAlarmTransition` exposes a usable type-name discriminator; if not, filter on the available field and note it.
**Tests:** `tests/.../DataConnectionLayer.Tests/DataConnectionActorAlarmTests.cs` — filter set → only matching-type transitions delivered; null → all delivered; MxGateway path filters client-side; OPC UA builds a non-empty WhereClause. Adversarial: case/whitespace variations in the filter list.
**DoD:** Setting a conditionFilter actually restricts mirrored conditions across both adapters.
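
The filter semantics decided above (comma-separated, case-insensitive, null/blank mirrors all) can be sketched as a small parse-and-match helper. The names `ConditionFilterSketch`, `Parse` and `ShouldDeliver` are hypothetical, and two behaviors are assumptions rather than decisions from the plan: a filter made only of separators and whitespace is treated like a blank filter, and a transition with no type name is dropped when a filter is set.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

// Hypothetical sketch of the client-side gate's filter handling.
public static class ConditionFilterSketch
{
    // Returns null for "mirror all"; otherwise a case-insensitive set of type names.
    public static HashSet<string>? Parse(string? filter)
    {
        if (string.IsNullOrWhiteSpace(filter)) return null;

        var names = filter.Split(
            ',',
            StringSplitOptions.RemoveEmptyEntries | StringSplitOptions.TrimEntries);

        // Assumption: a filter with no usable names behaves like a blank filter.
        return names.Length == 0
            ? null
            : new HashSet<string>(names, StringComparer.OrdinalIgnoreCase);
    }

    // True when the transition should be routed to the source.
    public static bool ShouldDeliver(HashSet<string>? filterSet, string? typeName) =>
        filterSet is null
        || (typeName is not null && filterSet.Contains(typeName.Trim()));
}
```

Parsing once at subscribe time and storing the set per source reference keeps the per-transition check to a single hash lookup.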
### M2.5 — #9: per-script execution timeout
**Classification:** standard · **Migration: yes.** · **Files:** `Commons/Entities/Templates/TemplateScript.cs`, `ConfigurationDatabase/Configurations/TemplateConfiguration.cs` (`TemplateScriptConfiguration`), **new EF migration**, `Commons/Types/Flattening/FlattenedConfiguration.cs` (`ResolvedScript`), `TemplateEngine/Flattening/FlatteningService.cs` (`ResolveInheritedScripts`), `SiteRuntime/Actors/ScriptActor.cs`, `ScriptExecutionActor.cs:100`, `AlarmExecutionActor.cs:66`, `SiteRuntimeOptions.cs:31` (global fallback unchanged).
**Gap:** Only a global `ScriptExecutionTimeoutSeconds`; no per-script field. Mirror the existing nullable `MinTimeBetweenRuns` pattern end-to-end.
**Fix:** Add `int? ExecutionTimeoutSeconds` to `TemplateScript` + EF config (nullable) + **migration** (runs after M2.0 so the snapshot is clean) + `ResolvedScript` + flattening map + `ScriptActor` field; pass it into `ScriptExecutionActor`/`AlarmExecutionActor`, which compute `effective = perScript ?? options.ScriptExecutionTimeoutSeconds`. Validate non-negative.
**Tests:** flattening test (field threads through), actor test (per-script override vs global default both enforce the CTS timeout), EF round-trip test.
**DoD:** A per-script timeout overrides the global; absent → global default.
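
The `effective = perScript ?? options.ScriptExecutionTimeoutSeconds` rule can be illustrated as follows. This is a sketch with hypothetical names (`ScriptTimeoutSketch`, `EffectiveTimeout`, `CreateTimeoutSource`); how the actors actually build their cancellation source is not shown in the plan.

```csharp
using System;
using System.Threading;

// Hypothetical sketch of the per-script override with a global fallback.
public static class ScriptTimeoutSketch
{
    public static TimeSpan EffectiveTimeout(int? perScriptSeconds, int globalSeconds)
    {
        // The plan asks for non-negative validation of the per-script value.
        if (perScriptSeconds is < 0)
            throw new ArgumentOutOfRangeException(nameof(perScriptSeconds));

        // A null per-script value falls back to the global option.
        return TimeSpan.FromSeconds(perScriptSeconds ?? globalSeconds);
    }

    // The returned source cancels its token once the effective timeout elapses.
    public static CancellationTokenSource CreateTimeoutSource(int? perScriptSeconds, int globalSeconds) =>
        new CancellationTokenSource(EffectiveTimeout(perScriptSeconds, globalSeconds));
}
```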
### M2.6 — #13: nested `Object`/`List` extended-type validation
**Classification:** standard · **Files:** `src/.../InboundAPI/.../ParameterValidator.cs:109-145`, `ReturnValueValidator.cs:18`.
**Gap:** `Object`/`List` are shape-validated only (object-vs-array); no nested/field-level type validation.
**Fix:** Recursive descent through the declared `Object` field schema / `List` element type, type-checking each level (scalars by extended-type, nested Object/List recursively). Reuse the existing extended-type system; keep error messages path-qualified (`field.sub[2].x`). Apply symmetrically in both validators.
**Tests:** `tests/.../InboundAPI.Tests` — valid nested payload passes; wrong scalar type at depth, wrong list element type, missing required nested field → rejected with path.
**DoD:** Nested type mismatches are caught at inbound validation, not at script runtime. (Satisfies the M4 cross-reference to this item.)
**Status: complete.** A shared recursive engine, `Commons.Types.InboundApi.InboundApiSchema` (parse + path-qualified `Validate`), backs both validators so they cannot drift. Key finding: the canonical persisted/authored format is **JSON Schema** (object `properties` + `required`, array `items`) — produced by the Central UI schema builder and the `MigrateParametersToJsonSchema` migration — but the validators still parsed the *legacy flat array* `[{name,type}]` and only shape-checked `Object`/`List`. They could not even consume a migrated JSON-Schema-object definition (the `Deserialize<List<…>>` would fail). Rewriting both to read `InboundApiSchema` fixes that latent format mismatch *and* delivers true nested validation; the legacy flat array is still accepted on read (case-insensitive keys) for transition safety. **Undeclared-field policy: reject at every level** (a declared object rejects any field not in its `properties`, consistent with the existing top-level `InboundAPI-010` "unexpected parameter" rejection); a bare `{"type":"object"}` with no declared fields stays shape-only. A present-but-null value satisfies any type; only the *absence* of a required field is an error.
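
To make the recursive descent concrete, here is a simplified sketch over `System.Text.Json`. It is not the real `InboundApiSchema` API: `NestedSchemaSketch` and its members are hypothetical, it reads the JSON Schema directly instead of parsing it into a model, it assumes every schema node carries a `type`, and it ignores the legacy flat-array form. It does follow the policies stated above: undeclared fields are rejected at every level, bare objects and arrays stay shape-only, and a present-but-null value satisfies any type.

```csharp
#nullable enable
using System.Collections.Generic;
using System.Text.Json;

// Hypothetical sketch; not the real InboundApiSchema API.
public static class NestedSchemaSketch
{
    public static List<string> Validate(JsonElement schema, JsonElement value)
    {
        var errors = new List<string>();
        Check(schema, value, "", errors);
        return errors;
    }

    private static string Join(string path, string name) =>
        path.Length == 0 ? name : $"{path}.{name}";

    private static void Check(JsonElement schema, JsonElement value, string path, List<string> errors)
    {
        // A present-but-null value satisfies any declared type.
        if (value.ValueKind == JsonValueKind.Null) return;

        switch (schema.GetProperty("type").GetString())
        {
            case "boolean":
                if (value.ValueKind is not (JsonValueKind.True or JsonValueKind.False))
                    errors.Add($"{path}: expected boolean");
                break;
            case "integer":
                if (value.ValueKind != JsonValueKind.Number || !value.TryGetInt64(out _))
                    errors.Add($"{path}: expected integer");
                break;
            case "number":
                if (value.ValueKind != JsonValueKind.Number)
                    errors.Add($"{path}: expected number");
                break;
            case "string":
                if (value.ValueKind != JsonValueKind.String)
                    errors.Add($"{path}: expected string");
                break;
            case "object":
                CheckObject(schema, value, path, errors);
                break;
            case "array":
                CheckArray(schema, value, path, errors);
                break;
        }
    }

    private static void CheckObject(JsonElement schema, JsonElement value, string path, List<string> errors)
    {
        if (value.ValueKind != JsonValueKind.Object) { errors.Add($"{path}: expected object"); return; }
        // A bare object schema (no "properties") stays shape-only.
        if (!schema.TryGetProperty("properties", out var props)) return;

        foreach (var field in value.EnumerateObject())
        {
            if (props.TryGetProperty(field.Name, out var fieldSchema))
                Check(fieldSchema, field.Value, Join(path, field.Name), errors);
            else
                errors.Add($"{Join(path, field.Name)}: undeclared field");
        }

        if (!schema.TryGetProperty("required", out var required)) return;
        foreach (var name in required.EnumerateArray())
        {
            var key = name.GetString();
            if (key is not null && !value.TryGetProperty(key, out _))
                errors.Add($"{Join(path, key)}: required field missing");
        }
    }

    private static void CheckArray(JsonElement schema, JsonElement value, string path, List<string> errors)
    {
        if (value.ValueKind != JsonValueKind.Array) { errors.Add($"{path}: expected array"); return; }
        // A bare array schema (no "items") stays shape-only.
        if (!schema.TryGetProperty("items", out var items)) return;

        var index = 0;
        foreach (var element in value.EnumerateArray())
            Check(items, element, $"{path}[{index++}]", errors);
    }
}
```

Because one routine serves both parameters and return values, the two validators cannot drift apart, which is the same motivation given for the shared engine.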
### M2.7 — #20 + #21: return-type + argument-type compatibility checks
**Classification:** standard · **Files:** `src/.../TemplateEngine/Validation/SemanticValidator.cs:62-63,251-266,279-287,390-425`.
**Gap:** `BuildReturnMap` builds maps never read (no return-type comparison); call validation checks arg *count* only (comma counting), not arg *types*.
**Fix:** #20 — compare a call site's used-return against the target script's declared `ReturnDefinition`; flag incompatible use. #21 — extract/infer argument types at the call site and check each against the parameter definition (count + type). These share `SemanticValidator` — implement together. Be conservative: only flag clear mismatches (avoid false positives on dynamically-typed expressions); document the inference limits.
**Tests:** `tests/.../TemplateEngine.Tests` SemanticValidator — return-type mismatch flagged; arg type mismatch flagged; correct calls pass; dynamic/unknown types don't false-positive.
**DoD:** Type-incompatible script calls fail validation, not just count-mismatched ones.
### M2.8 — #23: connection-binding completeness as deploy-gating Error + "name exists at site"
**Classification:** standard · **Files:** `src/.../TemplateEngine/Validation/ValidationService.cs:504-519`, `ValidationResult.cs:9`.
**Gap:** Missing-binding for a data-sourced attribute is a non-blocking Warning (so `IsValid` stays true); the "connection name exists at the target site" half is missing.
**Fix:** Elevate binding-completeness to Error (or add a parallel Error-level check) so a deployment with unresolved bindings fails the gate; add the "binding references a connection that exists on the target site" check (resolve by site connection, not just name presence). Confirm this doesn't break legitimately-unbound attributes (static/non-data-sourced) — only data-sourced attributes require a binding.
**Tests:** `tests/.../TemplateEngine.Tests` ValidationService — data-sourced attribute with no binding → Error + `IsValid` false; binding to a non-existent site connection → Error; static attribute without binding → OK.
**DoD:** Incomplete/invalid connection bindings block deploy.
### M2.9 — #17: MachineDataDb fail-fast (per design doc; reverts Host-008)
**Classification:** small · **Files:** `src/ZB.MOM.WW.ScadaBridge.Host/DatabaseOptions.cs:6-12`, `StartupValidator.cs:59-62`, `tests/.../Host.Tests/StartupValidatorTests.cs` (the `Central_MissingMachineDataDb_PassesValidation` regression).
**Fix:** Add `string? MachineDataDb` to `DatabaseOptions`; add a Central-only `Require("ScadaBridge:Database:MachineDataDb", non-empty, ...)` in `StartupValidator`. **No DbContext** (nothing consumes it). Revert the Host-008 regression test to expect failure when missing; add `MachineDataDb` to `ValidCentralConfig()`. Commit message must note the deliberate Host-008 reversal and cite REQ-HOST-3/4 + shipped docker appsettings as justification.
**Tests:** `StartupValidatorTests` — Central missing MachineDataDb → fails; present → passes; Site role unaffected.
**DoD:** Central nodes fail fast on empty MachineDataDb; spec REQ-HOST-4 satisfied.
### M2.10 — #18: CI grep-guard against UPDATE/DELETE on AuditLog
**Classification:** small · **Files:** new guard test in `tests/ZB.MOM.WW.ScadaBridge.ConfigurationDatabase.Tests/` (the only thing that actually runs — no CI service exists; build is Docker-only).
**Fix:** Add a test that scans the ConfigurationDatabase source tree (and migration SQL) for `UPDATE`/`DELETE` statements targeting `AuditLog`, failing if any are found in C# data-access code. Scope strictly to the `AuditLog` table (allow purge/delete on Notifications/SiteCalls and partition-switch DDL). This backstops the existing DB-role `DENY UPDATE/DELETE` (migration `20260602174346`). Optionally add an MSBuild target mirroring it, but the test is the enforced control.
**Tests:** the guard test itself; verify it passes on the current clean source and would fail on a planted violation (assert this with a unit test on the scanner helper).
**DoD:** A code-level guard fails the test run on AuditLog mutations.
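
One possible shape for the scanner helper is a single regular expression applied to each source file's text. The class name and the pattern are assumptions made for illustration; the sketch does not strip comments or string literals, so a real guard may need an allow-list for intentional mentions.

```csharp
using System.Text.RegularExpressions;

// Hypothetical scanner helper for the guard test.
public static class AuditLogMutationScannerSketch
{
    // Matches UPDATE / DELETE [FROM] statements whose target is AuditLog,
    // with optional schema prefix and square brackets. The negative
    // lookahead keeps longer names such as AuditLogArchive from matching.
    private static readonly Regex Mutation = new(
        @"\b(UPDATE|DELETE(\s+FROM)?)\s+(\[?\w+\]?\.)?\[?AuditLog\]?(?!\w)",
        RegexOptions.IgnoreCase | RegexOptions.CultureInvariant);

    public static bool ContainsAuditLogMutation(string sourceText) =>
        Mutation.IsMatch(sourceText);
}
```

The guard test would then enumerate the source tree, read each file, and fail on the first file for which `ContainsAuditLogMutation` returns true. A `DENY UPDATE, DELETE ON ...` statement does not match this pattern, so the existing DB-role migration would not trip it.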
### M2.11 — #24: debug snapshot/subscribe for unknown instance returns an error
**Classification:** small · **Files:** `src/.../DeploymentManager/.../DeploymentManagerActor.cs:845-866`.
**Gap:** Unknown-instance snapshot/subscribe returns an empty snapshot — caller can't distinguish "not deployed" from "deployed-but-empty".
**Fix:** Check instance registration first; return an explicit "instance not found"/not-deployed error response (matching the existing debug response contract) instead of an empty snapshot.
**Tests:** `tests/.../DeploymentManager` (or SiteRuntime) — unknown instance → error response; known empty instance → empty snapshot (unchanged).
**DoD:** Unknown-instance debug requests are distinguishable from empty ones.
### M2.12 — #25: recursion-limit error → site event log
**Classification:** small · **Files:** `src/.../SiteRuntime/.../ScriptRuntimeContext.cs:302-305,464-466` (thread `ISiteEventLogger` in, mirroring M1.8's `ScriptExecutionActor` wiring).
**Fix:** Inject `ISiteEventLogger` into `ScriptRuntimeContext`; on recursion-limit violation, emit a `script` Error site event (fire-and-forget `_ = logger?.LogEventAsync(...)`) in addition to the existing `ILogger` log, at both check sites.
**Tests:** `tests/.../SiteRuntime.Tests` — recursion-limit hit emits a site event with category `script`, severity Error.
**DoD:** Recursion-limit violations appear in the site event log per spec.
### M2.13 — #27: populate obtainable OPC UA / MxGateway transition fields
**Classification:** small · **Files:** `src/.../DataConnectionLayer/Adapters/RealOpcUaClient.cs:395-403`, `MxGatewayAlarmMapper.cs:79-113`.
**Fix:** Populate fields that are genuinely obtainable: for OPC UA A&C, add SelectClauses + map Category, Description, OriginalRaiseTime where the server exposes them (extend `BuildAlarmEventFilter`'s SelectClauses); for MxGateway, extract `OperatorUser` (present in the event but dropped) and any available Current/Limit values. Leave truly-unavailable fields empty and document which are unavailable-by-protocol vs left-empty.
**Tests:** `tests/.../DataConnectionLayer.Tests` mapper tests — obtainable fields populate from a representative event; unavailable fields documented.
**DoD:** Display fields populate where the source provides them.
### M2.14 — #28: readiness gate checks required cluster singletons
**Classification:** standard · **Files:** `src/.../Host/Program.cs:188-201,314-317`, new health check (peer to `AkkaClusterHealthCheck.cs`).
**Gap:** Readiness covers membership + DB connectivity only; spec wants "required singletons running".
**Fix:** Add a `Ready`-tagged health check that, on the active central node, verifies each required singleton proxy is reachable (e.g. `NotificationOutboxActor`, `AuditLogIngestActor`, `SiteCallAuditActor`, `AuditLogPurgeActor`, `SiteAuditReconciliationActor`) via a short `Ask`/Identify with timeout; degrade to Unhealthy if a required singleton is unreachable. Respect the "(if applicable)" softening — only gate on singletons that should be running for this node's role. Keep the probe cheap (cache/identify, short timeout) so readiness polling stays fast.
**Tests:** `tests/.../Host.Tests` or IntegrationTests — health check reports Unhealthy when a required singleton proxy is absent; Healthy when present. Avoid flakiness (use Identify with a bounded timeout).
**DoD:** `/health/ready` reflects singleton health.
### M2.15 — #29: register the site active-node purge gate
**Classification:** small · **Files:** `src/.../SiteEventLogging/ServiceCollectionExtensions.cs:33-37`, site service registration / cluster setup.
**Gap:** `SiteEventLogActiveNodeCheck` is consulted by `EventLogPurgeService` but no implementation is registered on the site node → purge runs on standby too (defaults to `() => true`).
**Fix:** Register a `SiteEventLogActiveNodeCheck` delegate on the site node that returns true only when this node is the cluster leader/active (mirror how central gates active-node work). Keep the default-true behavior when no delegate is registered, so non-clustered test hosts still purge.
**Tests:** `tests/.../SiteEventLogging.Tests` — purge gated off on standby, on for active; default-true preserved when unregistered.
**DoD:** Site event-log purge runs only on the active node.
### M2.16 — #30: Health Monitoring consumes `FailedWriteCount`
**Classification:** small · **Files:** `src/.../SiteEventLogging/ISiteEventLogger.cs:32-40`, Health Monitoring metric path.
**Fix:** Wire `FailedWriteCount` into the site health metrics the same way other site metrics are collected/reported (find the existing site metric collection path), so the dangling metric is consumed (surface as a health metric / threshold). Keep it raw-count per the health-reporting conventions.
**Tests:** `tests/.../HealthMonitoring`/SiteEventLogging — failed writes increment the reported metric.
**DoD:** `FailedWriteCount` reaches Health Monitoring.
### M2.17 — #31: reconcile StateTransitionValidator delete-from-NotDeployed
**Classification:** small · **Files:** `src/.../DeploymentManager/.../StateTransitionValidator.cs:38-41`, possibly `docs/requirements/Component-DeploymentManager.md` (spec matrix).
**Fix:** `git blame`/log the `CanDelete` line to recover intent. Default: **align code to the spec matrix** — remove `NotDeployed` from the allowed delete states, add a clarifying comment — UNLESS history shows deliberate orphan-cleanup intent, in which case update the spec matrix (Delete from NotDeployed = Yes, with a no-op-cleanup note) instead. Whichever direction, code and doc must agree at the end.
**Tests:** `tests/.../DeploymentManager` StateTransitionValidator — the chosen rule is asserted.
**DoD:** Code and spec matrix agree on delete-from-NotDeployed.
### M2.18 — #26: debug-stream stream-first ordering + replay/dedup
**Classification:** high-risk · **Files:** `src/.../DebugStreamBridgeActor.cs:89-103,163-166`.
**Gap:** `PreStart` sends the snapshot first, then opens the gRPC stream → events in the gap window are lost. Spec wants stream-first + replay with timestamp dedup.
**Fix:** Open the gRPC subscription FIRST (buffer incoming events), then fetch+send the snapshot, then flush buffered events, deduping by timestamp/identity against the snapshot so no gap-window event is lost or double-delivered. Preserve ordering. This is a re-arch of the actor's PreStart lifecycle — keep the existing message contract.
**Tests:** `tests/.../` DebugStreamBridgeActor — an event arriving during the snapshot window is delivered exactly once after the snapshot; ordering preserved; dedup drops the snapshot-overlapping event.
|
||||
**DoD:** No gap-window events lost; no duplicates.
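
**Sketch (illustrative, not the shipped actor code):** the buffer-then-flush ordering described in the Fix can be modelled as a small state holder. `DebugEvent`, `StreamFirstBuffer`, and the "snapshot timestamp per key" shape are hypothetical names assumed for this example; the real message contract lives in `DebugStreamBridgeActor`.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical event shape, assumed for illustration only.
public sealed record DebugEvent(string Key, DateTimeOffset Timestamp, string Value);

public sealed class StreamFirstBuffer
{
    private readonly List<DebugEvent> _buffered = new();
    private bool _snapshotDelivered;

    // Called for every event from the already-open subscription.
    // Returns the events to forward right now: nothing while the
    // snapshot is still pending, the event itself afterwards.
    public IReadOnlyList<DebugEvent> OnStreamEvent(DebugEvent e)
    {
        if (_snapshotDelivered)
            return new[] { e };

        _buffered.Add(e);
        return Array.Empty<DebugEvent>();
    }

    // Called once, right after the snapshot has been sent downstream.
    // Returns the buffered events to flush, in arrival order, minus any
    // event the snapshot already covers (same key, timestamp not newer).
    public IReadOnlyList<DebugEvent> OnSnapshot(
        IReadOnlyDictionary<string, DateTimeOffset> snapshotTimestamps)
    {
        var flush = new List<DebugEvent>();
        foreach (var e in _buffered)
        {
            var covered = snapshotTimestamps.TryGetValue(e.Key, out var ts)
                          && e.Timestamp <= ts;
            if (!covered)
                flush.Add(e);
        }

        _buffered.Clear();
        _snapshotDelivered = true;
        return flush;
    }
}
```

The sketch dedups only the buffered window; whether live events that arrive after the flush also need a timestamp check depends on the stream's ordering guarantees, which this example does not assume.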
|
||||
|
||||
### M2.19 — #15: LDAP periodic re-query for interactive sessions (SECURITY)
|
||||
**Classification:** high-risk · **Files:** `src/.../Security/ServiceCollectionExtensions.cs:86-148` (cookie events), `JwtTokenService.cs` (wire the unused `IsIdleTimedOut`/`ShouldRefresh`/`RecordActivity`/`RefreshToken`), `RoleMapper.cs`, LDAP service interface, `CentralUI/Auth/AuthEndpoints.cs` (claims-build parity).
|
||||
**Spike first:** Determine whether the shared `ZB.MOM.WW.Auth.Ldap` lib exposes a **passwordless service-account group search** for an already-authenticated username. Report the answer before building the LDAP leg.
|
||||
**Fix (layered):**
|
||||
1. **Always achievable** — add `CookieAuthenticationEvents.OnValidatePrincipal` that: enforces idle-timeout (reject/sign-out past 30-min idle, advance last-activity on use), and refreshes role claims by **re-running `RoleMapper` on the stored group claims** (picks up central role-mapping changes without LDAP). Stamp a `LastLdapCheck` claim.
|
||||
2. **If the lib supports passwordless group search** — when `LastLdapCheck` is >15 min old, re-query LDAP groups via the service-account search, re-map roles, update role/site claims. **On LDAP failure: keep existing roles, do NOT sign out** (per "LDAP failure: new logins fail; active sessions continue with current roles"). If the lib does NOT support it, ship layer 1 and document the residual limitation (group-membership changes picked up only at next login) in the security doc.
|
||||
Rebuild claims identically to `/auth/login` (same claim types). Use the cookie-only model (embedded-JWT is dispositioned doc-only in M4).
|
||||
**Tests (incl. adversarial):** idle-timeout enforced; role-mapping change reflected without LDAP; LDAP-down on re-query keeps existing roles (no sign-out); >15-min triggers re-query, <15-min skips (TTL respected); a revoked-group user loses roles after re-query (if LDAP leg shipped).
|
||||
**DoD:** Interactive sessions enforce idle-timeout and refresh roles per the documented policy; any residual LDAP-dependency limitation is documented.
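
**Sketch (illustrative):** the layer-1 decision made on each validated request can be reduced to a pure function, which keeps the idle-timeout and TTL rules easy to unit-test. `SessionDecision` and `SessionPolicy` are hypothetical names assumed for this example; the real handler hangs off `CookieAuthenticationEvents.OnValidatePrincipal` and reads the timestamps from claims.

```csharp
using System;

public enum SessionDecision { RejectIdle, RefreshRoles, Keep }

public static class SessionPolicy
{
    // Idle timeout is checked first: an idle session is rejected even if a
    // role refresh would also be due. Otherwise refresh once the anchor is
    // older than the threshold, and keep the principal as-is in all other cases.
    public static SessionDecision Evaluate(
        DateTimeOffset now,
        DateTimeOffset lastActivity,
        DateTimeOffset lastRoleRefresh,
        TimeSpan idleTimeout,
        TimeSpan refreshThreshold)
    {
        if (now - lastActivity > idleTimeout)
            return SessionDecision.RejectIdle;

        if (now - lastRoleRefresh > refreshThreshold)
            return SessionDecision.RefreshRoles;

        return SessionDecision.Keep;
    }
}
```

With the documented defaults this would be called with `TimeSpan.FromMinutes(30)` and `TimeSpan.FromMinutes(15)`; advancing the last-activity anchor on `Keep`/`RefreshRoles` is left to the caller.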
|
||||
|
||||
---
|
||||
|
||||
## Cross-cutting
|
||||
|
||||
- `dotnet build ZB.MOM.WW.ScadaBridge.slnx` green (TreatWarningsAsErrors); relevant unit/integration tests pass per task.
|
||||
- MSSQL-backed tests need `cd infra && docker compose up -d`; if the database is unavailable, gate those tests and note the skip (M2.0 especially).
|
||||
- Migration tasks (M2.0, M2.5) serialized; M2.0 first.
|
||||
- `git diff` review before each commit; design-summary commit messages; one logical slice per commit.
|
||||
- After all tasks: final integration code review, build, and `bash docker/deploy.sh` smoke (`curl localhost:9000/health/ready`).
|
||||
@@ -0,0 +1,35 @@
|
||||
{
|
||||
"planPath": "docs/plans/2026-06-15-stillpending-m2-implementation.md",
|
||||
"tasks": [
|
||||
{"id": 32, "ref": "M2.0", "subject": "M2.0 #32: EF model/snapshot drift (PendingModelChangesWarning)", "class": "high-risk", "status": "completed", "commits": ["2fb608f"]},
|
||||
{"id": 33, "ref": "M2.1", "subject": "M2.1 #22: native-alarm capability validation wired into deploy pipeline", "class": "standard", "status": "completed", "commits": ["d690920", "41d828e"]},
|
||||
{"id": 34, "ref": "M2.2", "subject": "M2.2 #10: connection-level diff surfaced in deployment diff", "class": "standard", "status": "completed", "commits": ["e9a84ba", "198770f"]},
|
||||
{"id": 35, "ref": "M2.3", "subject": "M2.3 #7: Database.CachedWrite transient/permanent SQL classification", "class": "high-risk", "status": "completed", "commits": ["d052706", "de375ff"]},
|
||||
{"id": 36, "ref": "M2.4", "subject": "M2.4 #8: alarm conditionFilter applied (OPC UA WhereClause + client routing)", "class": "high-risk", "status": "completed", "commits": ["8825df5", "00304a2"]},
|
||||
{"id": 37, "ref": "M2.5", "subject": "M2.5 #9: per-script execution timeout (entity+migration+flatten+actor)", "class": "standard", "status": "completed", "blockedBy": [32], "commits": ["3edef09", "3032faa"]},
|
||||
{"id": 38, "ref": "M2.6", "subject": "M2.6 #13: nested Object/List extended-type validation", "class": "standard", "status": "completed", "commits": ["4b6187c", "411d0c0"]},
|
||||
{"id": 39, "ref": "M2.7", "subject": "M2.7 #20+#21: return-type + argument-type compatibility checks", "class": "standard", "status": "completed", "commits": ["958229e", "a8e9e99"]},
|
||||
{"id": 40, "ref": "M2.8", "subject": "M2.8 #23: binding-completeness Error + name-exists-at-site", "class": "standard", "status": "completed", "commits": ["7c14a69", "21b801b"]},
|
||||
{"id": 41, "ref": "M2.9", "subject": "M2.9 #17: MachineDataDb fail-fast (reverts Host-008)", "class": "small", "status": "completed", "commits": ["76198b3"]},
|
||||
{"id": 42, "ref": "M2.10", "subject": "M2.10 #18: CI grep-guard against UPDATE/DELETE on AuditLog", "class": "small", "status": "completed", "commits": ["e7b6fe3", "9cd62aa"]},
|
||||
{"id": 43, "ref": "M2.11", "subject": "M2.11 #24: debug snapshot unknown-instance returns error", "class": "small", "status": "completed", "commits": ["dbf44b9", "d160c7f"]},
|
||||
{"id": 44, "ref": "M2.12", "subject": "M2.12 #25: recursion-limit error to site event log", "class": "small", "status": "completed", "commits": ["f08038d", "e2b31a9"]},
|
||||
{"id": 45, "ref": "M2.13", "subject": "M2.13 #27: populate obtainable OPC UA/MxGateway transition fields", "class": "small", "status": "completed", "commits": ["722b866", "3945789"]},
|
||||
{"id": 46, "ref": "M2.14", "subject": "M2.14 #28: readiness gate checks required cluster singletons", "class": "standard", "status": "completed", "commits": ["253bec5", "6b1cb9e"]},
|
||||
{"id": 47, "ref": "M2.15", "subject": "M2.15 #29: register site active-node purge gate (DI)", "class": "small", "status": "completed", "commits": ["e1ee37e"]},
|
||||
{"id": 48, "ref": "M2.16", "subject": "M2.16 #30: Health Monitoring consumes FailedWriteCount", "class": "small", "status": "completed", "commits": ["d81f747", "c9244d8"]},
|
||||
{"id": 49, "ref": "M2.17", "subject": "M2.17 #31: reconcile StateTransitionValidator delete-from-NotDeployed", "class": "small", "status": "completed", "commits": ["c104356"]},
|
||||
{"id": 50, "ref": "M2.18", "subject": "M2.18 #26: debug-stream stream-first ordering + replay/dedup", "class": "high-risk", "status": "completed", "commits": ["d8519cb", "a0d9379"]},
|
||||
{"id": 51, "ref": "M2.19", "subject": "M2.19 #15: LDAP periodic re-query for interactive sessions (spike+impl)", "class": "high-risk", "status": "completed", "note": "Spike outcome: shared ILdapAuthService exposes only AuthenticateAsync (no passwordless group-search) -> live LDAP group re-query out of scope (external pkg, tracked follow-up). Implemented always-achievable layers: stored zb:group + zb:lastrolerefresh claims at login, shared SessionClaimBuilder (DRY login+refresh), CookieSessionValidator + OnValidatePrincipal (idle-timeout reject@30m, DB-only role-mapping refresh@15m, fail-soft keep-session on refresh error). Residual limitation documented in Component-Security.md.", "commits": ["8fe7f46", "fddc695"]}
|
||||
],
|
||||
"deferred": [
|
||||
{"ref": "#16", "subject": "Transport stale-instance enumeration", "to": "M8 (Transport)"},
|
||||
{"ref": "#19", "subject": "script started/completed events", "status": "done in M1.8"}
|
||||
],
|
||||
"followups": [
|
||||
{"id": 52, "subject": "Investigate 2 partition-purge E2E test failures (AuditLogPurgeActor/PartitionPurge)", "from": "M2.0", "status": "pending"},
|
||||
{"id": 53, "subject": "Dedup alarm-capable protocol predicate (3 copies → AlarmCapableProtocols)", "from": "M2.1", "status": "pending"},
|
||||
{"id": 54, "subject": "Expose ExecutionTimeoutSeconds (+ MinTimeBetweenRuns) in CLI + UI script authoring", "from": "M2.5", "status": "pending"}
|
||||
],
|
||||
"lastUpdated": "2026-06-15"
|
||||
}
|
||||
@@ -84,7 +84,14 @@ All mutating operations on a single instance (deploy, disable, enable, delete) s
|
||||
|---------------|--------|---------|--------|--------|
|
||||
| Enabled | Yes | Yes | No (already enabled) | Yes |
|
||||
| Disabled | Yes (enables on apply) | No (already disabled) | Yes | Yes |
|
||||
| Not deployed | Yes (initial deploy) | No | No | No |
|
||||
| Not deployed | Yes (initial deploy) | No | No | Yes (removes the orphan record) |
|
||||
|
||||
> **Delete from Not deployed:** permitted so an instance that was previously
|
||||
> undeployed (state `NotDeployed`) can have its record fully removed —
|
||||
> deployment history, snapshot, attribute/alarm overrides, and connection
|
||||
> bindings — rather than lingering as an unremovable orphan. There is no live
|
||||
> site configuration to tear down in this state, so the delete is a
|
||||
> central-side record cleanup (no site round-trip required).
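
As an illustration, the matrix above can be written as a single lookup, which makes the delete-from-Not-deployed rule easy to assert in a test. This is a minimal sketch, assuming the column order Deploy, Disable, Enable, Delete (inferred from the "already enabled"/"already disabled" cells); `InstanceState`, `InstanceOperation`, and `TransitionMatrix` are hypothetical names, not the actual `StateTransitionValidator` API.

```csharp
public enum InstanceState { Enabled, Disabled, NotDeployed }

public enum InstanceOperation { Deploy, Disable, Enable, Delete }

public static class TransitionMatrix
{
    // One arm per "Yes" cell of the matrix; everything else is "No".
    public static bool IsAllowed(InstanceState state, InstanceOperation op) => (state, op) switch
    {
        (_, InstanceOperation.Deploy) => true,   // deploy is allowed from every state
        (_, InstanceOperation.Delete) => true,   // includes NotDeployed (orphan record cleanup)
        (InstanceState.Enabled, InstanceOperation.Disable) => true,
        (InstanceState.Disabled, InstanceOperation.Enable) => true,
        _ => false,
    };
}
```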
|
||||
|
||||
## System-Wide Artifact Deployment Failure Handling
|
||||
|
||||
|
||||
@@ -95,6 +95,8 @@ On central nodes, the ASP.NET Core web endpoints (Central UI, Inbound API) must
|
||||
- Database connectivity (MS SQL) is verified.
|
||||
- Required cluster singletons are running (if applicable).
|
||||
|
||||
These are implemented as three `Ready`-tagged health checks registered in the Central-role branch of `Program.cs` (so they are naturally role-scoped — site nodes do not run them): `database` (`DatabaseHealthCheck<ScadaBridgeDbContext>`), `akka-cluster` (`AkkaClusterHealthCheck`), and `required-singletons` (`RequiredSingletonsHealthCheck`). The last verifies each *required-always* central singleton is reachable by Asking its local `ClusterSingletonProxy` an `Identify` with a short bounded timeout (~2s, probes run concurrently) and treating a non-null `ActorIdentity.Subject` as reachable; any unreachable required singleton degrades the check to **Unhealthy**, naming it. The required-always set is the five unconditional central singletons: notification-outbox, audit-log-ingest, site-call-audit, audit-log-purge, and site-audit-reconciliation. Feature-gated singletons are the "if applicable" case and are not probed when their feature is off. The check is leadership-agnostic — the proxy reaches the singleton from either central node, so a ready standby still reports ready (readiness must NOT require cluster leadership; that is the `Active` tier's job). During a brief singleton handover the probe may momentarily time out and the node may flap to not-ready, which is correct: a node mid-handover is legitimately not fully ready (no retries are used, to keep readiness polling fast).
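
The probing pattern described above (concurrent probes, one bounded timeout each, no retries, unreachable singletons reported by name) can be sketched without any Akka dependency. This is an illustrative sketch only: `SingletonProbe` and the `Func<CancellationToken, Task<bool>>` probe shape are assumptions made for the example, standing in for the real `Identify` Ask against each `ClusterSingletonProxy`.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class SingletonProbe
{
    // Runs every probe concurrently and returns the names that did not
    // report reachable within the timeout (sorted for stable output).
    public static async Task<IReadOnlyList<string>> FindUnreachableAsync(
        IReadOnlyDictionary<string, Func<CancellationToken, Task<bool>>> probes,
        TimeSpan timeout)
    {
        var results = await Task.WhenAll(probes.Select(async p =>
            (Name: p.Key, Ok: await IsReachableAsync(p.Value, timeout))));

        return results
            .Where(r => !r.Ok)
            .Select(r => r.Name)
            .OrderBy(n => n, StringComparer.Ordinal)
            .ToList();
    }

    private static async Task<bool> IsReachableAsync(
        Func<CancellationToken, Task<bool>> probe, TimeSpan timeout)
    {
        using var cts = new CancellationTokenSource(timeout);
        try
        {
            var probeTask = probe(cts.Token);
            var winner = await Task.WhenAny(probeTask, Task.Delay(timeout));

            // Timed out, or the probe completed and reported "not reachable".
            return winner == probeTask && await probeTask;
        }
        catch (Exception)
        {
            // A faulted or cancelled probe counts as unreachable; no retry.
            return false;
        }
    }
}
```

A health check built on this would map an empty result to healthy and a non-empty result to unhealthy, listing the returned names in the description.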
|
||||
|
||||
A standard ASP.NET Core health check endpoint (`/health/ready`) reports readiness status. The load balancer uses this endpoint to determine when to route traffic to the node. During startup or failover, the node returns `503 Service Unavailable` until ready.
|
||||
|
||||
### REQ-HOST-5: Windows Service Hosting
|
||||
|
||||
@@ -40,9 +40,10 @@ Each API method definition includes:
|
||||
- **Approved API Keys**: List of API keys authorized to invoke this method. Requests from non-approved keys are rejected.
|
||||
- **Parameter Definitions**: Ordered list of input parameters, each with:
|
||||
- Parameter name.
|
||||
- Data type (Boolean, Integer, Float, String — same fixed set as template attributes).
|
||||
- Data type — the **extended type system** (Boolean, Integer, Float, String, plus the nestable Object and List; see [Extended Type System](#extended-type-system)).
|
||||
- Whether the parameter is required.
|
||||
- **Return Value Definition**: Structure of the response, with:
|
||||
- Field names and data types. Supports returning **lists of objects**.
|
||||
- Field names and (extended-system) data types. Supports returning **lists of objects** and arbitrarily nested structures.
|
||||
- **Implementation Script**: C# script that executes when the method is called. Stored **inline** in the method definition. Follows standard C# authoring patterns but has no template inheritance — it is a standalone script tied to this method.
|
||||
- **Timeout**: Configurable per method. Defines the maximum time the method is allowed to execute (including any routed calls to sites) before returning a timeout error to the caller.
|
||||
|
||||
@@ -99,6 +100,17 @@ Each API method definition includes:
|
||||
- This allows complex request/response structures (e.g., an object containing properties and a list of nested objects).
|
||||
- Template attributes retain the simpler four-type system. The extended types apply only to Inbound API method definitions and External System Gateway method definitions.
|
||||
|
||||
#### Type Definition Format & Nested Validation
|
||||
|
||||
- Parameter and return type definitions are persisted as **JSON Schema** (the canonical format produced by the Central UI schema builder; see the `MigrateParametersToJsonSchema` migration). An object declares its fields via `properties` (+ a `required` array); a list declares its element type via `items`. The legacy flat-array form (`[{name,type,required,itemType?}]`) is still accepted on read for transition safety.
|
||||
- Validation is **recursive and type-aware** for the extended types (request parameters and script return values alike, via a single shared engine so the two cannot drift):
|
||||
- **Object**: each declared field's value is validated against its declared (possibly nested) type; a missing required field and a present-but-wrong type are both reported.
|
||||
- **List**: every element is validated against the declared element type (recursing into nested objects/lists). A list whose element type is left undeclared (`array` without `items`) is shape-checked only.
|
||||
- **Scalars at any depth** are checked against the extended type.
|
||||
- Errors are **path-qualified** (e.g. `order.items[2].quantity`) so the caller can locate the offending field.
|
||||
- **Undeclared fields are rejected** at every level (consistent with the top-level "unexpected parameter" rejection): an object that declares its fields rejects any field not in its `properties`, so a typo'd field name surfaces as a `400`/error rather than being silently ignored. A bare object schema with no declared fields (`{"type":"object"}`) stays shape-only and accepts any fields.
|
||||
- A JSON `null` value satisfies any declared type (a present-but-null field is allowed); only the **absence** of a required field is an error.
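
The rules above can be illustrated with a small recursive walker. This is a simplified sketch, not the shared `InboundApiSchema` engine: `SchemaNode` and `NestedValidator` are hypothetical types, the type tokens (`boolean`, `integer`, `number`, `string`, `object`, `array`) are assumed to be the JSON Schema spellings of the six extended types, and the error wording is made up for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

// Hypothetical, simplified schema node (assumed for illustration).
public sealed record SchemaNode(
    string Type,
    IReadOnlyDictionary<string, SchemaNode>? Properties = null,
    IReadOnlyCollection<string>? Required = null,
    SchemaNode? Items = null);

public static class NestedValidator
{
    public static List<string> Validate(JsonElement value, SchemaNode schema, string path = "")
    {
        var errors = new List<string>();
        Walk(value, schema, path, errors);
        return errors;
    }

    private static void Walk(JsonElement value, SchemaNode schema, string path, List<string> errors)
    {
        // A present-but-null value satisfies any declared type.
        if (value.ValueKind == JsonValueKind.Null) return;

        switch (schema.Type)
        {
            case "object":
                if (value.ValueKind != JsonValueKind.Object) { errors.Add($"{path}: expected object"); return; }
                if (schema.Properties is null) return; // bare object: shape-only
                foreach (var name in schema.Required ?? Array.Empty<string>())
                    if (!value.TryGetProperty(name, out _))
                        errors.Add($"{Join(path, name)}: required field missing");
                foreach (var prop in value.EnumerateObject())
                {
                    if (schema.Properties.TryGetValue(prop.Name, out var child))
                        Walk(prop.Value, child, Join(path, prop.Name), errors);
                    else
                        errors.Add($"{Join(path, prop.Name)}: undeclared field");
                }
                break;

            case "array":
                if (value.ValueKind != JsonValueKind.Array) { errors.Add($"{path}: expected array"); return; }
                if (schema.Items is null) return; // array without items: shape-only
                var index = 0;
                foreach (var item in value.EnumerateArray())
                    Walk(item, schema.Items, $"{path}[{index++}]", errors);
                break;

            default:
                if (!ScalarMatches(value, schema.Type))
                    errors.Add($"{path}: expected {schema.Type}");
                break;
        }
    }

    private static string Join(string path, string name) =>
        path.Length == 0 ? name : $"{path}.{name}";

    private static bool ScalarMatches(JsonElement v, string type) => type switch
    {
        "boolean" => v.ValueKind is JsonValueKind.True or JsonValueKind.False,
        "integer" => v.ValueKind == JsonValueKind.Number && v.TryGetInt64(out _),
        "number" => v.ValueKind == JsonValueKind.Number,
        "string" => v.ValueKind == JsonValueKind.String,
        _ => false,
    };
}
```

Because the same walker serves request parameters and return values, a single set of tests over it covers both directions.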
|
||||
|
||||
## Script Compilation & Hot-Reload
|
||||
|
||||
API method scripts are compiled at central startup — all method definitions are loaded from the configuration database and compiled into in-memory delegates.
|
||||
|
||||
@@ -32,9 +32,31 @@ Central cluster. Sites do not have user-facing interfaces and do not perform ind
|
||||
- **JWT claims**: User display name, username, list of roles (Admin, Design, Deployment), and for site-scoped Deployment, the list of permitted site IDs.
|
||||
|
||||
### Token Lifecycle
|
||||
- **JWT expiry**: 15 minutes. On each request, if the cookie-embedded JWT is near expiry, the app re-queries LDAP for current group memberships and issues a fresh JWT, writing an updated cookie. Roles are never more than 15 minutes stale.
|
||||
- **Idle timeout**: Configurable, default **30 minutes**. If no requests are made within the idle window, the token is not refreshed and the user must re-login. Tracked via a last-activity timestamp in the token.
|
||||
- **Sliding refresh**: Active users stay logged in indefinitely — the token refreshes every 15 minutes as long as requests are made within the 30-minute idle window.
|
||||
|
||||
> **Implementation note (M2.19, #15).** The interactive Central UI login path signs in
|
||||
> with **bare cookie claims**, not a cookie-embedded JWT. The session lifecycle below is
|
||||
> therefore enforced by the cookie middleware (`ExpireTimeSpan` + `SlidingExpiration`) plus
|
||||
> a `CookieAuthenticationEvents.OnValidatePrincipal` handler — see **Session Validation
|
||||
> (`OnValidatePrincipal`)** below. The embedded-JWT model remains the documented design
|
||||
> intent and is the mechanism for any non-cookie bearer surface (e.g. `/auth/token`), but
|
||||
> it is **not** the transport for the cookie principal.
|
||||
|
||||
- **Idle timeout**: Configurable, default **30 minutes**. If no requests are made within the idle window, the session is rejected and the user must log in again. Tracked via a `LastActivity` timestamp claim. The cookie's `ExpireTimeSpan` is set to the idle timeout and `SlidingExpiration` renews it on activity, so the cookie window and the explicit `OnValidatePrincipal` idle check use the **same** value and cannot contradict each other.
|
||||
- **Role-mapping refresh (LDAP-free)**: Configurable, default **15 minutes** (`SecurityOptions.RoleRefreshThresholdMinutes`). At login the session stores the user's raw LDAP groups (one `zb:group` claim each) plus a `zb:lastrolerefresh` anchor. Once the anchor is older than the threshold, `OnValidatePrincipal` re-runs the **DB-backed** `RoleMapper` on the stored groups — **with no LDAP call** — rebuilds the role/scope claims via the shared claim-builder, advances the anchor, and re-issues the cookie. Central role-mapping (DB) changes — including a **revoked** mapping that drops the user's roles, and changed site-scope rules — take effect within this window. Roles derived from central mappings are never more than ~15 minutes stale.
|
||||
|
||||
#### Session Validation (`OnValidatePrincipal`)
|
||||
- The cookie principal is built at login by a **single shared claim-builder** (`SessionClaimBuilder`). The `OnValidatePrincipal` role-refresh path rebuilds the principal through the **same** builder, so the login and refresh claim shapes cannot drift.
|
||||
- **Failure policy**: the refresh is best-effort. Any error during the refresh (e.g. the configuration database is unreachable) **keeps the existing principal with its current roles** — it never signs the user out and never throws out of the request pipeline. This mirrors the **Active sessions** stance under *LDAP Connection Failure* below. Only the explicit idle-timeout path rejects the principal.
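
The failure policy can be illustrated as a wrapper around the mapping call. This is a minimal sketch under stated assumptions: `FailSoftRoleRefresh` and the delegate shape are hypothetical, standing in for the DB-backed `RoleMapper` call made from `OnValidatePrincipal`.

```csharp
using System;
using System.Collections.Generic;

public static class FailSoftRoleRefresh
{
    // Re-maps the stored groups to roles. If the mapper throws (for example
    // because the configuration database is unreachable), the current roles
    // are returned unchanged and nothing propagates to the request pipeline.
    public static IReadOnlyList<string> RefreshOrKeep(
        IReadOnlyList<string> currentRoles,
        IReadOnlyList<string> storedGroups,
        Func<IReadOnlyList<string>, IReadOnlyList<string>> mapGroupsToRoles)
    {
        try
        {
            return mapGroupsToRoles(storedGroups);
        }
        catch (Exception)
        {
            return currentRoles;
        }
    }
}
```

Note that a successful mapping that returns an empty list is not a failure: a revoked mapping legitimately drops the user's roles, and only an exception keeps the old ones.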
|
||||
|
||||
> **Residual limitation — live LDAP group-membership changes (follow-up).** The
|
||||
> mid-session refresh re-maps the **stored** groups against the central database; it does
|
||||
> **not** re-query LDAP, so a change to the user's actual **group membership** in the
|
||||
> directory is picked up only at **next login**. A live group re-query for an active
|
||||
> session would require a new passwordless service-account group-search method on the
|
||||
> shared `ZB.MOM.WW.Auth.Ldap` library, which is an **external NuGet package** and exposes
|
||||
> only `AuthenticateAsync(username, password, ct)` (no standalone group search). Adding
|
||||
> that method is tracked as a follow-up. Until then: central role-mapping/scope changes are
|
||||
> reflected within ~15 minutes; directory group-membership changes require re-login.
|
||||
|
||||
### Load Balancer Compatibility
|
||||
- The authentication cookie carries a self-contained JWT — no server-side session state. A load balancer in front of the central cluster can route requests to either node without sticky sessions or a shared session store.
|
||||
@@ -43,8 +65,8 @@ Central cluster. Sites do not have user-facing interfaces and do not perform ind
|
||||
## LDAP Connection Failure
|
||||
|
||||
- **New logins**: If the LDAP/AD server is unreachable, login attempts **fail**. Users cannot be authenticated without LDAP.
|
||||
- **Active sessions**: Users with valid (not-yet-expired) JWTs can **continue operating** with their current roles. The token refresh is skipped until LDAP is available again. This avoids disrupting engineers mid-work during a brief LDAP outage.
|
||||
- **Recovery**: When LDAP becomes reachable again, the next token refresh cycle re-queries group memberships and issues a fresh token with current roles.
|
||||
- **Active sessions**: Users with a valid (not-idle-timed-out) session can **continue operating** with their current roles during an LDAP outage. Interactive cookie sessions never re-query LDAP mid-session (the mid-session role-mapping refresh is DB-only — see *Session Validation* above), so a brief LDAP outage does not disrupt engineers mid-work; central role-mapping changes still apply within the refresh window regardless of LDAP availability.
|
||||
- **Recovery (group-membership changes)**: Because the mid-session refresh is LDAP-free, a change to a user's **directory group membership** is picked up at the user's **next login** (when LDAP is queried again), not mid-session — see the *Residual limitation* note above.
|
||||
|
||||
## Roles
|
||||
|
||||
|
||||
@@ -1,4 +1,3 @@
|
||||
using System.Security.Claims;
|
||||
using Microsoft.AspNetCore.Authentication;
|
||||
using Microsoft.AspNetCore.Authentication.Cookies;
|
||||
using Microsoft.AspNetCore.Builder;
|
||||
@@ -35,7 +34,6 @@ public static class AuthEndpoints
|
||||
}
|
||||
|
||||
var ldapAuth = context.RequestServices.GetRequiredService<ILdapAuthService>();
|
||||
var jwtService = context.RequestServices.GetRequiredService<JwtTokenService>();
|
||||
var roleMapper = context.RequestServices.GetRequiredService<IGroupRoleMapper<string>>();
|
||||
|
||||
var authResult = await ldapAuth.AuthenticateAsync(username, password, context.RequestAborted);
|
||||
@@ -72,39 +70,23 @@ public static class AuthEndpoints
|
||||
// the documented sliding-refresh policy.
|
||||
var displayName = string.IsNullOrEmpty(authResult.DisplayName) ? username : authResult.DisplayName;
|
||||
var resolvedUsername = string.IsNullOrEmpty(authResult.Username) ? username : authResult.Username;
|
||||
var claims = new List<Claim>
|
||||
{
|
||||
new(ClaimTypes.Name, resolvedUsername),
|
||||
new(JwtTokenService.DisplayNameClaimType, displayName),
|
||||
new(JwtTokenService.UsernameClaimType, resolvedUsername),
|
||||
};
|
||||
|
||||
foreach (var role in roleMapping.Roles)
|
||||
{
|
||||
claims.Add(new Claim(JwtTokenService.RoleClaimType, role));
|
||||
}
|
||||
|
||||
if (!scope.IsSystemWideDeployment)
|
||||
{
|
||||
foreach (var siteId in scope.PermittedSiteIds)
|
||||
{
|
||||
claims.Add(new Claim(JwtTokenService.SiteIdClaimType, siteId));
|
||||
}
|
||||
}
|
||||
|
||||
// Task 1.5: name the role/name claim types explicitly so the cookie
|
||||
// principal's IsInRole / [Authorize(Roles=…)] resolve against the same
|
||||
// canonical types we mint (JwtTokenService.RoleClaimType = ZbClaimTypes.Role,
|
||||
// ClaimTypes.Name = ZbClaimTypes.Name). The policies use
|
||||
// RequireClaim(RoleClaimType, …) which checks type+value directly, but
|
||||
// pinning roleType keeps IsInRole-style checks consistent and survives the
|
||||
// cookie serialize/round-trip.
|
||||
var identity = new ClaimsIdentity(
|
||||
claims,
|
||||
authenticationType: CookieAuthenticationDefaults.AuthenticationScheme,
|
||||
nameType: ClaimTypes.Name,
|
||||
roleType: JwtTokenService.RoleClaimType);
|
||||
var principal = new ClaimsPrincipal(identity);
|
||||
// M2.19 (#15): build the cookie principal through the shared
|
||||
// SessionClaimBuilder — the SINGLE source of truth that the mid-session
|
||||
// OnValidatePrincipal role-refresh path ALSO uses, so login and refresh can
|
||||
// never drift. It stamps the canonical identity/role/scope claims (with
|
||||
// roleType/nameType pinned for IsInRole), PLUS the M2.19 additions: one
|
||||
// zb:group claim per raw LDAP group (the durable input the mid-session
|
||||
// RoleMapper re-run consumes) and a zb:lastrolerefresh anchor (login time,
|
||||
// UTC) that also seeds the LastActivity idle anchor. The refresh timestamp is
|
||||
// the login instant, so the first role refresh is due RoleRefreshThresholdMinutes
|
||||
// later — not immediately.
|
||||
var principal = SessionClaimBuilder.Build(
|
||||
resolvedUsername,
|
||||
displayName,
|
||||
authResult.Groups,
|
||||
scope,
|
||||
DateTimeOffset.UtcNow);
|
||||
|
||||
await context.SignInAsync(
|
||||
CookieAuthenticationDefaults.AuthenticationScheme,
|
||||
|
||||
@@ -445,6 +445,17 @@
|
||||
});
|
||||
});
|
||||
|
||||
// M2.11: the site returns InstanceNotFound=true when the instance is
|
||||
// not deployed there (e.g. deployment not yet pushed, or wrong site).
|
||||
if (session.InitialSnapshot.InstanceNotFound)
|
||||
{
|
||||
DebugStreamService.StopStream(session.SessionId);
|
||||
_toast.ShowError(
|
||||
"Instance not found on the selected site — check the deployment target.");
|
||||
_connecting = false;
|
||||
return;
|
||||
}
|
||||
|
||||
_session = session;
|
||||
|
||||
// Populate initial state from snapshot
|
||||
|
||||
@@ -864,12 +864,144 @@
|
||||
? "The deployed revision hash differs from the current template-derived hash. Redeploy to apply changes."
|
||||
: "No differences between deployed and current configuration.");
|
||||
builder.CloseElement();
|
||||
|
||||
// DeploymentManager-018: render the structured diff sections so
|
||||
// the operator sees WHAT changed, not just that the hash moved.
|
||||
// Each section uses the same compact change-table idiom; the
|
||||
// connection section surfaces standalone endpoint/protocol/
|
||||
// failover drift that no per-attribute row would show (#10).
|
||||
var d = diffResult.Diff;
|
||||
if (d != null)
|
||||
{
|
||||
RenderChangeSection(builder, 100_000, "Attributes", d.AttributeChanges,
|
||||
a => a.Value ?? "—");
|
||||
|
||||
RenderChangeSection(builder, 200_000, "Alarms", d.AlarmChanges,
|
||||
a => $"P{a.PriorityLevel} · {a.TriggerType}");
|
||||
|
||||
RenderChangeSection(builder, 300_000, "Scripts", d.ScriptChanges,
|
||||
s => s.TriggerType ?? "—");
|
||||
|
||||
RenderChangeSection(builder, 400_000, "Connections", d.ConnectionChanges,
|
||||
c => FormatConnection(c));
|
||||
}
|
||||
}
|
||||
};
|
||||
|
||||
await _diffDialog.ShowAsync($"Deployment Diff — {inst.UniqueName}", body);
|
||||
}
|
||||
|
||||
// Compact summary of a connection's deployment-relevant fields for the diff
|
||||
// table's Before/After cells. Surfaces all four fields ConnectionsEqual
|
||||
// compares — protocol, primary endpoint config, failover retry count, and
|
||||
// the backup endpoint — so a backup-only change doesn't show identical
|
||||
// Before/After cells. The backup segment is omitted when there is no backup.
|
||||
private static string FormatConnection(
|
||||
ZB.MOM.WW.ScadaBridge.Commons.Types.Flattening.ConnectionConfig c)
|
||||
{
|
||||
var endpoint = string.IsNullOrWhiteSpace(c.ConfigurationJson) ? "—" : c.ConfigurationJson;
|
||||
var summary = $"{c.Protocol} · {endpoint} · failover ×{c.FailoverRetryCount}";
|
||||
if (!string.IsNullOrWhiteSpace(c.BackupConfigurationJson))
|
||||
{
|
||||
summary += $" · backup {c.BackupConfigurationJson}";
|
||||
}
|
||||
return summary;
|
||||
}
|
||||
|
||||
// Renders one change section (a heading plus a Bootstrap change-table) for a
|
||||
// set of diff entries, matching the deployment-diff idiom used elsewhere in
|
||||
// the UI: table-sm/table-striped, a colored change badge, and Before/After
|
||||
// text columns. Nothing is rendered when the section has no entries, so the
|
||||
// four sections (attributes, alarms, scripts, connections) all read the same
|
||||
// and only appear when they actually changed. seqBase values are spaced
|
||||
// 100k apart so each section's per-row sequence numbers (13 per row) stay in
|
||||
// a disjoint, ascending range no matter how many entries a section has.
|
||||
private static void RenderChangeSection<T>(
|
||||
Microsoft.AspNetCore.Components.Rendering.RenderTreeBuilder builder,
|
||||
int seqBase,
|
||||
string heading,
|
||||
IReadOnlyList<ZB.MOM.WW.ScadaBridge.Commons.Types.Flattening.DiffEntry<T>> entries,
|
||||
Func<T, string> summarize)
|
||||
{
|
||||
if (entries.Count == 0)
|
||||
return;
|
||||
|
||||
builder.OpenElement(seqBase, "div");
|
||||
builder.AddAttribute(seqBase + 1, "class", "mt-3");
|
||||
|
||||
builder.OpenElement(seqBase + 2, "div");
|
||||
builder.AddAttribute(seqBase + 3, "class", "fw-semibold small mb-1");
|
||||
builder.AddContent(seqBase + 4, $"{heading} ({entries.Count})");
|
||||
builder.CloseElement();
|
||||
|
||||
builder.OpenElement(seqBase + 5, "table");
|
||||
builder.AddAttribute(seqBase + 6, "class", "table table-sm table-striped align-middle mb-0");
|
||||
|
||||
// Header row.
|
||||
builder.OpenElement(seqBase + 7, "thead");
|
||||
builder.OpenElement(seqBase + 8, "tr");
|
||||
AppendHeaderCell(builder, seqBase + 9, "Name");
|
||||
AppendHeaderCell(builder, seqBase + 12, "Change");
|
||||
AppendHeaderCell(builder, seqBase + 15, "Before");
|
||||
AppendHeaderCell(builder, seqBase + 18, "After");
|
||||
builder.CloseElement(); // tr
|
||||
builder.CloseElement(); // thead
|
||||
|
||||
builder.OpenElement(seqBase + 21, "tbody");
|
||||
var rowSeq = seqBase + 22;
|
||||
foreach (var entry in entries)
|
||||
{
|
||||
builder.OpenElement(rowSeq, "tr");
|
||||
|
||||
builder.OpenElement(rowSeq + 1, "td");
|
||||
builder.AddContent(rowSeq + 2, entry.CanonicalName);
|
||||
builder.CloseElement();
|
||||
|
||||
builder.OpenElement(rowSeq + 3, "td");
|
||||
builder.OpenElement(rowSeq + 4, "span");
|
||||
builder.AddAttribute(rowSeq + 5, "class", ChangeBadgeClass(entry.ChangeType));
|
||||
builder.AddContent(rowSeq + 6, entry.ChangeType.ToString());
|
||||
builder.CloseElement();
|
||||
builder.CloseElement();
|
||||
|
||||
builder.OpenElement(rowSeq + 7, "td");
|
||||
builder.AddAttribute(rowSeq + 8, "class", "small text-muted");
|
||||
builder.AddContent(rowSeq + 9,
|
||||
entry.OldValue is null ? "—" : summarize(entry.OldValue));
|
||||
builder.CloseElement();
|
||||
|
||||
builder.OpenElement(rowSeq + 10, "td");
|
||||
builder.AddAttribute(rowSeq + 11, "class", "small");
|
||||
builder.AddContent(rowSeq + 12,
|
||||
entry.NewValue is null ? "—" : summarize(entry.NewValue));
|
||||
builder.CloseElement();
|
||||
|
||||
builder.CloseElement(); // tr
|
||||
rowSeq += 13;
|
||||
}
|
||||
builder.CloseElement(); // tbody
|
||||
|
||||
builder.CloseElement(); // table
|
||||
builder.CloseElement(); // div.mt-3
|
||||
}
|
||||
|
||||
private static void AppendHeaderCell(
|
||||
Microsoft.AspNetCore.Components.Rendering.RenderTreeBuilder builder, int seq, string text)
|
||||
{
|
||||
builder.OpenElement(seq, "th");
|
||||
builder.AddAttribute(seq + 1, "scope", "col");
|
||||
builder.AddContent(seq + 2, text);
|
||||
builder.CloseElement();
|
||||
}
|
||||
|
||||
private static string ChangeBadgeClass(
|
||||
ZB.MOM.WW.ScadaBridge.Commons.Types.Flattening.DiffChangeType changeType) => changeType switch
|
||||
{
|
||||
ZB.MOM.WW.ScadaBridge.Commons.Types.Flattening.DiffChangeType.Added => "badge bg-success",
|
||||
ZB.MOM.WW.ScadaBridge.Commons.Types.Flattening.DiffChangeType.Removed => "badge bg-danger",
|
||||
_ => "badge bg-warning text-dark",
|
||||
};
|
||||
|
||||
// ---- Dropdown option helpers ----
|
||||
private IEnumerable<(int Id, string Label)> EnumerateSiteOptions()
|
||||
{
|
||||
|
||||
@@ -117,6 +117,9 @@
|
||||
private string? _scriptParameters;
|
||||
private string? _scriptReturn;
|
||||
private bool _scriptIsLocked;
|
||||
// Round-tripped from the loaded script so UI edits preserve a timeout set
|
||||
// via Transport import (no authoring control in the UI — scoped out).
|
||||
private int? _scriptExecutionTimeoutSeconds;
|
||||
private string? _scriptFormError;
|
||||
private string _scriptModalTab = "trigger"; // "trigger" | "code" | "parameters" | "return"
|
||||
private MonacoEditor? _scriptEditor;
|
||||
@@ -1797,6 +1800,7 @@
|
||||
_scriptParameters = null;
|
||||
_scriptReturn = null;
|
||||
_scriptIsLocked = false;
|
||||
_scriptExecutionTimeoutSeconds = null;
|
||||
_scriptModalTab = "trigger";
|
||||
ResetScriptTestRun();
|
||||
}
|
||||
@@ -1814,6 +1818,9 @@
|
||||
_scriptParameters = script.ParameterDefinitions;
|
||||
_scriptReturn = script.ReturnDefinition;
|
||||
_scriptIsLocked = script.IsLocked;
|
||||
// Preserve any timeout set via Transport import — the UI has no authoring
|
||||
// control for this field, so we round-trip the loaded value unchanged.
|
||||
_scriptExecutionTimeoutSeconds = script.ExecutionTimeoutSeconds;
|
||||
_scriptModalTab = "trigger";
|
||||
ResetScriptTestRun();
|
||||
}
|
||||
@@ -1907,6 +1914,9 @@
|
||||
ReturnDefinition = _scriptReturn,
|
||||
IsLocked = _scriptIsLocked,
|
||||
MinTimeBetweenRuns = DurationInput.Compose(_scriptMinTimeValue, _scriptMinTimeUnit),
|
||||
// Round-trip the loaded value — no UI control, so preserve
|
||||
// any timeout set via Transport import unchanged.
|
||||
ExecutionTimeoutSeconds = _scriptExecutionTimeoutSeconds,
|
||||
IsInherited = existing.IsInherited,
|
||||
LockedInDerived = existing.LockedInDerived,
|
||||
};
|
||||
|
||||
@@ -52,6 +52,15 @@ public class TemplateScript
|
||||
/// </summary>
|
||||
public TimeSpan? MinTimeBetweenRuns { get; set; }
|
||||
|
||||
/// <summary>
|
||||
/// Per-script execution timeout in seconds, or null to use the site's global
|
||||
/// default (<c>SiteRuntimeOptions.ScriptExecutionTimeoutSeconds</c>). A
|
||||
/// non-positive value (≤ 0) is treated the same as null — i.e. fall back to
|
||||
/// the global default — by the Site Runtime. Seconds (not a TimeSpan) to keep
|
||||
/// the unit consistent with the global option it overrides.
|
||||
/// </summary>
|
||||
public int? ExecutionTimeoutSeconds { get; set; }
|
||||
|
||||
/// <summary>
|
||||
/// True when this row was copied from the base template and has not been
|
||||
/// overridden on the derived template. Changes to the base flow downward
|
||||
|
||||
@@ -0,0 +1,34 @@
|
||||
namespace ZB.MOM.WW.ScadaBridge.Commons.Interfaces.Protocol;
|
||||
|
||||
/// <summary>
|
||||
/// Single source of truth for which data-connection protocol strings produce an
|
||||
/// adapter that implements <see cref="IAlarmSubscribableConnection"/> (i.e. can
|
||||
/// mirror native alarms).
|
||||
///
|
||||
/// The set MUST stay in sync with the protocols registered against an
|
||||
/// alarm-subscribable adapter in
|
||||
/// <c>DataConnectionLayer/DataConnectionFactory.cs</c>: today the "OpcUa" adapter
|
||||
/// (<c>OpcUaDataConnection</c>) and the "MxGateway" adapter
|
||||
/// (<c>MxGatewayDataConnection</c>) both implement
|
||||
/// <see cref="IAlarmSubscribableConnection"/>. The runtime decision is made in
|
||||
/// <c>DataConnectionActor</c> via <c>_adapter is IAlarmSubscribableConnection</c>;
|
||||
/// this central-side helper lets the deploy pipeline and Central UI gate
|
||||
/// native-alarm-source bindings against the same notion without instantiating an
|
||||
/// adapter. Adding a new alarm-capable protocol = register the adapter in the
|
||||
/// factory AND add its protocol string here.
|
||||
/// </summary>
|
||||
public static class AlarmCapableProtocols
|
||||
{
|
||||
/// <summary>
|
||||
/// Determines whether a data connection's protocol string resolves to an
|
||||
/// alarm-capable adapter (one implementing <see cref="IAlarmSubscribableConnection"/>).
|
||||
/// Case-insensitive to match <c>DataConnectionFactory</c>'s own
|
||||
/// <c>OrdinalIgnoreCase</c> protocol-key lookup; <c>null</c>/blank is not
|
||||
/// alarm-capable.
|
||||
/// </summary>
|
||||
/// <param name="protocol">The data connection protocol string (e.g. "OpcUa").</param>
|
||||
/// <returns><c>true</c> when the protocol's adapter can subscribe native alarms; otherwise <c>false</c>.</returns>
|
||||
public static bool IsAlarmCapable(string? protocol) =>
|
||||
string.Equals(protocol, "OpcUa", StringComparison.OrdinalIgnoreCase)
|
||||
|| string.Equals(protocol, "MxGateway", StringComparison.OrdinalIgnoreCase);
|
||||
}
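
For illustration, a minimal sketch of how a deploy-time or UI gate might call the helper above. The class body is copied from the diff so the snippet compiles on its own. The `Modbus` protocol string and the loop are assumptions made for the example and are not part of the commit.

```csharp
using System;

// Hypothetical inputs: "Modbus" stands in for any protocol whose adapter
// does not implement IAlarmSubscribableConnection.
foreach (var protocol in new string?[] { "OpcUa", "opcua", "MxGateway", "Modbus", "", null })
{
    var capable = AlarmCapableProtocols.IsAlarmCapable(protocol);
    Console.WriteLine($"{protocol ?? "<null>"} -> {capable}");
}

// Copied from the diff above so the sketch is self-contained.
public static class AlarmCapableProtocols
{
    public static bool IsAlarmCapable(string? protocol) =>
        string.Equals(protocol, "OpcUa", StringComparison.OrdinalIgnoreCase)
        || string.Equals(protocol, "MxGateway", StringComparison.OrdinalIgnoreCase);
}
```

Because the comparison is `OrdinalIgnoreCase`, `"opcua"` is treated the same as `"OpcUa"`, while blank and `null` inputs are reported as not alarm-capable.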
|
||||
@@ -56,8 +56,17 @@ public interface IDatabaseGateway
|
||||
/// <param name="parameters">Optional SQL parameters for the statement.</param>
|
||||
/// <param name="originInstanceName">Optional name of the instance that originated the write.</param>
|
||||
/// <param name="cancellationToken">Cancellation token for the buffering operation.</param>
|
||||
- /// <returns>A task that represents the asynchronous operation.</returns>
- Task CachedWriteAsync(
|
||||
/// <returns>
|
||||
/// M2.3 (#7): an <see cref="ExternalCallResult"/> mirroring the External-System
|
||||
/// API path (<c>IExternalSystemClient.CachedCallAsync</c>). The write is
|
||||
/// attempted immediately:
|
||||
/// <list type="bullet">
|
||||
/// <item>immediate success → <c>Success=true, WasBuffered=false</c> (not buffered);</item>
|
||||
/// <item>permanent SQL error (constraint / syntax / permission) → <c>Success=false, WasBuffered=false</c> with an error message, returned synchronously and NOT buffered;</item>
|
||||
/// <item>transient SQL error (connection / timeout / deadlock / throttle) → buffered to store-and-forward, <c>Success=true, WasBuffered=true</c>.</item>
|
||||
/// </list>
|
||||
/// </returns>
|
||||
Task<ExternalCallResult> CachedWriteAsync(
|
||||
string connectionName,
|
||||
string sql,
|
||||
IReadOnlyDictionary<string, object?>? parameters = null,
|
||||
|
||||
@@ -2,8 +2,38 @@ using ZB.MOM.WW.ScadaBridge.Commons.Messages.Streaming;
|
||||
|
||||
namespace ZB.MOM.WW.ScadaBridge.Commons.Messages.DebugView;
|
||||
|
||||
/// <summary>
|
||||
/// Snapshot of an instance's debug state returned in response to a
|
||||
/// <see cref="DebugSnapshotRequest"/> or <see cref="SubscribeDebugViewRequest"/>.
|
||||
/// </summary>
|
||||
/// <remarks>
|
||||
/// <para>
|
||||
/// <b>Additive-only contract (M2.11):</b> <see cref="InstanceNotFound"/> is an
|
||||
/// optional trailing parameter with a default of <see langword="false"/> so every
|
||||
/// existing positional constructor call and every existing serialized wire frame
|
||||
/// remains valid. Callers that receive a snapshot with
|
||||
/// <c>InstanceNotFound = true</c> know the instance was unknown on the site and
|
||||
/// should distinguish that from a deployed-but-empty instance
|
||||
/// (<c>InstanceNotFound = false</c>, empty <see cref="AttributeValues"/> and
|
||||
/// <see cref="AlarmStates"/>).
|
||||
/// </para>
|
||||
/// <para>
|
||||
/// A new dedicated message type (<c>DebugViewInstanceNotFound</c>) was
|
||||
/// considered but rejected: the ClusterClient / ClusterClientReceptionist
|
||||
/// channel is typed on the request side and the bridge actor is already
|
||||
/// pattern-matching on <c>DebugViewSnapshot</c> for the initial-snapshot TCS
|
||||
/// in <c>DebugStreamService</c>. Introducing a second reply type would require
|
||||
/// every consumer to handle an additional <c>Ask</c> result union — more change
|
||||
/// for no additive-safety gain. The defaulted field is strictly additive and
|
||||
/// keeps all call sites untouched.
|
||||
/// </para>
|
||||
/// </remarks>
|
||||
public record DebugViewSnapshot(
|
||||
string InstanceUniqueName,
|
||||
IReadOnlyList<AttributeValueChanged> AttributeValues,
|
||||
IReadOnlyList<AlarmStateChanged> AlarmStates,
|
||||
- DateTimeOffset SnapshotTimestamp);
|
||||
DateTimeOffset SnapshotTimestamp,
|
||||
// M2.11 — additive field: true when the requested instance is not registered
|
||||
// on this site. Defaults to false so all existing call sites and wire
|
||||
// frames are unaffected.
|
||||
bool InstanceNotFound = false);
|
||||
|
||||
@@ -40,7 +40,14 @@ public record SiteHealthReport(
|
||||
// hosted service every 30 s. Defaults to null so existing producers /
|
||||
// tests that don't refresh the snapshot stay valid; the central health
|
||||
// surface treats null as "no data yet" rather than a zeroed queue.
|
||||
- SiteAuditBacklogSnapshot? SiteAuditBacklog = null);
|
||||
SiteAuditBacklogSnapshot? SiteAuditBacklog = null,
|
||||
// Site Event Logging (#12) M2.16 (#30): cumulative count of event-log write
|
||||
// failures (SQLite error, disk full, bounded-queue overflow drop) since the
|
||||
// logger was created. Populated by the site-side SiteEventLogFailureCountReporter
|
||||
// hosted service. Point-in-time (not reset on collect) — mirrors the
|
||||
// SiteAuditBacklog pattern. Defaults to 0 so existing producers / tests that
|
||||
// don't wire the poller stay valid.
|
||||
long SiteEventLogWriteFailures = 0);
|
||||
|
||||
/// <summary>
|
||||
/// Broadcast wrapper used between central nodes to keep per-node
|
||||
|
||||
@@ -12,8 +12,8 @@ public sealed record ConfigurationDiff
|
||||
public string? OldRevisionHash { get; init; }
|
||||
/// <summary>Revision hash of the new configuration being compared.</summary>
|
||||
public string? NewRevisionHash { get; init; }
|
||||
- /// <summary>True when any attribute, alarm, or script changes are present.</summary>
- public bool HasChanges => AttributeChanges.Count > 0 || AlarmChanges.Count > 0 || ScriptChanges.Count > 0;
|
||||
/// <summary>True when any attribute, alarm, script, or connection changes are present.</summary>
|
||||
public bool HasChanges => AttributeChanges.Count > 0 || AlarmChanges.Count > 0 || ScriptChanges.Count > 0 || ConnectionChanges.Count > 0;
|
||||
|
||||
/// <summary>Diff entries for resolved attributes.</summary>
|
||||
public IReadOnlyList<DiffEntry<ResolvedAttribute>> AttributeChanges { get; init; } = [];
|
||||
@@ -21,6 +21,13 @@ public sealed record ConfigurationDiff
|
||||
public IReadOnlyList<DiffEntry<ResolvedAlarm>> AlarmChanges { get; init; } = [];
|
||||
/// <summary>Diff entries for resolved scripts.</summary>
|
||||
public IReadOnlyList<DiffEntry<ResolvedScript>> ScriptChanges { get; init; } = [];
|
||||
|
||||
/// <summary>
|
||||
/// Diff entries for connection configurations, keyed by connection name.
|
||||
/// Surfaces standalone endpoint/protocol/failover drift that does not show
|
||||
/// up as a per-attribute binding change (TemplateEngine-018).
|
||||
/// </summary>
|
||||
public IReadOnlyList<DiffEntry<ConnectionConfig>> ConnectionChanges { get; init; } = [];
|
||||
}
|
||||
|
||||
/// <summary>
|
||||
|
||||
@@ -174,6 +174,14 @@ public sealed record ResolvedScript
|
||||
|
||||
/// <summary>Gets the minimum time between script executions.</summary>
|
||||
public TimeSpan? MinTimeBetweenRuns { get; init; }
|
||||
|
||||
/// <summary>
|
||||
/// Per-script execution timeout in seconds, or null to use the site's global
|
||||
/// default. A non-positive value is treated as null (use global) by the Site
|
||||
/// Runtime. Seconds (not TimeSpan) to match the global option it overrides.
|
||||
/// </summary>
|
||||
public int? ExecutionTimeoutSeconds { get; init; }
|
||||
|
||||
/// <summary>Gets the source of this script.</summary>
|
||||
public string Source { get; init; } = "Template";
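
The fallback rule described in the `ExecutionTimeoutSeconds` doc comment can be sketched as follows. `ResolveTimeoutSeconds` and its parameter names are hypothetical. The Site Runtime code that actually applies the rule is not shown in this hunk, so this is only an illustration of the documented behaviour.

```csharp
using System;

// Hypothetical helper: null or a non-positive per-script value falls back
// to the site's global default; a positive value overrides it.
static int ResolveTimeoutSeconds(int? perScriptSeconds, int globalDefaultSeconds) =>
    perScriptSeconds is > 0 ? perScriptSeconds.Value : globalDefaultSeconds;

Console.WriteLine(ResolveTimeoutSeconds(null, 30)); // 30 (no override)
Console.WriteLine(ResolveTimeoutSeconds(0, 30));    // 30 (non-positive treated as null)
Console.WriteLine(ResolveTimeoutSeconds(-5, 30));   // 30 (non-positive treated as null)
Console.WriteLine(ResolveTimeoutSeconds(5, 30));    // 5  (positive override wins)
```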
|
||||
|
||||
|
||||
@@ -0,0 +1,393 @@
|
||||
using System.Text.Json;
|
||||
|
||||
namespace ZB.MOM.WW.ScadaBridge.Commons.Types.InboundApi;
|
||||
|
||||
/// <summary>
|
||||
/// Recursive, persistence-ignorant model of an inbound-API parameter or
|
||||
/// return-value type definition. This is the deserialized form of the JSON
|
||||
/// Schema stored in <c>ApiMethod.ParameterDefinitions</c> / <c>ReturnDefinition</c>
|
||||
/// (and the equivalent TemplateScript / SharedScript columns), the canonical
|
||||
/// format produced by the Central UI schema builder and the
|
||||
/// <c>MigrateParametersToJsonSchema</c> migration.
|
||||
///
|
||||
/// <para>
|
||||
/// Unlike the flat <see cref="ParameterDefinition"/> (name → scalar type, no
|
||||
/// nesting), an <see cref="InboundApiSchema"/> carries the FULL nested type:
|
||||
/// an <c>object</c> node carries its declared field schemas (and which fields
|
||||
/// are required); an <c>array</c> node carries its element schema. This lets
|
||||
/// callers validate complex request/response structures field-by-field and
|
||||
/// element-by-element to any depth, with path-qualified errors
|
||||
/// (e.g. <c>order.items[2].quantity</c>).
|
||||
/// </para>
|
||||
///
|
||||
/// <para>
|
||||
/// The extended type vocabulary (after normalization) is the JSON Schema set:
|
||||
/// <c>boolean · integer · number · string · object · array</c>. Legacy aliases
|
||||
/// (<c>bool</c>, <c>int</c>, <c>float</c>, <c>double</c>, <c>list</c>, …) are
|
||||
/// accepted on parse for transition safety, mirroring the Central UI
|
||||
/// <c>SchemaBuilderModel</c> / <c>JsonSchemaShapeParser</c> conventions.
|
||||
/// </para>
|
||||
/// </summary>
|
||||
public sealed class InboundApiSchema
|
||||
{
|
||||
/// <summary>Normalized JSON Schema type: one of <c>boolean · integer · number · string · object · array</c>.</summary>
|
||||
public string Type { get; init; } = "string";
|
||||
|
||||
/// <summary>For <see cref="Type"/> = <c>object</c>: the declared fields, in declaration order.</summary>
|
||||
public IReadOnlyList<InboundApiSchemaField> Fields { get; init; } = [];
|
||||
|
||||
/// <summary>For <see cref="Type"/> = <c>array</c>: the schema every element must satisfy; null means element type was not declared (shape-only).</summary>
|
||||
public InboundApiSchema? Items { get; init; }
|
||||
|
||||
/// <summary>Maximum allowed schema nesting depth for both Parse and Validate recursion.</summary>
|
||||
private const int MaxDepth = 32;
|
||||
|
||||
// Allow the JSON reader to parse schemas up to ~3× our structural ceiling so
|
||||
// the application-level ParseSchema depth guard (MaxDepth = 32) fires before
|
||||
// the System.Text.Json reader ceiling. Each structural level contributes
|
||||
// roughly 3 JSON-reader nesting levels (object → properties-object → value),
|
||||
// so 128 reader levels comfortably accommodates 32+ structural levels.
|
||||
private static readonly JsonDocumentOptions DocOptions = new() { MaxDepth = 128 };
|
||||
|
||||
/// <summary>
|
||||
/// Parses a stored definition string into an <see cref="InboundApiSchema"/>.
|
||||
/// Accepts the canonical JSON Schema object form
|
||||
/// (<c>{"type":"object","properties":{…},"required":[…]}</c>) and, for
|
||||
/// transition safety, the legacy flat-array parameter form
|
||||
/// (<c>[{name,type,required,itemType?}]</c>) which it treats as an object
|
||||
/// schema whose properties are the array entries.
|
||||
/// </summary>
|
||||
/// <param name="json">The definition JSON; null/whitespace yields <c>null</c>.</param>
|
||||
/// <returns>The parsed schema, or <c>null</c> when the input is empty.</returns>
|
||||
/// <exception cref="JsonException">The input is non-empty but not valid JSON, is a JSON scalar/null at the root, or the schema nesting exceeds <see cref="MaxDepth"/>.</exception>
|
||||
public static InboundApiSchema? Parse(string? json)
|
||||
{
|
||||
if (string.IsNullOrWhiteSpace(json))
|
||||
{
|
||||
return null;
|
||||
}
|
||||
|
||||
using var doc = JsonDocument.Parse(json, DocOptions);
|
||||
return doc.RootElement.ValueKind switch
|
||||
{
|
||||
JsonValueKind.Object => ParseSchema(doc.RootElement, depth: 0),
|
||||
JsonValueKind.Array => ParseLegacyArray(doc.RootElement),
|
||||
_ => throw new JsonException("Type definition must be a JSON object (JSON Schema) or legacy parameter array."),
|
||||
};
|
||||
}
|
||||
|
||||
private static InboundApiSchema ParseSchema(JsonElement el, int depth)
|
||||
{
|
||||
if (depth > MaxDepth)
|
||||
{
|
||||
throw new JsonException($"Schema nesting exceeds the maximum allowed depth of {MaxDepth}.");
|
||||
}
|
||||
|
||||
var type = el.TryGetProperty("type", out var t) && t.ValueKind == JsonValueKind.String
|
||||
? NormalizeType(t.GetString())
|
||||
: "string";
|
||||
|
||||
if (type == "array")
|
||||
{
|
||||
InboundApiSchema? items = null;
|
||||
if (el.TryGetProperty("items", out var itemsEl) && itemsEl.ValueKind == JsonValueKind.Object)
|
||||
{
|
||||
items = ParseSchema(itemsEl, depth + 1);
|
||||
}
|
||||
|
||||
return new InboundApiSchema { Type = "array", Items = items };
|
||||
}
|
||||
|
||||
if (type == "object")
|
||||
{
|
||||
var requiredSet = new HashSet<string>(StringComparer.Ordinal);
|
||||
if (el.TryGetProperty("required", out var req) && req.ValueKind == JsonValueKind.Array)
|
||||
{
|
||||
foreach (var r in req.EnumerateArray())
|
||||
{
|
||||
if (r.ValueKind == JsonValueKind.String)
|
||||
{
|
||||
var s = r.GetString();
|
||||
if (!string.IsNullOrEmpty(s))
|
||||
{
|
||||
requiredSet.Add(s);
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
var fields = new List<InboundApiSchemaField>();
|
||||
if (el.TryGetProperty("properties", out var props) && props.ValueKind == JsonValueKind.Object)
|
||||
{
|
||||
foreach (var prop in props.EnumerateObject())
|
||||
{
|
||||
var schema = prop.Value.ValueKind == JsonValueKind.Object
|
||||
? ParseSchema(prop.Value, depth + 1)
|
||||
: new InboundApiSchema { Type = "string" };
|
||||
fields.Add(new InboundApiSchemaField(prop.Name, requiredSet.Contains(prop.Name), schema));
|
||||
}
|
||||
}
|
||||
|
||||
return new InboundApiSchema { Type = "object", Fields = fields };
|
||||
}
|
||||
|
||||
return new InboundApiSchema { Type = type };
|
||||
}
|
||||
|
||||
private static InboundApiSchema ParseLegacyArray(JsonElement arr)
|
||||
{
|
||||
var fields = new List<InboundApiSchemaField>();
|
||||
foreach (var item in arr.EnumerateArray())
|
||||
{
|
||||
if (item.ValueKind != JsonValueKind.Object)
|
||||
{
|
||||
continue;
|
||||
}
|
||||
|
||||
// The legacy flat shape historically appeared with both PascalCase
|
||||
// (CLI / anonymous-object serialization read back with
|
||||
// PropertyNameCaseInsensitive) and lowercase (DB) keys, so the
|
||||
// property lookups here are case-insensitive for compatibility.
|
||||
var name = TryGetMember(item, "name", out var n) ? n.GetString() : null;
|
||||
if (string.IsNullOrEmpty(name))
|
||||
{
|
||||
continue;
|
||||
}
|
||||
|
||||
var rawType = TryGetMember(item, "type", out var t) ? t.GetString() : "string";
|
||||
|
||||
// A field is optional only when "required" is explicitly false.
|
||||
// The SQL migration uses a string comparison (LOWER(...) <> 'false'),
|
||||
// so we must also accept the string "false" (case-insensitive) here —
|
||||
// not only the JSON boolean false — to stay consistent with legacy rows
|
||||
// that stored "required":"false" as a string.
|
||||
var required = !TryGetMember(item, "required", out var rq)
|
||||
|| (rq.ValueKind != JsonValueKind.False
|
||||
&& !string.Equals(
|
||||
rq.ValueKind == JsonValueKind.String ? rq.GetString() : null,
|
||||
"false",
|
||||
StringComparison.OrdinalIgnoreCase));
|
||||
|
||||
var normalized = NormalizeType(rawType);
|
||||
InboundApiSchema schema;
|
||||
if (normalized == "array")
|
||||
{
|
||||
var inner = TryGetMember(item, "itemType", out var it) ? it.GetString() : null;
|
||||
schema = new InboundApiSchema
|
||||
{
|
||||
Type = "array",
|
||||
Items = string.IsNullOrEmpty(inner) ? null : new InboundApiSchema { Type = NormalizeType(inner) },
|
||||
};
|
||||
}
|
||||
else
|
||||
{
|
||||
schema = new InboundApiSchema { Type = normalized };
|
||||
}
|
||||
|
||||
fields.Add(new InboundApiSchemaField(name!, required, schema));
|
||||
}
|
||||
|
||||
return new InboundApiSchema { Type = "object", Fields = fields };
|
||||
}
|
||||
|
||||
/// <summary>
|
||||
/// Case-insensitive object-member lookup, used only on the legacy flat-array
|
||||
/// path so both PascalCase and lowercase legacy keys resolve.
|
||||
/// </summary>
|
||||
private static bool TryGetMember(JsonElement obj, string name, out JsonElement value)
|
||||
{
|
||||
foreach (var prop in obj.EnumerateObject())
|
||||
{
|
||||
if (string.Equals(prop.Name, name, StringComparison.OrdinalIgnoreCase))
|
||||
{
|
||||
value = prop.Value;
|
||||
return true;
|
||||
}
|
||||
}
|
||||
|
||||
value = default;
|
||||
return false;
|
||||
}
|
||||
|
||||
/// <summary>
|
||||
/// Normalizes a raw type token to the canonical JSON Schema vocabulary,
|
||||
/// tolerating legacy aliases. Unknown tokens are returned lowercased so the
|
||||
/// validator can surface an explicit "unknown type" error.
|
||||
/// </summary>
|
||||
/// <param name="raw">The raw type token (may be null).</param>
|
||||
/// <returns>The normalized type token.</returns>
|
||||
public static string NormalizeType(string? raw) => raw?.ToLowerInvariant() switch
|
||||
{
|
||||
null or "" => "string",
|
||||
"boolean" or "bool" => "boolean",
|
||||
"integer" or "int" or "int32" or "int64" => "integer",
|
||||
"number" or "float" or "double" or "decimal" => "number",
|
||||
// datetime→string is intentional: the legacy migration's SQL
|
||||
// normalization function maps "datetime" to "string" (no separate
|
||||
// datetime wire type in the extended type system), so C# must match.
|
||||
"string" or "datetime" => "string",
|
||||
"object" => "object",
|
||||
"array" or "list" => "array",
|
||||
var other => other,
|
||||
};
|
||||
|
||||
/// <summary>
|
||||
/// Recursively validates a JSON value against this schema. A JSON <c>null</c>
|
||||
/// satisfies any type (a present-but-null field is allowed; absence of a
|
||||
/// required field is reported by the parent object). Errors are accumulated
|
||||
/// with a path prefix (e.g. <c>order.items[2].quantity</c>) so the caller can
|
||||
/// pinpoint the offending field.
|
||||
/// </summary>
|
||||
/// <param name="value">The JSON value to validate.</param>
|
||||
/// <param name="path">The path prefix for the value being validated (empty for the root).</param>
|
||||
/// <param name="errors">Accumulator the validator appends path-qualified messages to.</param>
|
||||
public void Validate(JsonElement value, string path, List<string> errors)
|
||||
=> ValidateCore(value, path, errors, depth: 0);
|
||||
|
||||
private void ValidateCore(JsonElement value, string path, List<string> errors, int depth)
|
||||
{
|
||||
ArgumentNullException.ThrowIfNull(errors);
|
||||
|
||||
if (depth > MaxDepth)
|
||||
{
|
||||
errors.Add($"{Describe(path)}: schema nesting too deep (max {MaxDepth})");
|
||||
return;
|
||||
}
|
||||
|
||||
// A null value satisfies any declared type — a present-but-null field is
|
||||
// allowed; a MISSING required field is reported by the enclosing object.
|
||||
if (value.ValueKind == JsonValueKind.Null)
|
||||
{
|
||||
return;
|
||||
}
|
||||
|
||||
switch (Type)
|
||||
{
|
||||
case "boolean":
|
||||
if (value.ValueKind is not (JsonValueKind.True or JsonValueKind.False))
|
||||
{
|
||||
errors.Add(Mismatch(path, "Boolean"));
|
||||
}
|
||||
|
||||
break;
|
||||
|
||||
case "integer":
|
||||
if (value.ValueKind != JsonValueKind.Number || !value.TryGetInt64(out _))
|
||||
{
|
||||
errors.Add(Mismatch(path, "Integer"));
|
||||
}
|
||||
|
||||
break;
|
||||
|
||||
case "number":
|
||||
if (value.ValueKind != JsonValueKind.Number)
|
||||
{
|
||||
errors.Add(Mismatch(path, "Float"));
|
||||
}
|
||||
|
||||
break;
|
||||
|
||||
case "string":
|
||||
if (value.ValueKind != JsonValueKind.String)
|
||||
{
|
||||
errors.Add(Mismatch(path, "String"));
|
||||
}
|
||||
|
||||
break;
|
||||
|
||||
case "object":
|
||||
ValidateObject(value, path, errors, depth);
|
||||
break;
|
||||
|
||||
case "array":
|
||||
ValidateArray(value, path, errors, depth);
|
||||
break;
|
||||
|
||||
default:
|
||||
errors.Add($"{Describe(path)} has unknown declared type '{Type}'");
|
||||
break;
|
||||
}
|
||||
}
|
||||
|
||||
private void ValidateObject(JsonElement value, string path, List<string> errors, int depth)
|
||||
{
|
||||
if (value.ValueKind != JsonValueKind.Object)
|
||||
{
|
||||
errors.Add(Mismatch(path, "Object"));
|
||||
return;
|
||||
}
|
||||
|
||||
// Reject undeclared fields (defensive, consistent with InboundAPI-010's
|
||||
// top-level "unexpected parameter" rejection) — a typo'd nested field is
|
||||
// surfaced instead of silently ignored. Skipped when no fields are
|
||||
// declared (a bare {"type":"object"} stays shape-only, like the legacy
|
||||
// behaviour and the array-without-items case).
|
||||
if (Fields.Count > 0)
|
||||
{
|
||||
var declared = new HashSet<string>(Fields.Select(f => f.Name), StringComparer.Ordinal);
|
||||
foreach (var prop in value.EnumerateObject())
|
||||
{
|
||||
if (!declared.Contains(prop.Name))
|
||||
{
|
||||
errors.Add($"{Describe(JoinField(path, prop.Name))} is not a declared field");
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
foreach (var field in Fields)
|
||||
{
|
||||
var fieldPath = JoinField(path, field.Name);
|
||||
if (value.TryGetProperty(field.Name, out var fieldValue))
|
||||
{
|
||||
field.Schema.ValidateCore(fieldValue, fieldPath, errors, depth + 1);
|
||||
}
|
||||
else if (field.Required)
|
||||
{
|
||||
errors.Add($"missing required field {Describe(fieldPath)}");
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
private void ValidateArray(JsonElement value, string path, List<string> errors, int depth)
|
||||
{
|
||||
if (value.ValueKind != JsonValueKind.Array)
|
||||
{
|
||||
errors.Add(Mismatch(path, "List"));
|
||||
return;
|
||||
}
|
||||
|
||||
// No declared element type → shape-only (any elements accepted).
|
||||
if (Items is null)
|
||||
{
|
||||
return;
|
||||
}
|
||||
|
||||
var index = 0;
|
||||
foreach (var element in value.EnumerateArray())
|
||||
{
|
||||
Items.ValidateCore(element, $"{path}[{index}]", errors, depth + 1);
|
||||
index++;
|
||||
}
|
||||
}
|
||||
|
||||
private static string Mismatch(string path, string expectedDisplayType) =>
|
||||
$"{Describe(path)} must be {Article(expectedDisplayType)} {expectedDisplayType}";
|
||||
|
||||
private static string Describe(string path) =>
|
||||
string.IsNullOrEmpty(path) ? "value" : $"'{path}'";
|
||||
|
||||
private static string JoinField(string path, string field) =>
|
||||
string.IsNullOrEmpty(path) ? field : $"{path}.{field}";
|
||||
|
||||
private static string Article(string word) =>
|
||||
word.Length > 0 && "AEIOU".IndexOf(char.ToUpperInvariant(word[0])) >= 0 ? "an" : "a";
|
||||
}
|
||||
|
||||
/// <summary>
|
||||
/// One declared field of an <see cref="InboundApiSchema"/> object node: the
|
||||
/// field name, whether it is required, and its (recursive) type schema.
|
||||
/// </summary>
|
||||
/// <param name="Name">The field name as it appears in the JSON.</param>
|
||||
/// <param name="Required">Whether the field must be present.</param>
|
||||
/// <param name="Schema">The recursive type schema the field's value must satisfy.</param>
|
||||
public sealed record InboundApiSchemaField(string Name, bool Required, InboundApiSchema Schema);
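
A minimal usage sketch of the parse-then-validate flow defined above. It assumes the `InboundApiSchema` class from this file is in scope (same assembly or a project reference) and a compiler with C# 11 raw string literals. The order schema and request body are invented for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;
using ZB.MOM.WW.ScadaBridge.Commons.Types.InboundApi;

// Hypothetical schema in the canonical JSON Schema object form.
var schema = InboundApiSchema.Parse("""
    {
      "type": "object",
      "required": ["order"],
      "properties": {
        "order": {
          "type": "object",
          "required": ["items"],
          "properties": {
            "items": {
              "type": "array",
              "items": {
                "type": "object",
                "properties": { "quantity": { "type": "integer" } }
              }
            }
          }
        }
      }
    }
    """)!;

// Hypothetical request body: the second element has a string where an integer is declared.
using var body = JsonDocument.Parse("""{"order":{"items":[{"quantity":1},{"quantity":"two"}]}}""");

var errors = new List<string>();
schema.Validate(body.RootElement, path: "", errors);

foreach (var error in errors)
{
    Console.WriteLine(error);
}
```

With the `Validate` logic shown in the diff, this should report a single path-qualified error, `'order.items[1].quantity' must be an Integer`, since every other field matches its declared type.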
|
||||
@@ -10,10 +10,24 @@ namespace ZB.MOM.WW.ScadaBridge.Communication.Actors;
|
||||
/// Long-lived (one per active debug session) actor on the central side. Debug sessions
|
||||
/// are session-based and temporary — this actor holds no persisted state and does not
|
||||
/// derive from an Akka.Persistence base class; its state does not survive a restart.
|
||||
- /// Sends SubscribeDebugViewRequest to the site via CentralCommunicationActor (with THIS actor
- /// as the Sender) to get the initial snapshot. After receiving the snapshot, opens a gRPC
- /// server-streaming subscription via SiteStreamGrpcClient for ongoing events.
- /// Stream events are marshalled back to the actor via Self.Tell for thread safety.
|
||||
/// <para>
|
||||
/// <b>Stream-first lifecycle (M2.18, #26).</b> To avoid losing any
|
||||
/// <see cref="AttributeValueChanged"/>/<see cref="AlarmStateChanged"/> that occurs on
|
||||
/// the site during the snapshot-build + network-transit window, the gRPC server-streaming
|
||||
/// subscription is opened FIRST (in <see cref="PreStart"/>), alongside the
|
||||
/// <c>SubscribeDebugViewRequest</c> sent to the site via CentralCommunicationActor (with
|
||||
/// THIS actor as the Sender). Live events that arrive before the
|
||||
/// <see cref="DebugViewSnapshot"/> is delivered are <em>buffered in arrival order</em>.
|
||||
/// When the snapshot arrives it is delivered to the consumer, then the buffer is flushed
|
||||
/// in order, <em>deduped</em> against the snapshot (an event whose per-entity timestamp is
|
||||
/// &lt;= the snapshot's timestamp for the same entity is already reflected → dropped; a
|
||||
/// strictly-newer event is delivered; an event for an entity absent from the snapshot is
|
||||
/// delivered). After the flush the actor switches to pass-through: subsequent events go
|
||||
/// straight to the consumer. A mid-session reconnect (after the snapshot) resumes
|
||||
/// pass-through — the snapshot is a one-time thing.
|
||||
/// </para>
|
||||
/// Stream events are marshalled back to the actor via Self.Tell for thread safety; all
|
||||
/// state (phase flag + buffer) is mutated only on the actor thread.
|
||||
/// </summary>
|
||||
public class DebugStreamBridgeActor : ReceiveActor, IWithTimers
|
||||
{
|
||||
@@ -49,6 +63,31 @@ public class DebugStreamBridgeActor : ReceiveActor, IWithTimers
|
||||
private bool _stopped;
|
||||
private CancellationTokenSource? _grpcCts;
|
||||
|
||||
/// <summary>
|
||||
/// Phase flag (M2.18). <see langword="false"/> until the initial
|
||||
/// <see cref="DebugViewSnapshot"/> has been delivered and the pre-snapshot buffer
|
||||
/// flushed; <see langword="true"/> thereafter (pass-through). Mutated only on the
|
||||
/// actor thread. A reconnect does NOT touch this flag — a mid-session reconnect
|
||||
/// (after the snapshot) therefore stays in pass-through, and a reconnect during the
|
||||
/// buffering phase (before the snapshot) stays buffering.
|
||||
/// </summary>
|
||||
private bool _snapshotDelivered;
|
||||
|
||||
/// <summary>
|
||||
/// Ordered buffer of live gRPC events (<see cref="AttributeValueChanged"/>/
|
||||
/// <see cref="AlarmStateChanged"/>) that arrived before the snapshot was delivered.
|
||||
/// Flushed (with per-entity dedup against the snapshot) when the snapshot arrives,
|
||||
/// then never used again. Mutated only on the actor thread.
|
||||
/// </summary>
|
||||
private readonly List<object> _preSnapshotBuffer = new();
|
||||
|
||||
/// <summary>
|
||||
/// Defensive log threshold: if the pre-snapshot buffer grows past this many events
|
||||
/// during a slow snapshot we log once (events are NOT dropped — the window is short).
|
||||
/// </summary>
|
||||
private const int BufferWarnThreshold = 10_000;
|
||||
private bool _bufferWarned;
|
||||
|
||||
/// <summary>Timer scheduler for reconnect and stability window timers.</summary>
|
||||
public ITimerScheduler Timers { get; set; } = null!;
|
||||
|
||||
@@ -85,13 +124,55 @@ public class DebugStreamBridgeActor : ReceiveActor, IWithTimers
|
||||
_grpcNodeAAddress = grpcNodeAAddress;
|
||||
_grpcNodeBAddress = grpcNodeBAddress;
|
||||
|
||||
- // Initial snapshot response from the site (via ClusterClient)
|
||||
// Initial snapshot response from the site (via ClusterClient).
|
||||
// M2.11: if the site reports InstanceNotFound=true the instance is not
|
||||
// deployed there. M2.18: under the stream-first lifecycle the gRPC stream
|
||||
// was already opened in PreStart, so the not-found path must tear it down
|
||||
// (CleanupGrpc) rather than enter pass-through. Forward the snapshot (with
|
||||
// InstanceNotFound=true) to _onEvent so DebugStreamService's TCS resolves and
|
||||
// the caller can inspect the flag; then stop cleanly.
|
||||
Receive<DebugViewSnapshot>(snapshot =>
|
||||
{
|
||||
- _log.Info("Received initial snapshot for {0} ({1} attrs, {2} alarms)",
- _instanceUniqueName, snapshot.AttributeValues.Count, snapshot.AlarmStates.Count);
|
||||
if (_snapshotDelivered)
|
||||
{
|
||||
// Defensive: a duplicate / late snapshot after we have already moved to
|
||||
// pass-through. The snapshot is a one-time thing — ignore replays so we
|
||||
// never re-buffer or double-deliver.
|
||||
_log.Debug("Ignoring duplicate DebugViewSnapshot for {0} (already delivered)",
|
||||
_instanceUniqueName);
|
||||
return;
|
||||
}
|
||||
|
||||
if (snapshot.InstanceNotFound)
|
||||
{
|
||||
_log.Warning("Instance {0} is not deployed on site; terminating debug stream",
|
||||
_instanceUniqueName);
|
||||
// M2.18: the stream-first subscription opened in PreStart is for a
|
||||
// non-deployed instance — cancel it (and any buffered gap events are
|
||||
// discarded with the actor). No pass-through.
|
||||
// _stopped is set AFTER CleanupGrpc() to match the ordering in the
|
||||
// DebugStreamTerminated and ReceiveTimeout handlers (cosmetic consistency).
|
||||
CleanupGrpc();
|
||||
_stopped = true;
|
||||
_preSnapshotBuffer.Clear();
|
||||
_onEvent(snapshot); // resolves the snapshot TCS with InstanceNotFound=true
|
||||
// Note: after Context.Stop(Self) below the actor is dead. DebugStreamService
|
||||
// inspects InitialSnapshot.InstanceNotFound and calls StopStream, which sends
|
||||
// a StopDebugStream message. That Tell arrives after the actor has already
|
||||
// stopped, producing a benign Akka dead-letter — expected and harmless.
|
||||
Context.Stop(Self);
|
||||
return;
|
||||
}
|
||||
|
||||
_log.Info("Received initial snapshot for {0} ({1} attrs, {2} alarms); flushing {3} buffered event(s)",
|
||||
_instanceUniqueName, snapshot.AttributeValues.Count, snapshot.AlarmStates.Count,
|
||||
_preSnapshotBuffer.Count);
|
||||
|
||||
// Deliver the snapshot, then flush the gap-window buffer (deduped), then
|
||||
// switch to pass-through. Order matters: snapshot first, buffered events next.
|
||||
_onEvent(snapshot);
|
||||
OpenGrpcStream();
|
||||
FlushBuffer(snapshot);
|
||||
_snapshotDelivered = true;
|
||||
});
|
||||
|
||||
// Domain events arriving via Self.Tell from gRPC callback.
|
||||
@@ -99,8 +180,11 @@ public class DebugStreamBridgeActor : ReceiveActor, IWithTimers
        // flapping stream that delivers a single event between failures would
        // otherwise never trip MaxRetries. The retry budget is recovered only by
        // GrpcStreamStable (a stream that has stayed up for StabilityWindow).
        Receive<AttributeValueChanged>(changed => _onEvent(changed));
        Receive<AlarmStateChanged>(changed => _onEvent(changed));
        // M2.18: before the snapshot has been delivered, BUFFER (in arrival order)
        // rather than deliver — these may be gap-window events. After the snapshot has
        // been flushed, pass through directly (same handler, phase-dependent behavior).
        Receive<AttributeValueChanged>(changed => HandleStreamEvent(changed));
        Receive<AlarmStateChanged>(changed => HandleStreamEvent(changed));

        // Stream has been stably connected for StabilityWindow — recover the
        // retry budget so a future transient fault gets a fresh set of retries.
@@ -155,11 +239,161 @@ public class DebugStreamBridgeActor : ReceiveActor, IWithTimers
        });
    }

    /// <summary>
    /// Handles a live gRPC stream event (<see cref="AttributeValueChanged"/> or
    /// <see cref="AlarmStateChanged"/>). Before the snapshot has been delivered the
    /// event is appended to the ordered pre-snapshot buffer (gap-window capture); after
    /// the snapshot+flush it is passed straight through to the consumer. Always runs on
    /// the actor thread (events are marshalled in via Self.Tell), so the phase flag and
    /// buffer are accessed without locking.
    /// </summary>
    private void HandleStreamEvent(object evt)
    {
        if (_snapshotDelivered)
        {
            _onEvent(evt);
            return;
        }

        _preSnapshotBuffer.Add(evt);
        if (!_bufferWarned && _preSnapshotBuffer.Count > BufferWarnThreshold)
        {
            _bufferWarned = true;
            _log.Warning(
                "Pre-snapshot debug-event buffer for {0} exceeded {1} events while awaiting the snapshot; " +
                "events are still retained (not dropped).",
                _instanceUniqueName, BufferWarnThreshold);
        }
    }

    /// <summary>
    /// Flushes the pre-snapshot buffer in arrival order, deduping each event against the
    /// just-delivered snapshot (M2.18).
    /// <para>
    /// <b>Dedup rule.</b> Identity is per-entity:
    /// attributes by (InstanceUniqueName, AttributePath, AttributeName); alarms by
    /// (InstanceUniqueName, AlarmName, SourceReference). For a buffered event whose entity
    /// is present in the snapshot, the comparison is against that entity's snapshot
    /// timestamp: a buffered timestamp <= the snapshot timestamp means the event is
    /// already reflected in the snapshot → DROP; a strictly-newer (>) timestamp means
    /// the event happened after the snapshot was built → DELIVER. The boundary is inclusive
    /// on the snapshot side (equal timestamps are treated as duplicates) — the snapshot is
    /// the authoritative point-in-time value, so an event at the exact same instant carries
    /// no new information. A buffered event whose entity is NOT in the snapshot is a genuine
    /// gap-window event → DELIVER.
    /// </para>
    /// </summary>
    private void FlushBuffer(DebugViewSnapshot snapshot)
    {
        if (_preSnapshotBuffer.Count == 0) return;

        // Build per-entity "as-of" timestamps from the snapshot. If (defensively) the
        // snapshot lists the same entity twice, keep the newest timestamp.
        var attrAsOf = new Dictionary<string, DateTimeOffset>();
        foreach (var a in snapshot.AttributeValues)
        {
            var key = AttributeKey(a);
            if (!attrAsOf.TryGetValue(key, out var existing) || a.Timestamp > existing)
                attrAsOf[key] = a.Timestamp;
        }

        var alarmAsOf = new Dictionary<string, DateTimeOffset>();
        foreach (var al in snapshot.AlarmStates)
        {
            var key = AlarmKey(al);
            if (!alarmAsOf.TryGetValue(key, out var existing) || al.Timestamp > existing)
                alarmAsOf[key] = al.Timestamp;
        }

        var flushed = 0;
        var dropped = 0;
        foreach (var evt in _preSnapshotBuffer)
        {
            if (IsReflectedInSnapshot(evt, attrAsOf, alarmAsOf))
            {
                dropped++;
                continue;
            }

            _onEvent(evt);
            flushed++;
        }

        if (dropped > 0 || flushed > 0)
        {
            _log.Debug("Flushed {0} buffered debug event(s) for {1}, dropped {2} as already-in-snapshot",
                flushed, _instanceUniqueName, dropped);
        }

        _preSnapshotBuffer.Clear();
    }

    /// <summary>
    /// Returns <see langword="true"/> when a buffered event is already reflected in the
    /// snapshot (same entity, buffered timestamp <= snapshot timestamp) and must be
    /// dropped; otherwise <see langword="false"/> (deliver).
    /// </summary>
    private static bool IsReflectedInSnapshot(
        object evt,
        IReadOnlyDictionary<string, DateTimeOffset> attrAsOf,
        IReadOnlyDictionary<string, DateTimeOffset> alarmAsOf)
    {
        switch (evt)
        {
            case AttributeValueChanged a:
                return attrAsOf.TryGetValue(AttributeKey(a), out var attrTs) && a.Timestamp <= attrTs;
            case AlarmStateChanged al:
                return alarmAsOf.TryGetValue(AlarmKey(al), out var alarmTs) && al.Timestamp <= alarmTs;
            default:
                // Unknown buffered type (should not happen — only attr/alarm are buffered):
                // never treat as a duplicate.
                return false;
        }
    }

    /// <summary>
    /// Delimiter used to join identity components into a single dedup key. A NUL
    /// control character cannot appear in an instance/attribute/alarm name, so
    /// distinct identities never collide on a shared boundary (unlike a space, which
    /// may legitimately occur within a name). Declared as an escaped char so the
    /// source carries no raw NUL byte.
    /// </summary>
    private const char KeyDelimiter = '\u0000';

    /// <summary>
    /// Per-entity dedup key for an attribute change. Each nullable component is guarded
    /// with <c>?? string.Empty</c> so a null can never silently collide with another
    /// key via <see cref="string.Concat"/> (e.g. two entries with null AttributePath
    /// would otherwise share a key with any entry whose AttributePath is the empty string).
    /// </summary>
    private static string AttributeKey(AttributeValueChanged a) =>
        string.Concat(
            a.InstanceUniqueName ?? string.Empty, KeyDelimiter,
            a.AttributePath ?? string.Empty, KeyDelimiter,
            a.AttributeName ?? string.Empty);

    /// <summary>
    /// Per-entity dedup key for an alarm change. Includes <see cref="AlarmStateChanged.SourceReference"/>
    /// so native per-condition alarms (which share an AlarmName but differ by source
    /// reference) are not conflated; empty for computed alarms. Each nullable component is
    /// guarded with <c>?? string.Empty</c> to prevent silent key collisions.
    /// </summary>
    private static string AlarmKey(AlarmStateChanged al) =>
        string.Concat(
            al.InstanceUniqueName ?? string.Empty, KeyDelimiter,
            al.AlarmName ?? string.Empty, KeyDelimiter,
            al.SourceReference ?? string.Empty);

    /// <inheritdoc />
    protected override void PreStart()
    {
        _log.Info("Starting debug stream bridge for {0} on site {1}", _instanceUniqueName, _siteIdentifier);

        // M2.18 stream-first: open the gRPC live-event subscription BEFORE (and
        // alongside) requesting the snapshot, so events occurring during the
        // snapshot-build + network-transit window are captured (buffered) and not lost.
        OpenGrpcStream();

        // Send subscribe request via CentralCommunicationActor for the initial snapshot.
        var request = new SubscribeDebugViewRequest(_instanceUniqueName, _correlationId);
        var envelope = new SiteEnvelope(_siteIdentifier, request);

+5
@@ -178,6 +178,11 @@ public class TemplateScriptConfiguration : IEntityTypeConfiguration<TemplateScri
        builder.Property(s => s.ReturnDefinition)
            .HasMaxLength(4000);

        // M2.5 (#9): nullable per-script execution timeout (seconds). Null = use
        // the site's global ScriptExecutionTimeoutSeconds default.
        builder.Property(s => s.ExecutionTimeoutSeconds)
            .IsRequired(false);

        builder.HasIndex(s => new { s.TemplateId, s.Name }).IsUnique();
    }
}

+1730
File diff suppressed because it is too large
+28
@@ -0,0 +1,28 @@
using Microsoft.EntityFrameworkCore.Migrations;

#nullable disable

namespace ZB.MOM.WW.ScadaBridge.ConfigurationDatabase.Migrations
{
    /// <inheritdoc />
    public partial class ResyncLdapGroupMappingSeed : Migration
    {
        /// <inheritdoc />
        protected override void Up(MigrationBuilder migrationBuilder)
        {
            migrationBuilder.InsertData(
                table: "LdapGroupMappings",
                columns: new[] { "Id", "LdapGroupName", "Role" },
                values: new object[] { 5, "SCADA-Viewers", "Viewer" });
        }

        /// <inheritdoc />
        protected override void Down(MigrationBuilder migrationBuilder)
        {
            migrationBuilder.DeleteData(
                table: "LdapGroupMappings",
                keyColumn: "Id",
                keyValue: 5);
        }
    }
}
+1733
File diff suppressed because it is too large
+28
@@ -0,0 +1,28 @@
using Microsoft.EntityFrameworkCore.Migrations;

#nullable disable

namespace ZB.MOM.WW.ScadaBridge.ConfigurationDatabase.Migrations
{
    /// <inheritdoc />
    public partial class AddTemplateScriptExecutionTimeout : Migration
    {
        /// <inheritdoc />
        protected override void Up(MigrationBuilder migrationBuilder)
        {
            migrationBuilder.AddColumn<int>(
                name: "ExecutionTimeoutSeconds",
                table: "TemplateScripts",
                type: "int",
                nullable: true);
        }

        /// <inheritdoc />
        protected override void Down(MigrationBuilder migrationBuilder)
        {
            migrationBuilder.DropColumn(
                name: "ExecutionTimeoutSeconds",
                table: "TemplateScripts");
        }
    }
}
+9
@@ -925,6 +925,12 @@ namespace ZB.MOM.WW.ScadaBridge.ConfigurationDatabase.Migrations
                            Id = 4,
                            LdapGroupName = "SCADA-Deploy-SiteA",
                            Role = "Deployer"
                        },
                        new
                        {
                            Id = 5,
                            LdapGroupName = "SCADA-Viewers",
                            Role = "Viewer"
                        });
                });

@@ -1307,6 +1313,9 @@ namespace ZB.MOM.WW.ScadaBridge.ConfigurationDatabase.Migrations
                        .IsRequired()
                        .HasColumnType("nvarchar(max)");

                    b.Property<int?>("ExecutionTimeoutSeconds")
                        .HasColumnType("int");

                    b.Property<bool>("IsInherited")
                        .HasColumnType("bit");

@@ -99,8 +99,14 @@ public class DataConnectionActor : UntypedActor, IWithStash, IWithTimers
    // routed to subscribers (NativeAlarmActors) by source-object reference.
    /// <summary>sourceReference → set of subscriber actor refs (NativeAlarmActors), for routing + ref-count.</summary>
    private readonly Dictionary<string, HashSet<IActorRef>> _alarmSourceSubscribers = new();
    /// <summary>sourceReference → optional condition filter (first subscriber wins).</summary>
    /// <summary>sourceReference → raw condition filter string passed to the adapter (first subscriber wins).</summary>
    private readonly Dictionary<string, string?> _alarmSourceFilter = new();
    /// <summary>
    /// sourceReference → parsed condition-type predicate (M2.4 / #8). The authoritative
    /// client-side gate in <see cref="HandleAlarmTransitionReceived"/>; applies uniformly
    /// across OPC UA and the gateway-wide MxGateway feed.
    /// </summary>
    private readonly Dictionary<string, AlarmConditionFilter> _alarmSourceFilterPredicate = new();
    /// <summary>sourceReference → adapter alarm subscription id.</summary>
    private readonly Dictionary<string, string> _alarmSubscriptionIds = new();
    /// <summary>sourceReferences whose adapter SubscribeAlarmsAsync is currently in flight.</summary>
@@ -1480,6 +1486,9 @@ public class DataConnectionActor : UntypedActor, IWithStash, IWithTimers
        }
        subs.Add(subscriber);
        _alarmSourceFilter[request.SourceReference] = request.ConditionFilter;
        // Parse the type-name filter once; this is the authoritative client-side
        // gate consulted on every routed transition (M2.4 / #8).
        _alarmSourceFilterPredicate[request.SourceReference] = AlarmConditionFilter.Parse(request.ConditionFilter);

        // If the adapter feed for this source is already (being) established, the
        // existing subscription serves the new subscriber too.
@@ -1546,6 +1555,14 @@ public class DataConnectionActor : UntypedActor, IWithStash, IWithTimers
            if (!match)
                continue;

            // M2.4 (#8): authoritative client-side condition-type gate. Applied
            // per matched source because two sources may share a prefix yet carry
            // different filters. Empty filter = allow all (historical behaviour);
            // framing sentinels (SnapshotComplete) are never dropped.
            if (_alarmSourceFilterPredicate.TryGetValue(sourceRef, out var predicate) &&
                !predicate.IsAllowed(transition))
                continue;

            foreach (var sub in subs)
            {
                if (notified.Add(sub))
@@ -1566,6 +1583,7 @@ public class DataConnectionActor : UntypedActor, IWithStash, IWithTimers
            // No subscribers remain for this source — tear down the adapter feed.
            _alarmSourceSubscribers.Remove(request.SourceReference);
            _alarmSourceFilter.Remove(request.SourceReference);
            _alarmSourceFilterPredicate.Remove(request.SourceReference);
            if (_alarmSubscriptionIds.Remove(request.SourceReference, out var subId) &&
                _adapter is IAlarmSubscribableConnection alarmable)
            {

@@ -1,3 +1,5 @@
using System.Globalization;
using ZB.MOM.WW.MxGateway.Client;
using ZB.MOM.WW.MxGateway.Contracts.Proto;
using ProtoConditionState = ZB.MOM.WW.MxGateway.Contracts.Proto.AlarmConditionState;
using ProtoTransitionKind = ZB.MOM.WW.MxGateway.Contracts.Proto.AlarmTransitionKind;
@@ -67,6 +69,19 @@ public static class MxGatewayAlarmMapper
            Shelve: AlarmShelveState.Unshelved, Suppressed: false, Severity: NormalizeSeverity(severity));
    }

    /// <summary>
    /// Converts an <see cref="MxValue"/> union to a display-only string using
    /// <see cref="MxValueExtensions.ToClrValue"/> and invariant culture formatting,
    /// so numeric values always use '.' as the decimal separator. Null or unset
    /// values produce an empty string.
    /// </summary>
    internal static string MxValueToString(MxValue? mxVal)
    {
        if (mxVal is null) return "";
        var clr = mxVal.ToClrValue();
        return clr is null ? "" : Convert.ToString(clr, CultureInfo.InvariantCulture) ?? "";
    }

    /// <summary>Maps a live <see cref="OnAlarmTransitionEvent"/> to a transition.</summary>
    /// <param name="body">The gateway alarm transition event proto message to map.</param>
    /// <returns>The protocol-neutral <see cref="NativeAlarmTransition"/>.</returns>
@@ -83,8 +98,8 @@ public static class MxGatewayAlarmMapper
            OperatorComment: body.OperatorComment,
            OriginalRaiseTime: body.OriginalRaiseTimestamp?.ToDateTimeOffset(),
            TransitionTime: body.TransitionTimestamp?.ToDateTimeOffset() ?? DateTimeOffset.UtcNow,
            CurrentValue: "",
            LimitValue: "");
            CurrentValue: MxValueToString(body.CurrentValue),
            LimitValue: MxValueToString(body.LimitValue));

    /// <summary>The end-of-snapshot sentinel transition (no condition payload).</summary>
    /// <returns>A <see cref="NativeAlarmTransition"/> with <c>AlarmTransitionKind.SnapshotComplete</c>.</returns>
@@ -109,6 +124,6 @@ public static class MxGatewayAlarmMapper
            OperatorComment: snapshot.OperatorComment,
            OriginalRaiseTime: snapshot.OriginalRaiseTimestamp?.ToDateTimeOffset(),
            TransitionTime: snapshot.LastTransitionTimestamp?.ToDateTimeOffset() ?? DateTimeOffset.UtcNow,
            CurrentValue: "",
            LimitValue: "");
            CurrentValue: MxValueToString(snapshot.CurrentValue),
            LimitValue: MxValueToString(snapshot.LimitValue));
}

@@ -163,7 +163,11 @@ public class MxGatewayDataConnection : IDataConnection, IBrowsableDataConnection
            _alarmCts = new CancellationTokenSource();
            var token = _alarmCts.Token;
            var client = _client!;
            // Gateway-wide feed (null prefix); the actor filters per source reference.
            // Gateway-wide feed (null prefix). The MxGateway has no server-side
            // condition filter, so conditionFilter is intentionally NOT forwarded
            // here: the DataConnectionActor applies it as the authoritative
            // client-side gate per source reference AND per condition type
            // (M2.4 / #8 — AlarmConditionFilter), uniform with the OPC UA path.
            _ = Task.Run(() => client.RunAlarmStreamAsync(null, t => callback(t), token), token);
        }
    }

@@ -65,4 +65,40 @@ public static class OpcUaAlarmMapper
        null or "Unshelved" => AlarmShelveState.Unshelved,
        _ => AlarmShelveState.OneShotShelved
    };

    /// <summary>
    /// Picks a representative display-only limit value from the four standard
    /// <c>LimitAlarmType</c> set-point fields (HighHighLimit, HighLimit, LowLimit,
    /// LowLowLimit) returned by the OPC UA event SelectClause.
    ///
    /// <para>
    /// The fields are absent (null raw value) on non-limit alarm types (discrete,
    /// off-normal, etc.). When present, the first non-null value is returned in
    /// priority order: HighHigh → High → Low → LowLow. The caller may use
    /// <c>AlarmTypeName</c> or <c>ConditionName</c> to determine which specific
    /// limit is active; this method intentionally returns the coarsest useful value
    /// for the common single-limit case without requiring callers to understand the
    /// OPC UA limit hierarchy.
    /// </para>
    /// </summary>
    /// <param name="highHighRaw">Raw HighHighLimit field value (null when absent).</param>
    /// <param name="highRaw">Raw HighLimit field value (null when absent).</param>
    /// <param name="lowRaw">Raw LowLimit field value (null when absent).</param>
    /// <param name="lowLowRaw">Raw LowLowLimit field value (null when absent).</param>
    /// <returns>
    /// A formatted string representation of the first non-null limit value, or an
    /// empty string when all four fields are absent (non-limit alarm type).
    /// </returns>
    public static string PickLimitValue(object? highHighRaw, object? highRaw, object? lowRaw, object? lowLowRaw)
    {
        // Standard OPC UA LimitAlarmType limit values are numeric (Double/Float/Int).
        // Convert with InvariantCulture so the decimal separator is always '.' regardless
        // of the server's locale.
        foreach (var raw in new[] { highHighRaw, highRaw, lowRaw, lowLowRaw })
        {
            if (raw is not null)
                return Convert.ToString(raw, System.Globalization.CultureInfo.InvariantCulture) ?? "";
        }
        return "";
    }
}

@@ -258,7 +258,9 @@ public class RealOpcUaClient : IOpcUaClient
            MonitoringMode = MonitoringMode.Reporting,
            SamplingInterval = 0,
            QueueSize = 1000,
            Filter = BuildAlarmEventFilter()
            // Server-side WhereClause is a bandwidth optimisation only — the
            // authoritative condition-type gate lives in DataConnectionActor (M2.4 / #8).
            Filter = BuildAlarmEventFilter(AlarmConditionFilter.Parse(conditionFilter))
        };

        item.Notification += (_, e) =>
@@ -289,10 +291,94 @@ public class RealOpcUaClient : IOpcUaClient
    }

    /// <summary>
    /// Builds the event filter selecting the base event fields plus the
    /// AlarmConditionType / AcknowledgeableConditionType state sub-variables we mirror.
    /// Maps the standard OPC UA Alarms & Conditions type names (case-insensitive)
    /// to their well-known <see cref="ObjectTypeIds"/> NodeIds, for building the
    /// optional server-side WhereClause (M2.4 / #8). Only standard types appear
    /// here; vendor/custom type names cannot be mapped without browsing the server
    /// type tree, so they are handled by the client-side gate alone.
    /// <para>
    /// Single source of truth for both directions: <see cref="ConditionTypeNamesById"/>
    /// is derived from this map, so the friendly-name and NodeId sides cannot drift.
    /// </para>
    /// </summary>
    private static EventFilter BuildAlarmEventFilter()
    internal static readonly IReadOnlyDictionary<string, NodeId> KnownConditionTypeIds =
        new Dictionary<string, NodeId>(StringComparer.OrdinalIgnoreCase)
        {
            ["ConditionType"] = ObjectTypeIds.ConditionType,
            ["AcknowledgeableConditionType"] = ObjectTypeIds.AcknowledgeableConditionType,
            ["AlarmConditionType"] = ObjectTypeIds.AlarmConditionType,
            ["LimitAlarmType"] = ObjectTypeIds.LimitAlarmType,
            ["ExclusiveLimitAlarmType"] = ObjectTypeIds.ExclusiveLimitAlarmType,
            ["NonExclusiveLimitAlarmType"] = ObjectTypeIds.NonExclusiveLimitAlarmType,
            ["ExclusiveLevelAlarmType"] = ObjectTypeIds.ExclusiveLevelAlarmType,
            ["NonExclusiveLevelAlarmType"] = ObjectTypeIds.NonExclusiveLevelAlarmType,
            ["ExclusiveDeviationAlarmType"] = ObjectTypeIds.ExclusiveDeviationAlarmType,
            ["NonExclusiveDeviationAlarmType"] = ObjectTypeIds.NonExclusiveDeviationAlarmType,
            ["ExclusiveRateOfChangeAlarmType"] = ObjectTypeIds.ExclusiveRateOfChangeAlarmType,
            ["NonExclusiveRateOfChangeAlarmType"] = ObjectTypeIds.NonExclusiveRateOfChangeAlarmType,
            ["DiscreteAlarmType"] = ObjectTypeIds.DiscreteAlarmType,
            ["OffNormalAlarmType"] = ObjectTypeIds.OffNormalAlarmType,
            ["SystemOffNormalAlarmType"] = ObjectTypeIds.SystemOffNormalAlarmType,
            ["TripAlarmType"] = ObjectTypeIds.TripAlarmType,
            ["DiscrepancyAlarmType"] = ObjectTypeIds.DiscrepancyAlarmType,
            ["InstrumentDiagnosticAlarmType"] = ObjectTypeIds.InstrumentDiagnosticAlarmType,
            ["SystemDiagnosticAlarmType"] = ObjectTypeIds.SystemDiagnosticAlarmType,
            ["CertificateExpirationAlarmType"] = ObjectTypeIds.CertificateExpirationAlarmType,
        };

    /// <summary>
    /// Inverse of <see cref="KnownConditionTypeIds"/> (NodeId → friendly name), derived
    /// from it so the two cannot drift (M2.4 / #8). Used by <see cref="ResolveAlarmTypeName"/>
    /// to translate the event-type NodeId an OPC UA server sends back into the friendly
    /// type name the conditionFilter gate and server-side WhereClause both key off.
    /// </summary>
    private static readonly IReadOnlyDictionary<NodeId, string> ConditionTypeNamesById =
        KnownConditionTypeIds.ToDictionary(kv => kv.Value, kv => kv.Key);

    /// <summary>
    /// Resolves an event-type <see cref="NodeId"/> to the friendly condition-type name the
    /// <c>conditionFilter</c> gate (and the server-side WhereClause) use (M2.4 / #8).
    ///
    /// <para>
    /// Standard A&C types are returned as their friendly name (e.g. <c>i=9341</c> →
    /// <c>"ExclusiveLevelAlarmType"</c>) so the client-side gate — which compares against
    /// the friendly names in <see cref="KnownConditionTypeIds"/> — actually matches the
    /// events the server delivers. Vendor/custom subtypes that are not in the map fall back
    /// to the NodeId string; that is consistent because the WhereClause is likewise omitted
    /// for unmapped names, so such a filter can only be expressed (and matched) as the NodeId
    /// string. A <c>null</c> event type yields the empty string.
    /// </para>
    /// </summary>
    /// <param name="eventType">The event-type NodeId from the A&C notification, or <c>null</c>.</param>
    /// <returns>The friendly type name when known; otherwise the NodeId string (or "" when null).</returns>
    internal static string ResolveAlarmTypeName(NodeId? eventType)
    {
        if (eventType is null)
            return "";
        return ConditionTypeNamesById.TryGetValue(eventType, out var friendly)
            ? friendly
            : eventType.ToString();
    }

    /// <summary>
    /// Builds the event filter selecting the base event fields plus the
    /// AlarmConditionType / AcknowledgeableConditionType state sub-variables we mirror,
    /// and — when <paramref name="conditionFilter"/> is non-empty and every requested
    /// type maps to a standard A&C type — a server-side <see cref="ContentFilter"/>
    /// WhereClause (OfType, OR'd) as a bandwidth optimisation (M2.4 / #8).
    ///
    /// <para>
    /// Conservative by design: if <em>any</em> requested type name cannot be mapped to
    /// a standard <see cref="ObjectTypeIds"/> NodeId, the WhereClause is omitted entirely
    /// rather than partially applied — a partial server-side filter would silently drop
    /// the unmapped types' events, and the server cannot send what it filtered out. The
    /// client-side gate in DataConnectionActor enforces the full filter regardless, so
    /// omitting the WhereClause only forgoes the bandwidth saving, never correctness.
    /// </para>
    /// </summary>
    /// <param name="conditionFilter">The parsed condition-type filter (allow-all when empty).</param>
    /// <returns>The configured <see cref="EventFilter"/>.</returns>
    internal static EventFilter BuildAlarmEventFilter(AlarmConditionFilter conditionFilter)
    {
        var filter = new EventFilter();
        foreach (var name in AlarmStateFields)
@@ -306,9 +392,81 @@ public class RealOpcUaClient : IOpcUaClient
        filter.SelectClauses.Add(SelectField(ObjectTypeIds.AlarmConditionType, "ShelvingState", "CurrentState"));// 10
        filter.SelectClauses.Add(SelectField(ObjectTypeIds.ConditionType, "ConditionName")); // 11
        filter.SelectClauses.Add(SelectField(ObjectTypeIds.ConditionType, "Comment")); // 12

        // APPENDED fields (indices 13+): optional — only present on specific derived types.
        // Guard all reads with fields.Count > N so base-ConditionType events still process.

        // 13: AlarmConditionType/ActiveState/TransitionTime — the UTC instant the active-state
        // last flipped to TRUE. Mapped to OriginalRaiseTime; absent on non-AlarmCondition
        // events (ConditionType base events rarely carry it). CAVEAT: during a
        // ConditionRefresh replay the server MAY re-stamp this to the current/restart time
        // rather than the historical raise instant (OPC UA Part 9 §5.5.2 makes it advisory),
        // so a snapshot-derived OriginalRaiseTime can look like the refresh time — it is
        // display-only and not treated as authoritative.
        filter.SelectClauses.Add(SelectField(ObjectTypeIds.AlarmConditionType, "ActiveState", "TransitionTime")); // 13

        // 14–17: LimitAlarmType limit thresholds — configuration-time set-points exposed as
        // event fields by LimitAlarmType and all its subtypes (Exclusive/NonExclusive
        // Level/Deviation/RateOfChange). Absent on non-limit alarm types (e.g. discrete,
        // off-normal) — guarded by fields.Count > N below.
        filter.SelectClauses.Add(SelectField(ObjectTypeIds.LimitAlarmType, "HighHighLimit")); // 14
        filter.SelectClauses.Add(SelectField(ObjectTypeIds.LimitAlarmType, "HighLimit")); // 15
        filter.SelectClauses.Add(SelectField(ObjectTypeIds.LimitAlarmType, "LowLimit")); // 16
        filter.SelectClauses.Add(SelectField(ObjectTypeIds.LimitAlarmType, "LowLowLimit")); // 17

        // UNAVAILABLE via standard OPC UA A&C event fields (documented here so future
        // maintainers know these were considered, not overlooked):
        // Category — not a standard event field; server-specific extensions only.
        // Description — NativeAlarmTransition.Description is a static template description;
        // OPC UA events carry dynamic Message text (index 4, mapped) but no
        // static template description in the notification, so this stays empty.
        // OperatorUser — not available on the standard ConditionRefresh replay stream;
        // present on Acknowledge/Confirm method call results, but those do
        // not flow through the monitored-item subscription.
        // CurrentValue — the live process variable value is NOT a standard A&C event field;
        // it would require a separate data subscription on the source node.

        ApplyServerSideTypeWhereClause(filter, conditionFilter);
        return filter;
    }

    /// <summary>
    /// Attaches an OfType(-OR'd) WhereClause to <paramref name="filter"/> when every
    /// requested condition type maps to a standard A&C type NodeId; otherwise leaves
    /// the WhereClause empty (see <see cref="BuildAlarmEventFilter"/> rationale).
    /// </summary>
    private static void ApplyServerSideTypeWhereClause(EventFilter filter, AlarmConditionFilter conditionFilter)
    {
        if (conditionFilter.IsEmpty)
            return;

        var typeIds = new List<NodeId>();
        foreach (var name in conditionFilter.Names)
        {
            if (!KnownConditionTypeIds.TryGetValue(name, out var id))
                return; // unmapped type → omit the WhereClause entirely (client gate covers it)
            typeIds.Add(id);
        }

        if (typeIds.Count == 0)
            return;

        var where = filter.WhereClause;
        if (typeIds.Count == 1)
        {
            where.Push(FilterOperator.OfType, typeIds[0]);
            return;
        }

        // OR together each OfType element so an event of ANY listed type passes.
        var element = where.Push(FilterOperator.OfType, typeIds[0]);
        for (var i = 1; i < typeIds.Count; i++)
        {
            var next = where.Push(FilterOperator.OfType, typeIds[i]);
            element = where.Push(FilterOperator.Or, element, next);
        }
    }

    private static SimpleAttributeOperand SelectField(NodeId typeDefinitionId, params string[] browse)
    {
        var path = new QualifiedNameCollection();
@@ -359,7 +517,12 @@ public class RealOpcUaClient : IOpcUaClient
|
||||
return;
|
||||
}
|
||||
|
||||
var sourceName = fields[1].Value is NodeId ? (fields[2].Value as string ?? "") : (fields[2].Value as string ?? "");
|
||||
// Field layout (AlarmStateFields): [1]=SourceNode (NodeId), [2]=SourceName (string).
|
||||
// Prefer the human-readable SourceName; fall back to the SourceNode NodeId string
|
||||
// only when SourceName is absent/empty, so the condition still has a stable key.
|
||||
var sourceName = fields[2].Value as string;
|
||||
if (string.IsNullOrEmpty(sourceName))
|
||||
sourceName = (fields[1].Value as NodeId)?.ToString() ?? "";
|
||||
var conditionName = fields.Count > 11 ? fields[11].Value as string : null;
|
||||
var sourceObjectRef = sourceName;
|
||||
var sourceRef = string.IsNullOrEmpty(conditionName) ? sourceName : $"{sourceName}.{conditionName}";
|
||||
@@ -377,6 +540,25 @@ public class RealOpcUaClient : IOpcUaClient
|
||||
var shelve = OpcUaAlarmMapper.MapShelve(fields.Count > 10 ? (fields[10].Value as LocalizedText)?.Text : null);
|
||||
var comment = fields.Count > 12 ? (fields[12].Value as LocalizedText)?.Text ?? "" : "";
|
||||
|
||||
// Index 13: ActiveState/TransitionTime → OriginalRaiseTime (when active-state last
|
||||
// transitioned to TRUE). Absent on non-AlarmCondition events → guard + null fallback.
|
||||
DateTimeOffset? originalRaiseTime = null;
|
||||
if (fields.Count > 13 && fields[13].Value is DateTime activeTransitionTime)
|
||||
// OPC UA mandates UTC for DateTime fields; a TimeSpan.Zero offset treats an
|
||||
// Unspecified Kind as UTC (consistent with the Time→TransitionTime mapping above).
|
||||
originalRaiseTime = new DateTimeOffset(activeTransitionTime, TimeSpan.Zero);
|
||||
|
||||
// Indices 14–17: LimitAlarmType set-point thresholds (HighHighLimit/HighLimit/
|
||||
// LowLimit/LowLowLimit). Absent on non-limit alarm types → null when missing.
|
||||
// Pick the first non-null value in priority order (HiHi > Hi > Lo > LoLo) as a
|
||||
// display-only representative limit; the caller is responsible for interpreting
|
||||
// which limit is active using AlarmTypeName or ConditionName.
|
||||
var limitValue = OpcUaAlarmMapper.PickLimitValue(
|
||||
fields.Count > 14 ? fields[14].Value : null,
|
||||
fields.Count > 15 ? fields[15].Value : null,
|
||||
fields.Count > 16 ? fields[16].Value : null,
|
||||
fields.Count > 17 ? fields[17].Value : null);
|
||||
|
||||
var inRefresh = _alarmInRefresh.GetValueOrDefault(handle);
|
||||
var lastState = _alarmLastState.GetValueOrDefault(handle);
|
||||
var (prevActive, prevAcked) = lastState != null && lastState.TryGetValue(sourceRef, out var prev) ? prev : (false, true);
|
||||
@@ -389,18 +571,23 @@ public class RealOpcUaClient : IOpcUaClient
|
||||
onTransition(new NativeAlarmTransition(
|
||||
SourceReference: sourceRef,
|
||||
SourceObjectReference: sourceObjectRef,
|
||||
AlarmTypeName: eventType?.ToString() ?? "",
|
||||
// Resolve the event-type NodeId (e.g. "i=9341") to the friendly type name
|
||||
// the conditionFilter gate keys off (M2.4 / #8); NodeId-string for custom types.
|
||||
AlarmTypeName: ResolveAlarmTypeName(eventType),
|
||||
Kind: kind,
|
||||
Condition: OpcUaAlarmMapper.BuildCondition(active, acked, confirmed, shelve, suppressed, severity),
|
||||
// UNAVAILABLE via standard OPC UA A&C event fields — see BuildAlarmEventFilter comments.
|
||||
Category: "",
|
||||
Description: "",
|
||||
Message: message,
|
||||
// UNAVAILABLE: OperatorUser not on refresh stream — see BuildAlarmEventFilter comments.
|
||||
OperatorUser: "",
|
||||
OperatorComment: comment,
|
||||
OriginalRaiseTime: null,
|
||||
OriginalRaiseTime: originalRaiseTime,
|
||||
TransitionTime: time,
|
||||
// UNAVAILABLE: CurrentValue not a standard A&C event field — see BuildAlarmEventFilter.
|
||||
CurrentValue: "",
|
||||
LimitValue: ""));
|
||||
LimitValue: limitValue));
|
||||
}
|
||||
|
||||
private static NativeAlarmTransition SnapshotComplete() => new(
|
||||
|
||||
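The `OriginalRaiseTime` mapping in the hunk above builds a `DateTimeOffset` from the event's `DateTime` with a zero offset. The following is a minimal sketch of that conversion, not the commit's code: `RaiseTimeSketch` and `ToUtcOffset` are hypothetical names, and the `Local` branch is an assumption added for illustration. It is there because the `DateTimeOffset(DateTime, TimeSpan)` constructor throws `ArgumentException` when a `Local`-kind value is paired with an offset that differs from the machine's local offset.

```csharp
using System;

// Hypothetical helper (not part of the commit): converts an event-field value
// to a UTC DateTimeOffset, mirroring the "guard + null fallback" in the hunk.
public static class RaiseTimeSketch
{
    public static DateTimeOffset? ToUtcOffset(object? fieldValue)
    {
        // Field absent or not a DateTime (e.g. a non-AlarmCondition event): null.
        if (fieldValue is not DateTime value)
            return null;

        // Assumption for illustration: normalise a Local-kind value first, because
        // pairing it with TimeSpan.Zero would throw on a machine not running in UTC.
        if (value.Kind == DateTimeKind.Local)
            value = value.ToUniversalTime();

        // Utc and Unspecified kinds are both accepted with a zero offset,
        // so an Unspecified value is effectively treated as UTC.
        return new DateTimeOffset(value, TimeSpan.Zero);
    }
}
```

If the OPC UA stack always hands back `Utc`-kind values, the extra branch never runs and the behaviour matches the plain constructor call in the diff.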
@@ -0,0 +1,78 @@
|
||||
using ZB.MOM.WW.ScadaBridge.Commons.Types.Alarms;
|
||||
using ZB.MOM.WW.ScadaBridge.Commons.Types.Enums;
|
||||
|
||||
namespace ZB.MOM.WW.ScadaBridge.DataConnectionLayer;
|
||||
|
||||
/// <summary>
|
||||
/// Parsed native-alarm condition filter (M2.4 / #8).
|
||||
///
|
||||
/// <para>
|
||||
/// A source's <c>conditionFilter</c> is a comma-separated, case-insensitive list
|
||||
/// of alarm/condition <em>type names</em>, matched against
|
||||
/// <see cref="NativeAlarmTransition.AlarmTypeName"/>. A <c>null</c>, blank, or
|
||||
/// all-empty list means "mirror every condition" (the historical behaviour),
|
||||
/// represented here by <see cref="IsEmpty"/>.
|
||||
/// </para>
|
||||
///
|
||||
/// <para>
|
||||
/// This is the authoritative <em>client-side</em> gate consulted in the
|
||||
/// <c>DataConnectionActor</c> routing path, so it applies uniformly across OPC UA
|
||||
/// (whose server-side <c>WhereClause</c> is only a bandwidth optimisation) and the
|
||||
/// MxGateway (whose single gateway-wide feed has no server-side filter at all).
|
||||
/// Parse once at subscribe time; <see cref="IsAllowed"/> is the hot-path check.
|
||||
/// </para>
|
||||
/// </summary>
|
||||
public sealed class AlarmConditionFilter
|
||||
{
|
||||
/// <summary>The shared allow-all instance (empty filter set).</summary>
|
||||
public static readonly AlarmConditionFilter AllowAll = new(new HashSet<string>(StringComparer.OrdinalIgnoreCase));
|
||||
|
||||
private readonly HashSet<string> _names;
|
||||
|
||||
private AlarmConditionFilter(HashSet<string> names) => _names = names;
|
||||
|
||||
/// <summary><c>true</c> when no type names are configured — every condition is allowed.</summary>
|
||||
public bool IsEmpty => _names.Count == 0;
|
||||
|
||||
/// <summary>The normalized (trimmed) type names, for the OPC UA server-side WhereClause optimisation.</summary>
|
||||
public IReadOnlyCollection<string> Names => _names;
|
||||
|
||||
/// <summary>
|
||||
/// Parses a raw <c>conditionFilter</c> string into a normalized, case-insensitive
|
||||
/// type-name set. <c>null</c>/blank/all-empty input yields an empty (allow-all) filter.
|
||||
/// </summary>
|
||||
/// <param name="conditionFilter">The raw comma-separated filter string, or <c>null</c>.</param>
|
||||
/// <returns>A parsed <see cref="AlarmConditionFilter"/>; never <c>null</c>.</returns>
|
||||
public static AlarmConditionFilter Parse(string? conditionFilter)
|
||||
{
|
||||
if (string.IsNullOrWhiteSpace(conditionFilter))
|
||||
return AllowAll;
|
||||
|
||||
var names = new HashSet<string>(StringComparer.OrdinalIgnoreCase);
|
||||
foreach (var raw in conditionFilter.Split(',', StringSplitOptions.RemoveEmptyEntries | StringSplitOptions.TrimEntries))
|
||||
names.Add(raw);
|
||||
|
||||
return names.Count == 0 ? AllowAll : new AlarmConditionFilter(names);
|
||||
}
|
||||
|
||||
/// <summary>
|
||||
/// Returns <c>true</c> when <paramref name="transition"/> should be delivered:
|
||||
/// the filter is empty (allow all), the transition is a framing sentinel
|
||||
/// (<see cref="AlarmTransitionKind.SnapshotComplete"/>, which carries no condition
|
||||
/// type and must never be swallowed or the snapshot swap never completes), or its
|
||||
/// <see cref="NativeAlarmTransition.AlarmTypeName"/> is in the configured set.
|
||||
/// </summary>
|
||||
/// <param name="transition">The protocol-neutral transition to test.</param>
|
||||
/// <returns><c>true</c> to deliver the transition; <c>false</c> to drop it.</returns>
|
||||
public bool IsAllowed(NativeAlarmTransition transition)
|
||||
{
|
||||
if (_names.Count == 0)
|
||||
return true;
|
||||
|
||||
// SnapshotComplete is pure framing (no condition payload) — never filter it.
|
||||
if (transition.Kind == AlarmTransitionKind.SnapshotComplete)
|
||||
return true;
|
||||
|
||||
return _names.Contains(transition.AlarmTypeName);
|
||||
}
|
||||
}
|
||||
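The parsing and gating rules of `AlarmConditionFilter` above can be condensed into a small standalone sketch. This is an illustration under stated assumptions rather than the class itself: `ConditionFilterSketch`, `ParseNames`, and the `isSnapshotComplete` flag are hypothetical stand-ins for `Parse`, `IsAllowed`, and the `AlarmTransitionKind.SnapshotComplete` check.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for AlarmConditionFilter, reduced to its two rules.
public static class ConditionFilterSketch
{
    // Comma-separated, trimmed, case-insensitive; blank entries are dropped.
    public static HashSet<string> ParseNames(string? conditionFilter)
    {
        var names = new HashSet<string>(StringComparer.OrdinalIgnoreCase);
        if (string.IsNullOrWhiteSpace(conditionFilter))
            return names; // empty set means "allow all"

        var options = StringSplitOptions.RemoveEmptyEntries | StringSplitOptions.TrimEntries;
        foreach (var raw in conditionFilter.Split(',', options))
            names.Add(raw);
        return names;
    }

    // Deliver when the filter is empty, the transition is the framing sentinel,
    // or the alarm type name is in the configured set.
    public static bool IsAllowed(HashSet<string> names, string alarmTypeName, bool isSnapshotComplete) =>
        names.Count == 0 || isSnapshotComplete || names.Contains(alarmTypeName);
}
```

For example, `ParseNames(" LimitAlarmType, ,offNormalAlarmType ,")` yields two names, and a transition typed `limitalarmtype` passes the gate because the set compares names case-insensitively.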
@@ -19,6 +19,13 @@
|
||||
<PackageReference Include="ZB.MOM.WW.MxGateway.Client" />
|
||||
</ItemGroup>
|
||||
|
||||
<ItemGroup>
|
||||
<!-- Exposes internal alarm-filter shaping (RealOpcUaClient.BuildAlarmEventFilter)
|
||||
to the test assembly so the server-side WhereClause can be unit-tested
|
||||
without a live OPC UA server (M2.4 / #8). -->
|
||||
<InternalsVisibleTo Include="ZB.MOM.WW.ScadaBridge.DataConnectionLayer.Tests" />
|
||||
</ItemGroup>
|
||||
|
||||
<ItemGroup>
|
||||
<ProjectReference Include="../ZB.MOM.WW.ScadaBridge.Commons/ZB.MOM.WW.ScadaBridge.Commons.csproj" />
|
||||
<ProjectReference Include="../ZB.MOM.WW.ScadaBridge.HealthMonitoring/ZB.MOM.WW.ScadaBridge.HealthMonitoring.csproj" />
|
||||
|
||||
@@ -1,4 +1,5 @@
|
||||
using ZB.MOM.WW.ScadaBridge.Commons.Entities.Sites;
|
||||
using ZB.MOM.WW.ScadaBridge.Commons.Interfaces.Protocol;
|
||||
using ZB.MOM.WW.ScadaBridge.Commons.Interfaces.Repositories;
|
||||
using ZB.MOM.WW.ScadaBridge.Commons.Types;
|
||||
using ZB.MOM.WW.ScadaBridge.Commons.Types.Flattening;
|
||||
@@ -111,8 +112,41 @@ public class FlatteningPipeline : IFlatteningPipeline
|
||||
ReturnDefinition = s.ReturnDefinition
|
||||
}).ToList();
|
||||
|
||||
// Validate
|
||||
var validation = _validationService.Validate(config, resolvedSharedScripts);
|
||||
// Compute the alarm-capable connection-name set so the semantic validator
|
||||
// can gate native-alarm-source bindings. "Alarm-capable" matches the DCL
|
||||
// runtime decision (DataConnectionActor: _adapter is IAlarmSubscribableConnection);
|
||||
// here we filter connections by alarm-capable protocol, then collect their names.
|
||||
//
|
||||
// StringComparer.Ordinal is intentional: connection names are stored and
|
||||
// matched as authored throughout the pipeline (all other name-keyed
|
||||
// dictionaries in FlatteningService and SemanticValidator use the same
|
||||
// case-sensitive semantics). OrdinalIgnoreCase would be inconsistent with
|
||||
// the rest of the binding-resolution path.
|
||||
var alarmCapableConnectionNames = dataConnections.Values
|
||||
.Where(c => AlarmCapableProtocols.IsAlarmCapable(c.Protocol))
|
||||
.Select(c => c.Name)
|
||||
.ToHashSet(StringComparer.Ordinal);
|
||||
|
||||
// M2.8 (#23): the set of data-connection names that actually exist on the
|
||||
// target site, used to verify each bound connection resolves to a real site
|
||||
// connection. Same StringComparer.Ordinal as the rest of the binding-resolution
|
||||
// path (connection names are matched as-authored throughout the pipeline).
|
||||
var siteConnectionNames = dataConnections.Values
|
||||
.Select(c => c.Name)
|
||||
.ToHashSet(StringComparer.Ordinal);
|
||||
|
||||
// Validate. This is the deploy-gating path, so connection-binding completeness
|
||||
// is enforced as an Error (enforceConnectionBindings: true): a data-sourced
|
||||
// attribute with no binding — or one bound to a connection that no longer exists
|
||||
// on the site — blocks the deployment. (The template DESIGN-TIME validate path in
|
||||
// ManagementActor leaves this non-blocking by NOT enforcing, since bindings are
|
||||
// set later at instance/deploy time.)
|
||||
var validation = _validationService.Validate(
|
||||
config,
|
||||
resolvedSharedScripts,
|
||||
alarmCapableConnectionNames,
|
||||
enforceConnectionBindings: true,
|
||||
siteConnectionNames: siteConnectionNames);
|
||||
|
||||
// Compute revision hash
|
||||
var hash = _revisionHashService.ComputeHash(config);
|
||||
|
||||
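The comments in the `FlatteningPipeline` hunk above explain why the connection-name sets use `StringComparer.Ordinal`. A minimal sketch of that set construction follows. `ConnectionSketch`, `ConnectionNameSetSketch`, and the protocol strings `"OpcUa"` and `"MxGateway"` are assumptions made for illustration; the real `AlarmCapableProtocols.IsAllowed`-style check and the protocol type are not shown in this diff.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a site data connection.
public sealed record ConnectionSketch(string Name, string Protocol);

public static class ConnectionNameSetSketch
{
    // Assumption: these two strings stand in for the alarm-capable protocols.
    private static readonly HashSet<string> AlarmCapable =
        new(StringComparer.Ordinal) { "OpcUa", "MxGateway" };

    // Names are collected case-sensitively, matching how bindings are resolved.
    public static HashSet<string> AlarmCapableNames(IEnumerable<ConnectionSketch> connections) =>
        connections
            .Where(c => AlarmCapable.Contains(c.Protocol))
            .Select(c => c.Name)
            .ToHashSet(StringComparer.Ordinal);
}
```

With an ordinal set, a binding authored as `line1` does not resolve against a connection named `Line1`, which is consistent with the case-sensitive lookups the comment describes elsewhere in the pipeline.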
@@ -37,6 +37,14 @@ public static class StateTransitionValidator
|
||||
/// <summary>Returns true when a delete operation is allowed from the given state.</summary>
|
||||
/// <param name="currentState">The current instance state.</param>
|
||||
/// <returns><see langword="true"/> if delete is permitted; otherwise <see langword="false"/>.</returns>
|
||||
/// <remarks>
|
||||
/// Delete is allowed from <see cref="InstanceState.NotDeployed"/> by design: an
|
||||
/// undeployed instance would otherwise linger as an unremovable orphan record.
|
||||
/// Delete from <c>NotDeployed</c> is a central-side record cleanup (no live site
|
||||
/// config to tear down). This matches the state-transition matrix in
|
||||
/// Component-DeploymentManager.md ("Delete from Not deployed = Yes") — reconciled
|
||||
/// in M2.17 (#31); the deliberate behaviour was introduced in commit 1d5465f3.
|
||||
/// </remarks>
|
||||
public static bool CanDelete(InstanceState currentState) =>
|
||||
currentState is InstanceState.NotDeployed or InstanceState.Enabled or InstanceState.Disabled;
|
||||
|
||||
|
||||
@@ -75,7 +75,7 @@ public class DatabaseGateway : IDatabaseGateway
|
||||
new SqlConnection(connectionString);
|
||||
|
||||
/// <inheritdoc />
|
||||
public async Task CachedWriteAsync(
|
||||
public async Task<ExternalCallResult> CachedWriteAsync(
|
||||
string connectionName,
|
||||
string sql,
|
||||
IReadOnlyDictionary<string, object?>? parameters = null,
|
||||
@@ -97,6 +97,44 @@ public class DatabaseGateway : IDatabaseGateway
|
||||
throw new InvalidOperationException("Store-and-forward service not available for cached writes");
|
||||
}
|
||||
|
||||
// M2.3 (#7): attempt the write IMMEDIATELY and classify the outcome,
|
||||
// mirroring ExternalSystemClient.CachedCallAsync. The pre-M2.3 behaviour
|
||||
// enqueued every write unconditionally and the S&F retry sweep then
|
||||
// retried ALL failures forever — a permanent SQL error (constraint,
|
||||
// syntax, permission) was never returned to the script and spun in the
|
||||
// buffer indefinitely. Now:
|
||||
// * success -> Delivered, NOT buffered;
|
||||
// * PermanentDatabaseException -> Failed synchronously, NOT buffered;
|
||||
// * TransientDatabaseException -> buffered to S&F for retry.
|
||||
try
|
||||
{
|
||||
await ExecuteWriteAsync(
|
||||
connectionName, definition.ConnectionString, sql, parameters ?? EmptyParameters, cancellationToken)
|
||||
.ConfigureAwait(false);
|
||||
|
||||
// Immediate success — the write is done; do not buffer.
|
||||
return new ExternalCallResult(Success: true, ResponseJson: null, ErrorMessage: null, WasBuffered: false);
|
||||
}
|
||||
catch (PermanentDatabaseException ex)
|
||||
{
|
||||
// Permanent failures are returned to the script and never buffered —
|
||||
// mirrors the PermanentExternalSystemException branch on the API path.
|
||||
_logger.LogWarning(
|
||||
ex,
|
||||
"CachedWrite to '{Connection}' failed permanently (SQL error {Number}); returning Failed without buffering.",
|
||||
connectionName, ex.SqlErrorNumber);
|
||||
return new ExternalCallResult(
|
||||
Success: false, ResponseJson: null, ErrorMessage: $"Permanent database error: {ex.Message}", WasBuffered: false);
|
||||
}
|
||||
catch (TransientDatabaseException ex)
|
||||
{
|
||||
// Transient failure — hand to S&F so the retry sweep delivers it.
|
||||
_logger.LogDebug(
|
||||
ex,
|
||||
"CachedWrite to '{Connection}' failed transiently (SQL error {Number}); buffering for retry.",
|
||||
connectionName, ex.SqlErrorNumber);
|
||||
}
|
||||
|
||||
var payload = JsonSerializer.Serialize(new
|
||||
{
|
||||
ConnectionName = connectionName,
|
||||
@@ -119,6 +157,12 @@ public class DatabaseGateway : IDatabaseGateway
|
||||
originInstanceName,
|
||||
definition.MaxRetries > 0 ? definition.MaxRetries : null,
|
||||
definition.RetryDelay > TimeSpan.Zero ? definition.RetryDelay : null,
|
||||
// M2.3 (#7): attemptImmediateDelivery: false — this method already
|
||||
// made the write attempt above (the transient-classified failure is
|
||||
// exactly why we are buffering). Letting EnqueueAsync re-invoke the
|
||||
// delivery handler would execute the same write a second time —
|
||||
// mirrors ExternalSystemClient.CachedCallAsync.
|
||||
attemptImmediateDelivery: false,
|
||||
// Audit Log #23 (M3): pin the S&F message id to the
|
||||
// TrackedOperationId so the retry loop (Bundle E Tasks E4/E5) can
|
||||
// read it back via StoreAndForwardMessage.Id and emit per-attempt +
|
||||
@@ -136,17 +180,29 @@ public class DatabaseGateway : IDatabaseGateway
|
||||
// retry-loop cached-write audit rows correlate back to the
|
||||
// cross-execution chain. Null for a non-routed run.
|
||||
parentExecutionId: parentExecutionId);
|
||||
|
||||
// Buffered for retry — mirrors the API path's WasBuffered=true result.
|
||||
return new ExternalCallResult(Success: true, ResponseJson: null, ErrorMessage: null, WasBuffered: true);
|
||||
}
|
||||
|
||||
/// <summary>
|
||||
/// WP-9/10: Delivers a buffered CachedDbWrite during a store-and-forward retry
|
||||
/// sweep — executes the SQL against the named connection. Returns true on
|
||||
/// success, false if the connection no longer exists (the message is parked);
|
||||
/// throws on any execution error so the engine retries.
|
||||
/// sweep — executes the SQL against the named connection.
|
||||
/// </summary>
|
||||
/// <remarks>
|
||||
/// M2.3 (#7): the outcome is classified, mirroring
|
||||
/// <see cref="ExternalSystemClient.DeliverBufferedAsync"/>. Returns
|
||||
/// <c>false</c> (so the S&amp;F engine PARKS the message) when the
|
||||
/// connection no longer exists, the payload is unreadable, or the SQL fails
|
||||
/// with a PERMANENT error (constraint / syntax / permission). A TRANSIENT SQL
|
||||
/// error (<see cref="TransientDatabaseException"/>) propagates so the engine
|
||||
/// retries. The pre-M2.3 code rethrew on ANY SQL error, so a permanent
|
||||
/// failure on the retry path looped forever.
|
||||
/// </remarks>
|
||||
/// <param name="message">The buffered store-and-forward message to deliver.</param>
|
||||
/// <param name="cancellationToken">Cancellation token for the delivery operation.</param>
|
||||
/// <returns>A task that resolves to <c>true</c> on success, or <c>false</c> if the connection no longer exists.</returns>
|
||||
/// <returns>A task that resolves to <c>true</c> on success, or <c>false</c> when the message must be parked.</returns>
|
||||
/// <exception cref="TransientDatabaseException">Thrown on a transient SQL failure so the engine retries.</exception>
|
||||
public async Task<bool> DeliverBufferedAsync(
|
||||
StoreAndForwardMessage message, CancellationToken cancellationToken = default)
|
||||
{
|
||||
@@ -185,22 +241,152 @@ public class DatabaseGateway : IDatabaseGateway
|
||||
return false;
|
||||
}
|
||||
|
||||
await using var connection = new SqlConnection(definition.ConnectionString);
|
||||
await connection.OpenAsync(cancellationToken);
|
||||
using var command = connection.CreateCommand();
|
||||
command.CommandText = payload.Sql;
|
||||
if (payload.Parameters != null)
|
||||
// Materialise the buffered JsonElement parameters into CLR values once,
|
||||
// then run through the shared ExecuteWriteAsync seam so both the
|
||||
// immediate-attempt path and this retry path classify SqlException the
|
||||
// same way.
|
||||
IReadOnlyDictionary<string, object?> materialisedParameters =
|
||||
payload.Parameters == null
|
||||
? EmptyParameters
|
||||
: payload.Parameters.ToDictionary(
|
||||
kv => kv.Key, kv => (object?)JsonElementToParameterValue(kv.Value));
|
||||
|
||||
try
|
||||
{
|
||||
foreach (var (key, value) in payload.Parameters)
|
||||
{
|
||||
var parameter = command.CreateParameter();
|
||||
parameter.ParameterName = key.StartsWith('@') ? key : "@" + key;
|
||||
parameter.Value = JsonElementToParameterValue(value);
|
||||
command.Parameters.Add(parameter);
|
||||
}
|
||||
await ExecuteWriteAsync(
|
||||
payload.ConnectionName, definition.ConnectionString, payload.Sql, materialisedParameters, cancellationToken)
|
||||
.ConfigureAwait(false);
|
||||
return true;
|
||||
}
|
||||
await command.ExecuteNonQueryAsync(cancellationToken);
|
||||
return true;
|
||||
catch (PermanentDatabaseException ex)
|
||||
{
|
||||
// Permanent — parking is correct; retrying the identical statement
|
||||
// cannot succeed. Mirrors ExternalSystemClient.DeliverBufferedAsync
|
||||
// returning false on PermanentExternalSystemException.
|
||||
_logger.LogError(
|
||||
ex,
|
||||
"Buffered DB write to '{Connection}' failed permanently (SQL error {Number}); parking.",
|
||||
payload.ConnectionName, ex.SqlErrorNumber);
|
||||
return false;
|
||||
}
|
||||
// TransientDatabaseException propagates — the S&F engine retries.
|
||||
}
|
||||
|
||||
/// <summary>
|
||||
/// Reusable empty parameter map so the no-parameter paths do not allocate a
|
||||
/// fresh dictionary each call.
|
||||
/// </summary>
|
||||
private static readonly IReadOnlyDictionary<string, object?> EmptyParameters =
|
||||
new Dictionary<string, object?>();
|
||||
|
||||
/// <summary>
|
||||
/// M2.3 (#7): executes a parameterised SQL write against the given connection
|
||||
/// string and classifies the outcome into
|
||||
/// <see cref="TransientDatabaseException"/> / <see cref="PermanentDatabaseException"/>,
|
||||
/// mirroring the ordered catches of
|
||||
/// <see cref="ExternalSystemClient.InvokeHttpAsync"/> on the API path:
|
||||
/// caller-requested cancellation propagates unchanged; a <see cref="SqlException"/>
|
||||
/// is classified by error number via <see cref="SqlErrorClassifier"/>; a
|
||||
/// non-<see cref="SqlException"/> transport/connection outage is classified
|
||||
/// transient via <see cref="SqlErrorClassifier.IsTransient(System.Exception)"/>;
|
||||
/// genuinely-unexpected exceptions propagate. This is the single classification
|
||||
/// seam shared by the immediate <see cref="CachedWriteAsync"/> attempt and the
|
||||
/// <see cref="DeliverBufferedAsync"/> retry path. Marked <c>internal virtual</c>
|
||||
/// so tests can substitute already-classified outcomes; the raw I/O lives in
|
||||
/// the inner <see cref="RunSqlAsync"/> seam so tests can also drive raw outage
|
||||
/// exceptions through this classification (without fabricating a
|
||||
/// <see cref="SqlException"/>, which has no public constructor).
|
||||
/// </summary>
|
||||
/// <param name="connectionName">The human-readable connection name, used only for the classified error message (never the connection string — that would leak credentials into logs / script-visible errors).</param>
|
||||
/// <param name="connectionString">The ADO.NET connection string to write through.</param>
|
||||
/// <param name="sql">The SQL statement to execute.</param>
|
||||
/// <param name="parameters">Materialised CLR parameter values (may be empty).</param>
|
||||
/// <param name="cancellationToken">Cancellation token for the write.</param>
|
||||
/// <returns>A task that completes when the write succeeds.</returns>
|
||||
/// <exception cref="OperationCanceledException">Rethrown unchanged when the caller's <paramref name="cancellationToken"/> requested cancellation.</exception>
|
||||
/// <exception cref="TransientDatabaseException">Thrown for a transient SQL error number or a non-Sql transport/connection outage.</exception>
|
||||
/// <exception cref="PermanentDatabaseException">Thrown for a permanent (or unknown) SQL error number.</exception>
|
||||
internal virtual async Task ExecuteWriteAsync(
|
||||
string connectionName,
|
||||
string connectionString,
|
||||
string sql,
|
||||
IReadOnlyDictionary<string, object?> parameters,
|
||||
CancellationToken cancellationToken)
|
||||
{
|
||||
// M2.3 (#7) code-review fix: the catch ordering MIRRORS
|
||||
// ExternalSystemClient.InvokeHttpAsync exactly so the SQL path classifies
|
||||
// a live outage the same way the HTTP path does:
|
||||
// 1. caller-requested cancellation propagates UNCHANGED (never a "DB error");
|
||||
// 2. a SqlException is classified by error number (transient/permanent);
|
||||
// 3. a NON-SqlException transport/connection failure (InvalidOperationException
|
||||
// "connection not open", IOException, SocketException, TimeoutException,
|
||||
// a non-Sql DbException, …) is TRANSIENT — buffered + retried, because a
|
||||
// retry can succeed once the server is reachable. The pre-fix code only
|
||||
// caught SqlException, so these escaped unclassified and crashed the
|
||||
// Script Execution Actor instead of buffering;
|
||||
// 4. genuinely-unexpected exceptions (e.g. an authoring ArgumentException)
|
||||
// propagate — same as the HTTP path lets unexpected exceptions escape.
|
||||
try
|
||||
{
|
||||
await RunSqlAsync(connectionString, sql, parameters, cancellationToken).ConfigureAwait(false);
|
||||
}
|
||||
catch (OperationCanceledException) when (cancellationToken.IsCancellationRequested)
|
||||
{
|
||||
// [1] The caller asked to abandon the work: propagate the cancellation
|
||||
// unchanged; it must never be reclassified as a transient DB error.
|
||||
throw;
|
||||
}
|
||||
catch (SqlException ex)
|
||||
{
|
||||
// Classify by SqlException.Number and rethrow as the strongly-typed
|
||||
// transient / permanent failure the callers branch on. The context
|
||||
// is the connection NAME, never the connection string.
|
||||
throw SqlErrorClassifier.Throw(connectionName, ex);
|
||||
}
|
||||
catch (Exception ex) when (SqlErrorClassifier.IsTransient(ex))
|
||||
{
|
||||
// [3] A live outage that did not surface as a SqlException: treat as
|
||||
// transient so the caller buffers + retries. The message uses the
|
||||
// connection NAME, never the connection string (credential safety).
|
||||
throw new TransientDatabaseException(
|
||||
$"Transient database error on {connectionName}: {ex.Message}",
|
||||
errorNumber: null,
|
||||
ex);
|
||||
}
|
||||
}
|
||||
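The catch ordering described in `ExecuteWriteAsync` above can be sketched without a database. This is an illustration under stated assumptions, not the production code: `OrderedCatchSketch`, `OutageSketchException`, and `LooksLikeOutage` are hypothetical names, the outage types are taken from the comment's own list, and the `SqlException` step (step 2) is omitted because the sketch avoids a dependency on `Microsoft.Data.SqlClient`.

```csharp
using System;
using System.IO;
using System.Net.Sockets;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for TransientDatabaseException.
public sealed class OutageSketchException : Exception
{
    public OutageSketchException(string message, Exception inner) : base(message, inner) { }
}

public static class OrderedCatchSketch
{
    // Assumption: approximates SqlErrorClassifier.IsTransient(Exception) using
    // the exception types named in the ExecuteWriteAsync comment.
    private static bool LooksLikeOutage(Exception ex) =>
        ex is IOException or SocketException or TimeoutException or InvalidOperationException;

    public static async Task RunClassifiedAsync(
        string connectionName, Func<CancellationToken, Task> rawWrite, CancellationToken ct)
    {
        try
        {
            await rawWrite(ct).ConfigureAwait(false);
        }
        catch (OperationCanceledException) when (ct.IsCancellationRequested)
        {
            // Step 1: caller-requested cancellation propagates unchanged.
            throw;
        }
        catch (Exception ex) when (LooksLikeOutage(ex))
        {
            // Step 3: a transport-level outage becomes a transient failure.
            // The message carries the connection name, never a connection string.
            throw new OutageSketchException(
                $"Transient database error on {connectionName}: {ex.Message}", ex);
        }
        // Step 4: anything else (e.g. ArgumentException) escapes unclassified.
    }
}
```

The order matters because exception filters are evaluated top to bottom: placing the cancellation catch first guarantees a cancelled write is never relabelled as a database error, even if a later filter would also match.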
|
||||
/// <summary>
|
||||
/// M2.3 (#7): the raw ADO.NET write — opens the connection, builds the
|
||||
/// command, and executes it. Marked <c>internal virtual</c> so tests can throw
|
||||
/// RAW outage-shaped exceptions (e.g. <see cref="InvalidOperationException"/>,
|
||||
/// <see cref="System.Net.Sockets.SocketException"/>) through the PRODUCTION
|
||||
/// classification in <see cref="ExecuteWriteAsync"/>. This is the SQL parallel
|
||||
/// of <c>client.SendAsync</c> inside <see cref="ExternalSystemClient.InvokeHttpAsync"/>:
|
||||
/// the actual I/O, wrapped by the ordered classification catches in the caller.
|
||||
/// </summary>
|
||||
/// <param name="connectionString">The ADO.NET connection string to write through.</param>
|
||||
/// <param name="sql">The SQL statement to execute.</param>
|
||||
/// <param name="parameters">Materialised CLR parameter values (may be empty).</param>
|
||||
/// <param name="cancellationToken">Cancellation token for the write.</param>
|
||||
/// <returns>A task that completes when the write succeeds.</returns>
|
||||
internal virtual async Task RunSqlAsync(
|
||||
string connectionString,
|
||||
string sql,
|
||||
IReadOnlyDictionary<string, object?> parameters,
|
||||
CancellationToken cancellationToken)
|
||||
{
|
||||
await using var connection = new SqlConnection(connectionString);
|
||||
await connection.OpenAsync(cancellationToken).ConfigureAwait(false);
|
||||
using var command = connection.CreateCommand();
|
||||
command.CommandText = sql;
|
||||
foreach (var (key, value) in parameters)
|
||||
{
|
||||
var parameter = command.CreateParameter();
|
||||
parameter.ParameterName = key.StartsWith('@') ? key : "@" + key;
|
||||
parameter.Value = value ?? DBNull.Value;
|
||||
command.Parameters.Add(parameter);
|
||||
}
|
||||
await command.ExecuteNonQueryAsync(cancellationToken).ConfigureAwait(false);
|
||||
}
|
||||
|
||||
// ExternalSystemGateway-020: a JSON number that does not fit in Int64 must
|
||||
|
||||
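The three outcomes of `CachedWriteAsync` described in the `DatabaseGateway` hunks above (delivered, failed permanently, buffered for retry) can be summarised in a self-contained sketch. All names here are hypothetical stand-ins: `CallResultSketch` approximates `ExternalCallResult`, the two exception types approximate `PermanentDatabaseException` and `TransientDatabaseException`, and a plain list plays the role of the store-and-forward buffer.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical stand-ins for ExternalCallResult and the classified exceptions.
public sealed record CallResultSketch(bool Success, string? ErrorMessage, bool WasBuffered);

public sealed class PermanentWriteSketchException : Exception
{
    public PermanentWriteSketchException(string message) : base(message) { }
}

public sealed class TransientWriteSketchException : Exception
{
    public TransientWriteSketchException(string message) : base(message) { }
}

public static class CachedWriteSketch
{
    public static async Task<CallResultSketch> WriteAsync(
        Func<Task> attempt, List<string> buffer, string payload)
    {
        try
        {
            // Attempt the write immediately.
            await attempt().ConfigureAwait(false);
            return new CallResultSketch(Success: true, ErrorMessage: null, WasBuffered: false);
        }
        catch (PermanentWriteSketchException ex)
        {
            // Permanent failure: reported to the caller, never buffered.
            return new CallResultSketch(false, $"Permanent database error: {ex.Message}", false);
        }
        catch (TransientWriteSketchException)
        {
            // Transient failure: fall through and buffer for a later retry.
        }

        // The attempt was already made above, so the buffer must not re-run it now.
        buffer.Add(payload);
        return new CallResultSketch(Success: true, ErrorMessage: null, WasBuffered: true);
    }
}
```

A caller that needs confirmation of delivery should check `WasBuffered` as well as `Success`: in this scheme a buffered write reports success because it was accepted for retry, not because the row has been written.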
@@ -0,0 +1,217 @@
|
||||
using System.Data.Common;
|
||||
using System.IO;
|
||||
using System.Net.Sockets;
|
||||
using Microsoft.Data.SqlClient;
|
||||
|
||||
namespace ZB.MOM.WW.ScadaBridge.ExternalSystemGateway;
|
||||
|
||||
/// <summary>
|
||||
/// M2.3 (#7): classifies a SQL Server failure as transient (a brief wait /
|
||||
/// retry may succeed — buffer to store-and-forward) or permanent (the identical
|
||||
/// statement cannot succeed — return to the script / park the buffered message).
|
||||
/// </summary>
|
||||
/// <remarks>
|
||||
/// <para>
|
||||
/// This is the database-side parallel of <see cref="ErrorClassifier"/> (the
|
||||
/// HTTP path). The two are kept separate because the inputs differ: HTTP keys
|
||||
/// off status codes / exception types, SQL keys off
|
||||
/// <see cref="SqlException.Number"/>.
|
||||
/// </para>
|
||||
/// <para>
|
||||
/// <b>Transient set.</b> Only connection-loss, timeout, deadlock, and Azure SQL
|
||||
/// throttle/availability error numbers are transient — failures whose cause is
|
||||
/// external to the statement and may clear on its own:
|
||||
/// <list type="bullet">
|
||||
/// <item><c>-2</c> — query / command timeout expired.</item>
|
||||
/// <item><c>-1</c> — a connection-level error (general SqlClient connection failure).</item>
|
||||
/// <item><c>2</c> — SQL Server / network instance not found or not accessible.</item>
|
||||
/// <item><c>53</c> — network path to the server was not found.</item>
|
||||
/// <item><c>64</c> — connection terminated mid-session (transport error).</item>
|
||||
/// <item><c>233</c> — no process on the other end of the named pipe.</item>
|
||||
/// <item><c>1205</c> — the session was chosen as a deadlock victim.</item>
|
||||
/// <item><c>10053</c> — transport-level abort (software caused connection abort).</item>
|
||||
/// <item><c>10054</c> — connection reset by peer.</item>
|
||||
/// <item><c>10060</c> — connection attempt timed out.</item>
|
||||
/// <item><c>40197</c> — Azure SQL service error processing the request; retry.</item>
|
||||
/// <item><c>40501</c> — Azure SQL service is busy.</item>
|
||||
/// <item><c>40613</c> — Azure SQL database is currently unavailable.</item>
|
||||
/// <item><c>49918</c> / <c>49919</c> / <c>49920</c> — Azure SQL throttling (too many requests / operations).</item>
|
||||
/// </list>
|
||||
/// </para>
|
||||
/// <para>
|
||||
/// <b>Everything else is permanent.</b> Constraint violations (547, 2627, 2601),
|
||||
/// syntax errors (102, 156, 207, 208), and permission errors (229, 230, 262) are
|
||||
/// the obvious permanent cases, but the policy is broader: <b>any error number not
|
||||
/// in the transient set — including unknown / undocumented / ambiguous numbers —
|
||||
/// is treated as permanent.</b> Fail-fast is the safer default: silently
|
||||
/// retrying an unrecognised error forever (the pre-M2.3 behaviour) hides
|
||||
/// authoring bugs and can replay duplicate side effects. A genuinely transient
|
||||
/// number we have not enumerated will, at worst, surface to the script as a
|
||||
/// permanent failure — a loud, fixable outcome — rather than spin in an
|
||||
/// unbounded retry loop.
|
||||
/// </para>
|
||||
/// </remarks>
|
||||
public static class SqlErrorClassifier
|
||||
{
|
||||
/// <summary>
|
||||
/// The complete set of SQL Server error numbers treated as transient. See the
|
||||
/// type-level remarks for the per-number rationale. Anything outside this set
|
||||
/// is permanent.
|
||||
/// </summary>
|
||||
private static readonly HashSet<int> TransientErrorNumbers = new()
|
||||
{
|
||||
-2, -1, 2, 53, 64, 233, 1205,
|
||||
10053, 10054, 10060,
|
||||
40197, 40501, 40613,
|
||||
49918, 49919, 49920,
|
||||
};
|
||||
|
||||
/// <summary>
|
||||
/// Determines whether a SQL Server error number represents a transient
|
||||
/// failure. Unknown / undocumented numbers default to permanent
|
||||
/// (<see langword="false"/>) — see the type-level remarks.
|
||||
/// </summary>
|
||||
/// <param name="errorNumber">The SQL Server error number (e.g. <see cref="SqlException.Number"/>).</param>
|
||||
/// <returns><see langword="true"/> if the number is in the transient set; otherwise <see langword="false"/>.</returns>
|
||||
public static bool IsTransient(int errorNumber) => TransientErrorNumbers.Contains(errorNumber);
|
||||
|
||||
/// <summary>
|
||||
/// Determines whether a <see cref="SqlException"/> represents a transient
|
||||
/// failure by classifying its top-level <see cref="SqlException.Number"/>.
|
||||
/// </summary>
|
||||
/// <param name="exception">The SQL exception to classify.</param>
|
||||
/// <returns><see langword="true"/> if the exception's error number is transient; otherwise <see langword="false"/>.</returns>
|
||||
public static bool IsTransient(SqlException exception)
|
||||
{
|
||||
ArgumentNullException.ThrowIfNull(exception);
|
||||
return IsTransient(exception.Number);
|
||||
}
|
||||
|
||||
/// <summary>
|
||||
/// Determines whether an arbitrary <see cref="Exception"/> represents a
|
||||
/// transient database failure — the SQL-path parallel of
|
||||
/// <see cref="ErrorClassifier.IsTransient(System.Exception)"/> on the HTTP path.
|
||||
/// </summary>
|
||||
/// <remarks>
|
||||
/// <para>
|
||||
/// A live DB outage does not always surface as a <see cref="SqlException"/>:
|
||||
/// once the underlying connection / socket is torn down, the driver raises
|
||||
/// transport-level exceptions instead. These are <b>retryable</b> — a retry
|
||||
/// can succeed once the server is reachable again — so they are classified
|
||||
/// transient (buffered to store-and-forward) rather than escaping unclassified
|
||||
/// to crash the calling Script Execution Actor. The transient set:
|
||||
/// </para>
|
||||
/// <list type="bullet">
|
||||
/// <item><see cref="InvalidOperationException"/> — connection-state error (e.g. "the connection is not open" / pooled connection broken).</item>
|
||||
/// <item><see cref="IOException"/> — transport read/write failure mid-session.</item>
|
||||
/// <item><see cref="SocketException"/> — TCP-level failure (connection refused/reset/timed out).</item>
|
||||
/// <item><see cref="TimeoutException"/> — command / connection timeout surfaced as a CLR <see cref="TimeoutException"/>.</item>
|
||||
/// <item><see cref="TaskCanceledException"/> — driver-level cancellation/timeout NOT tied to a caller token (the caller-token case is handled before classification — see the gateway's ordered catches).</item>
|
||||
/// <item>Any <see cref="DbException"/> that is NOT a <see cref="SqlException"/> — a provider/driver transport error (a real <see cref="SqlException"/> is classified by error number via the overloads above, never here).</item>
|
||||
/// </list>
|
||||
/// <para>
|
||||
/// <b>Everything else is NOT transient</b> and must propagate, exactly as the
|
||||
/// HTTP path lets genuinely-unexpected exceptions escape past its
|
||||
/// <c>catch (Exception ex) when (ErrorClassifier.IsTransient(ex))</c> filter.
|
||||
/// Authoring bugs (<see cref="ArgumentException"/>, <see cref="NullReferenceException"/>,
|
||||
/// etc.) are loud, fixable failures — silently buffering and retrying them
|
||||
/// forever would hide the bug.
|
||||
/// </para>
|
||||
/// </remarks>
|
||||
/// <param name="exception">The exception to classify.</param>
|
||||
/// <returns><see langword="true"/> for a transport/connection/timeout/driver exception; otherwise <see langword="false"/>.</returns>
|
||||
public static bool IsTransient(Exception exception)
|
||||
{
|
||||
ArgumentNullException.ThrowIfNull(exception);
|
||||
|
||||
// A real SqlException is classified by its error number (the overloads
|
||||
// above), never by type — fall back to the number-based policy so an
|
||||
// unknown SqlException stays permanent (fail-fast) rather than being
|
||||
// swept up as transient by the DbException catch-all below.
|
||||
if (exception is SqlException sql)
|
||||
{
|
||||
return IsTransient(sql);
|
||||
}
|
||||
|
||||
return exception is InvalidOperationException
|
||||
or IOException
|
||||
or SocketException
|
||||
or TimeoutException
|
||||
or TaskCanceledException
|
||||
or DbException; // any non-SqlException DbException (SqlException handled above)
|
||||
}
|
||||
|
||||
/// <summary>
|
||||
/// Classifies a <see cref="SqlException"/> and rethrows it as the matching
|
||||
/// strongly-typed failure: <see cref="TransientDatabaseException"/> for a
|
||||
/// transient error number, <see cref="PermanentDatabaseException"/> otherwise.
|
||||
/// Mirrors <see cref="ErrorClassifier.AsTransient(string, System.Exception?)"/>
|
||||
/// + the throw of <see cref="PermanentExternalSystemException"/> on the HTTP
|
||||
/// path — the callers then branch on the typed exception rather than on the
|
||||
/// raw <see cref="SqlException"/>.
|
||||
/// </summary>
|
||||
/// <param name="context">A short human-readable description of the failing operation (e.g. the connection name).</param>
|
||||
/// <param name="exception">The SQL exception to classify and wrap.</param>
|
||||
/// <returns>This method never returns normally — it always throws.</returns>
|
||||
/// <exception cref="TransientDatabaseException">Thrown when the error number is transient.</exception>
|
||||
/// <exception cref="PermanentDatabaseException">Thrown when the error number is permanent (the default).</exception>
|
||||
public static Exception Throw(string context, SqlException exception)
|
||||
{
|
||||
ArgumentNullException.ThrowIfNull(exception);
|
||||
|
||||
if (IsTransient(exception))
|
||||
{
|
||||
throw new TransientDatabaseException(
|
||||
$"Transient SQL error {exception.Number} on {context}: {exception.Message}",
|
||||
exception.Number,
|
||||
exception);
|
||||
}
|
||||
|
||||
throw new PermanentDatabaseException(
|
||||
$"Permanent SQL error {exception.Number} on {context}: {exception.Message}",
|
||||
exception.Number,
|
||||
exception);
|
||||
}
|
||||
}
|
||||
|
||||
/// <summary>
|
||||
/// Signals a transient database failure suitable for store-and-forward retry —
|
||||
/// the SQL-path parallel of <see cref="TransientExternalSystemException"/>.
|
||||
/// </summary>
|
||||
public class TransientDatabaseException : Exception
|
||||
{
|
||||
/// <summary>Gets the SQL Server error number that caused the failure, if known.</summary>
|
||||
public int? SqlErrorNumber { get; }
|
||||
|
||||
/// <summary>Initializes a new <see cref="TransientDatabaseException"/>.</summary>
|
||||
/// <param name="message">The error message.</param>
|
||||
/// <param name="errorNumber">The SQL Server error number, if available.</param>
|
||||
/// <param name="innerException">Optional inner exception (typically the original <see cref="SqlException"/>).</param>
|
||||
public TransientDatabaseException(string message, int? errorNumber = null, Exception? innerException = null)
|
||||
: base(message, innerException)
|
||||
{
|
||||
SqlErrorNumber = errorNumber;
|
||||
}
|
||||
}
|
||||
|
||||
/// <summary>
|
||||
/// Signals a permanent database failure that must not be retried — the SQL-path
|
||||
/// parallel of <see cref="PermanentExternalSystemException"/>. On the immediate
|
||||
/// attempt it is returned synchronously to the calling script; on the
|
||||
/// store-and-forward retry path it causes the message to be parked.
|
||||
/// </summary>
|
||||
public class PermanentDatabaseException : Exception
|
||||
{
|
||||
/// <summary>Gets the SQL Server error number that caused the failure, if known.</summary>
|
||||
public int? SqlErrorNumber { get; }
|
||||
|
||||
/// <summary>Initializes a new <see cref="PermanentDatabaseException"/>.</summary>
|
||||
/// <param name="message">The error message.</param>
|
||||
/// <param name="errorNumber">The SQL Server error number, if available.</param>
|
||||
/// <param name="innerException">Optional inner exception (typically the original <see cref="SqlException"/>).</param>
|
||||
public PermanentDatabaseException(string message, int? errorNumber = null, Exception? innerException = null)
|
||||
: base(message, innerException)
|
||||
{
|
||||
SqlErrorNumber = errorNumber;
|
||||
}
|
||||
}
|
||||
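The allow-list policy described in the remarks above can be sketched in isolation. This is a minimal illustration rather than the gateway's actual code: `ClassifierSketch` and `Decide` are hypothetical names, and the returned strings merely stand in for the typed exceptions that the diff throws.

```csharp
using System;
using System.Collections.Generic;

public static class ClassifierSketch
{
    // Same numbers as the transient set in the diff; anything else is permanent.
    private static readonly HashSet<int> Transient = new()
    {
        -2, -1, 2, 53, 64, 233, 1205,
        10053, 10054, 10060,
        40197, 40501, 40613,
        49918, 49919, 49920,
    };

    // Allow-list lookup: unknown or undocumented numbers fall through to false.
    public static bool IsTransient(int errorNumber) => Transient.Contains(errorNumber);

    // Hypothetical caller-side decision: buffer transient failures, surface the rest.
    public static string Decide(int errorNumber) =>
        IsTransient(errorNumber) ? "buffer-for-retry" : "fail-fast";
}
```

With this shape, a deadlock victim (1205) is buffered, while a constraint violation (547) or an unrecognised number takes the fail-fast branch.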
@@ -111,6 +111,23 @@ public interface ISiteHealthCollector
|
||||
/// <param name="count">The number of parked messages.</param>
|
||||
void SetParkedMessageCount(int count);
|
||||
|
||||
/// <summary>
|
||||
/// Site Event Logging (#12) M2.16 (#30) — replace the latest cumulative
|
||||
/// site-event-log write-failure count (SQLite error, disk full,
|
||||
/// bounded-queue overflow drop) used by the next <see cref="CollectReport"/>
|
||||
/// call. Refreshed periodically by the <c>SiteEventLogFailureCountReporter</c>
|
||||
/// hosted service. Point-in-time: the value is NOT reset on
|
||||
/// <see cref="CollectReport"/>; it carries forward until the next poller
|
||||
/// refresh. Default interface implementation is a no-op so existing test
|
||||
/// fakes continue to compile without per-fake updates.
|
||||
/// </summary>
|
||||
/// <param name="count">The cumulative failed-write count from <c>ISiteEventLogger.FailedWriteCount</c>.</param>
|
||||
void SetSiteEventLogWriteFailures(long count)
|
||||
{
|
||||
// Default no-op so test fakes do not need to be updated. The real
|
||||
// SiteHealthCollector overrides this with the Interlocked.Exchange store.
|
||||
}
|
||||
|
||||
/// <summary>
|
||||
/// Sets the hostname of this node.
|
||||
/// </summary>
|
||||
|
||||
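The "default no-op so existing fakes keep compiling" approach relies on C# default interface members (C# 8 or later on a runtime that supports them). A minimal sketch, with every type and member name below being hypothetical rather than taken from the codebase:

```csharp
public interface ICollectorSketch
{
    void SetParkedMessageCount(int count);

    // Default body: implementers written before this member existed still compile.
    void SetWriteFailures(long count) { }
}

// An older test fake that knows nothing about SetWriteFailures.
public sealed class OldFakeCollector : ICollectorSketch
{
    public int Parked { get; private set; }
    public void SetParkedMessageCount(int count) => Parked = count;
}

// The production-style implementation supplies its own body.
public sealed class RealCollectorSketch : ICollectorSketch
{
    public long WriteFailures { get; private set; }
    public void SetParkedMessageCount(int count) { }
    public void SetWriteFailures(long count) => WriteFailures = count;
}
```

One consequence worth noting: on `OldFakeCollector` the default member is reachable only through an `ICollectorSketch` reference, not through the class type itself.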
@@ -1,11 +1,25 @@
|
||||
using Microsoft.Extensions.DependencyInjection;
|
||||
using Microsoft.Extensions.DependencyInjection.Extensions;
|
||||
using Microsoft.Extensions.Hosting;
|
||||
using Microsoft.Extensions.Logging;
|
||||
using Microsoft.Extensions.Options;
|
||||
|
||||
namespace ZB.MOM.WW.ScadaBridge.HealthMonitoring;
|
||||
|
||||
public static class ServiceCollectionExtensions
|
||||
{
|
||||
/// <summary>
|
||||
/// Sentinel marker used by <see cref="AddSiteEventLogHealthMetricsBridge"/> to
|
||||
/// implement an idempotency guard. Because the reporter is registered via a
|
||||
/// factory-lambda overload of <c>AddHostedService</c>, its
|
||||
/// <see cref="Microsoft.Extensions.DependencyInjection.ServiceDescriptor.ImplementationType"/>
|
||||
/// is <see langword="null"/> — checking it would be a silent no-op. Registering
|
||||
/// this marker as a singleton and guarding on its <c>ServiceType</c> gives a
|
||||
/// reliable, allocation-free sentinel that works regardless of how the hosted
|
||||
/// service was wired.
|
||||
/// </summary>
|
||||
private sealed class SiteEventLogHealthMetricsBridgeMarker { }
|
||||
|
||||
/// <summary>
|
||||
/// Register site-side health monitoring services (metric collection + periodic reporting).
|
||||
/// Call this on site nodes only. For central, call AddCentralHealthAggregation() instead.
|
||||
@@ -50,6 +64,77 @@ public static class ServiceCollectionExtensions
|
||||
return services;
|
||||
}
|
||||
|
||||
/// <summary>
|
||||
/// Site Event Logging (#12) M2.16 (#30) — register the
|
||||
/// <see cref="SiteEventLogFailureCountReporter"/> hosted service that
|
||||
/// periodically reads the cumulative event-log write-failure count and
|
||||
/// pushes it into <see cref="ISiteHealthCollector"/> as a point-in-time
|
||||
/// snapshot (<c>SiteEventLogWriteFailures</c> on the site health report).
|
||||
/// </summary>
|
||||
/// <remarks>
|
||||
/// <para>
|
||||
/// Must be called AFTER <see cref="AddSiteHealthMonitoring"/> (or
|
||||
/// <see cref="AddHealthMonitoring"/>) which registers the
|
||||
/// <see cref="ISiteHealthCollector"/> the reporter depends on.
|
||||
/// </para>
|
||||
/// <para>
|
||||
/// <b>Why a Func&lt;long&gt; delegate instead of ISiteEventLogger.</b>
|
||||
/// A direct <c>HealthMonitoring → SiteEventLogging</c> reference is avoided to
|
||||
/// prevent an undesirable low-level coupling: <c>SiteEventLogging</c> is a
|
||||
/// leaf component that should not pull in higher-level infrastructure. The
|
||||
/// <see cref="Func{TResult}"/> delegate seam keeps the reference one-way and
|
||||
/// loose: the caller (Host site wiring) captures
|
||||
/// <c>ISiteEventLogger.FailedWriteCount</c> as a lambda and passes it here.
|
||||
/// Note: <c>HealthMonitoring → StoreAndForward → SiteEventLogging</c> already
|
||||
/// exists as a transitive path, so a direct reference would not introduce a
|
||||
/// cycle — the delegate is purely a coupling-avoidance measure.
|
||||
/// </para>
|
||||
/// <para>
|
||||
/// Idempotent — a <see cref="SiteEventLogHealthMetricsBridgeMarker"/> singleton
|
||||
/// is used as the sentinel. Because the reporter is registered via a factory-lambda
|
||||
/// overload of <c>AddHostedService</c>, its
|
||||
/// <see cref="Microsoft.Extensions.DependencyInjection.ServiceDescriptor.ImplementationType"/>
|
||||
/// is <see langword="null"/>; checking it would be a silent no-op and a second
|
||||
/// call would spin up a second polling timer. Guarding on the marker's
|
||||
/// <c>ServiceType</c> is always reliable regardless of how the hosted service
|
||||
/// was wired (AddHostedService has no TryAdd variant).
|
||||
/// </para>
|
||||
/// </remarks>
|
||||
/// <param name="services">The service collection to register into.</param>
|
||||
/// <param name="failedWriteCountProvider">
|
||||
/// A factory delegate that, given the root <see cref="IServiceProvider"/>,
|
||||
/// returns a <see cref="Func{TResult}"/> that reads the current cumulative
|
||||
/// event-log write-failure count. Typically:
|
||||
/// <c>sp => () => sp.GetRequiredService&lt;ISiteEventLogger&gt;().FailedWriteCount</c>.
|
||||
/// The factory is evaluated once at hosted-service resolution time; the inner
|
||||
/// <see cref="Func{TResult}"/> is called on every poll tick.
|
||||
/// </param>
|
||||
/// <returns>The same <see cref="IServiceCollection"/> for chaining.</returns>
|
||||
public static IServiceCollection AddSiteEventLogHealthMetricsBridge(
|
||||
this IServiceCollection services,
|
||||
Func<IServiceProvider, Func<long>> failedWriteCountProvider)
|
||||
{
|
||||
ArgumentNullException.ThrowIfNull(services);
|
||||
ArgumentNullException.ThrowIfNull(failedWriteCountProvider);
|
||||
|
||||
// Idempotent guard — uses the marker type rather than ImplementationType because
|
||||
// AddHostedService(factory-lambda) sets only ImplementationFactory and leaves
|
||||
// ImplementationType null; an ImplementationType == check is a silent no-op for
|
||||
// factory-registered services. The marker singleton's ServiceType is always set.
|
||||
if (services.Any(d => d.ServiceType == typeof(SiteEventLogHealthMetricsBridgeMarker)))
|
||||
{
|
||||
return services;
|
||||
}
|
||||
|
||||
services.AddSingleton<SiteEventLogHealthMetricsBridgeMarker>();
|
||||
services.AddHostedService(sp => new SiteEventLogFailureCountReporter(
|
||||
failedWriteCountProvider(sp),
|
||||
sp.GetRequiredService<ISiteHealthCollector>(),
|
||||
sp.GetRequiredService<ILogger<SiteEventLogFailureCountReporter>>()));
|
||||
|
||||
return services;
|
||||
}
|
||||
|
||||
/// <summary>
|
||||
/// HealthMonitoring-014: register the <see cref="HealthMonitoringOptionsValidator"/>
|
||||
/// so a misconfigured <c>ScadaBridge:HealthMonitoring</c> section (zero/negative
|
||||
|
||||
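The marker-type idempotency guard used by `AddSiteEventLogHealthMetricsBridge` can be reduced to a small sketch. It assumes the `Microsoft.Extensions.DependencyInjection` package is referenced; `MarkerGuardSketch`, `BridgeMarker`, `Poller`, and `AddBridgeOnce` are hypothetical names, and a plain singleton stands in for the hosted service.

```csharp
using System;
using System.Linq;
using Microsoft.Extensions.DependencyInjection;

public static class MarkerGuardSketch
{
    // Sentinel: its ServiceType is always set, unlike ImplementationType
    // on a factory-registered descriptor.
    private sealed class BridgeMarker { }

    // Hypothetical stand-in for the factory-registered service.
    public sealed class Poller { }

    public static IServiceCollection AddBridgeOnce(
        this IServiceCollection services,
        Func<IServiceProvider, Poller> factory)
    {
        // Second and later calls see the marker and register nothing.
        if (services.Any(d => d.ServiceType == typeof(BridgeMarker)))
        {
            return services;
        }

        services.AddSingleton<BridgeMarker>();
        services.AddSingleton<Poller>(factory);
        return services;
    }
}
```

Calling `AddBridgeOnce` twice on the same collection leaves a single `Poller` descriptor, which is the property the real method needs in order to avoid a second polling timer.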
@@ -0,0 +1,146 @@
|
||||
using Microsoft.Extensions.Hosting;
|
||||
using Microsoft.Extensions.Logging;
|
||||
|
||||
namespace ZB.MOM.WW.ScadaBridge.HealthMonitoring;
|
||||
|
||||
/// <summary>
|
||||
/// Site Event Logging (#12) M2.16 (#30) — site-side hosted service that
|
||||
/// periodically reads the cumulative event-log write-failure count and pushes
|
||||
/// it into <see cref="ISiteHealthCollector"/> so the next
|
||||
/// <see cref="ISiteHealthCollector.CollectReport"/> emits a fresh
|
||||
/// <c>SiteEventLogWriteFailures</c> field on the site health report.
|
||||
/// </summary>
|
||||
/// <remarks>
|
||||
/// <para>
|
||||
/// <b>Why a Func&lt;long&gt; and not ISiteEventLogger directly.</b>
|
||||
/// A direct <c>HealthMonitoring → SiteEventLogging</c> reference is avoided
|
||||
/// to prevent an undesirable low-level coupling: <c>SiteEventLogging</c> is a
|
||||
/// leaf component that should not pull in higher-level infrastructure. Note that
|
||||
/// <c>HealthMonitoring → StoreAndForward → SiteEventLogging</c> already
|
||||
/// exists as a transitive path (confirmed: <c>StoreAndForward.csproj</c> references
|
||||
/// <c>SiteEventLogging.csproj</c>), so a direct reference would NOT introduce a
|
||||
/// cycle — the delegate is purely a coupling-avoidance measure. The
|
||||
/// <see cref="Func{TResult}"/> seam lets the caller (Host site wiring) capture
|
||||
/// <c>ISiteEventLogger.FailedWriteCount</c> as a lambda at registration time; this
|
||||
/// service reads only the numeric result. The delegate approach is a standard
|
||||
/// pattern for counter bridges and keeps the registration path self-documenting.
|
||||
/// </para>
|
||||
/// <para>
|
||||
/// <b>Cadence.</b> 30 s by default — the same cadence as
|
||||
/// <c>SiteAuditBacklogReporter</c>, which is coarse enough to stay within
|
||||
/// the health-report interval budget while keeping the central dashboard
|
||||
/// current.
|
||||
/// </para>
|
||||
/// <para>
|
||||
/// <b>Failure containment.</b> Any unexpected exception during the probe is
|
||||
/// caught and logged; the next tick retries. Mirrors
|
||||
/// <c>SiteAuditBacklogReporter</c>'s "exception logged, not propagated"
|
||||
/// contract.
|
||||
/// </para>
|
||||
/// </remarks>
|
||||
public sealed class SiteEventLogFailureCountReporter : IHostedService, IDisposable
|
||||
{
|
||||
/// <summary>
|
||||
/// Default poll cadence. Matches <c>SiteAuditBacklogReporter.DefaultRefreshInterval</c>
|
||||
/// (30 s) — coarse enough to amortise the read across many reports, fine
|
||||
/// enough that the central dashboard never lags by more than one
|
||||
/// health-report interval.
|
||||
/// </summary>
|
||||
internal static readonly TimeSpan DefaultRefreshInterval = TimeSpan.FromSeconds(30);
|
||||
|
||||
private readonly Func<long> _failedWriteCountProvider;
|
||||
private readonly ISiteHealthCollector _collector;
|
||||
private readonly ILogger<SiteEventLogFailureCountReporter> _logger;
|
||||
private readonly TimeSpan _refreshInterval;
|
||||
private CancellationTokenSource? _cts;
|
||||
private Task? _loop;
|
||||
|
||||
/// <summary>Initializes a new instance of <see cref="SiteEventLogFailureCountReporter"/>.</summary>
|
||||
/// <param name="failedWriteCountProvider">
|
||||
/// A delegate that returns the current cumulative event-log write-failure count.
|
||||
/// Typically wired as <c>() => sp.GetRequiredService&lt;ISiteEventLogger&gt;().FailedWriteCount</c>
|
||||
/// in the Host site composition root.
|
||||
/// </param>
|
||||
/// <param name="collector">The site health collector that receives the failure-count snapshot.</param>
|
||||
/// <param name="logger">Logger instance.</param>
|
||||
/// <param name="refreshInterval">Poll interval override; defaults to <see cref="DefaultRefreshInterval"/> (30 s).</param>
|
||||
public SiteEventLogFailureCountReporter(
|
||||
Func<long> failedWriteCountProvider,
|
||||
ISiteHealthCollector collector,
|
||||
ILogger<SiteEventLogFailureCountReporter> logger,
|
||||
TimeSpan? refreshInterval = null)
|
||||
{
|
||||
_failedWriteCountProvider = failedWriteCountProvider
|
||||
?? throw new ArgumentNullException(nameof(failedWriteCountProvider));
|
||||
_collector = collector ?? throw new ArgumentNullException(nameof(collector));
|
||||
_logger = logger ?? throw new ArgumentNullException(nameof(logger));
|
||||
_refreshInterval = refreshInterval ?? DefaultRefreshInterval;
|
||||
}
|
||||
|
||||
/// <summary>Starts the background polling loop, running an immediate first probe before entering the timed cycle.</summary>
|
||||
/// <param name="ct">Cancellation token signalling host shutdown.</param>
|
||||
/// <returns>A task that represents the asynchronous operation.</returns>
|
||||
public Task StartAsync(CancellationToken ct)
|
||||
{
|
||||
// Linked CTS lets StopAsync's cancellation AND the host's shutdown
|
||||
// token both terminate the loop; either side firing aborts the
|
||||
// pending Task.Delay.
|
||||
_cts = CancellationTokenSource.CreateLinkedTokenSource(ct);
|
||||
_loop = Task.Run(() => RunLoopAsync(_cts.Token));
|
||||
return Task.CompletedTask;
|
||||
}
|
||||
|
||||
private async Task RunLoopAsync(CancellationToken ct)
|
||||
{
|
||||
// First tick runs immediately so the very first health report after
|
||||
// process start carries a real failure-count snapshot — without this
|
||||
// the dashboard would show 0 for the first 30 s after a deploy even
|
||||
// if failures had already accumulated.
|
||||
SafeProbe();
|
||||
|
||||
while (!ct.IsCancellationRequested)
|
||||
{
|
||||
try
|
||||
{
|
||||
await Task.Delay(_refreshInterval, ct).ConfigureAwait(false);
|
||||
}
|
||||
catch (OperationCanceledException)
|
||||
{
|
||||
break;
|
||||
}
|
||||
|
||||
SafeProbe();
|
||||
}
|
||||
}
|
||||
|
||||
private void SafeProbe()
|
||||
{
|
||||
try
|
||||
{
|
||||
var count = _failedWriteCountProvider();
|
||||
_collector.SetSiteEventLogWriteFailures(count);
|
||||
}
|
||||
catch (Exception ex)
|
||||
{
|
||||
// Catch-all is deliberate: the hosted service must survive every
|
||||
// class of probe failure so the next tick gets a chance. Mirrors
|
||||
// SiteAuditBacklogReporter's "exception logged, not propagated" contract.
|
||||
_logger.LogWarning(ex, "SiteEventLogFailureCountReporter probe failed; next tick will retry.");
|
||||
}
|
||||
}
|
||||
|
||||
/// <summary>Signals the polling loop to stop and waits for it to complete.</summary>
|
||||
/// <param name="ct">Cancellation token (not used; the internal CTS governs shutdown).</param>
|
||||
/// <returns>A task that represents the asynchronous operation.</returns>
|
||||
public Task StopAsync(CancellationToken ct)
|
||||
{
|
||||
_cts?.Cancel();
|
||||
return _loop ?? Task.CompletedTask;
|
||||
}
|
||||
|
||||
/// <summary>Releases the internal <see cref="CancellationTokenSource"/> used to stop the polling loop.</summary>
|
||||
public void Dispose()
|
||||
{
|
||||
_cts?.Dispose();
|
||||
}
|
||||
}
|
||||
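The reporter's loop shape (immediate first probe, then delay and probe, with every probe failure contained) can be shown without the hosting types. This is a sketch under assumptions: `PollLoopSketch`, `RunAsync`, and `Probe` are hypothetical names, and the swallowed exception stands in for the warning log in the real service.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class PollLoopSketch
{
    public static async Task RunAsync(
        Func<long> read, Action<long> publish, TimeSpan interval, CancellationToken ct)
    {
        // First probe runs before any delay so the first report is not a stale zero.
        Probe(read, publish);

        while (!ct.IsCancellationRequested)
        {
            try
            {
                await Task.Delay(interval, ct).ConfigureAwait(false);
            }
            catch (OperationCanceledException)
            {
                break; // cancellation during the delay ends the loop quietly
            }

            Probe(read, publish);
        }
    }

    private static void Probe(Func<long> read, Action<long> publish)
    {
        try
        {
            publish(read());
        }
        catch (Exception)
        {
            // Contained on purpose: the next tick gets another chance.
        }
    }
}
```

Because the first probe precedes the cancellation check, even a token that is already cancelled still yields exactly one published value.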
@@ -17,6 +17,7 @@ public class SiteHealthCollector : ISiteHealthCollector
|
||||
private int _siteAuditWriteFailures;
|
||||
private int _auditRedactionFailures;
|
||||
private volatile SiteAuditBacklogSnapshot? _siteAuditBacklog;
|
||||
private long _siteEventLogWriteFailures;
|
||||
private readonly ConcurrentDictionary<string, ConnectionHealth> _connectionStatuses = new();
|
||||
private readonly ConcurrentDictionary<string, TagResolutionStatus> _tagResolutionCounts = new();
|
||||
private readonly ConcurrentDictionary<string, string> _connectionEndpoints = new();
|
||||
@@ -77,6 +78,12 @@ public class SiteHealthCollector : ISiteHealthCollector
|
||||
_siteAuditBacklog = snapshot ?? throw new ArgumentNullException(nameof(snapshot));
|
||||
}
|
||||
|
||||
/// <inheritdoc />
|
||||
public void SetSiteEventLogWriteFailures(long count)
|
||||
{
|
||||
Interlocked.Exchange(ref _siteEventLogWriteFailures, count);
|
||||
}
|
||||
|
||||
/// <inheritdoc />
|
||||
public void UpdateConnectionHealth(string connectionName, ConnectionHealth health)
|
||||
{
|
||||
@@ -206,6 +213,7 @@ public class SiteHealthCollector : ISiteHealthCollector
|
||||
ClusterNodes: _clusterNodes?.ToList(),
|
||||
SiteAuditWriteFailures: siteAuditWriteFailures,
|
||||
AuditRedactionFailure: auditRedactionFailures,
|
||||
SiteAuditBacklog: _siteAuditBacklog);
|
||||
SiteAuditBacklog: _siteAuditBacklog,
|
||||
SiteEventLogWriteFailures: Interlocked.Read(ref _siteEventLogWriteFailures));
|
||||
}
|
||||
}
|
||||
|
||||
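The collector stores the failure count as a point-in-time value that is replaced on each refresh and never reset by a read. A minimal sketch of that gauge, assuming nothing beyond `System.Threading`; `FailureGaugeSketch` is a hypothetical name:

```csharp
using System.Threading;

public sealed class FailureGaugeSketch
{
    private long _value;

    // Replace the stored value atomically; a long write is not guaranteed
    // to be atomic on 32-bit platforms without Interlocked.
    public void Set(long count) => Interlocked.Exchange(ref _value, count);

    // Read without resetting: the value carries forward until the next Set.
    public long Read() => Interlocked.Read(ref _value);
}
```

Two consecutive `Read` calls return the same number until the poller calls `Set` again, which matches the "not reset on collect" behaviour described for `SetSiteEventLogWriteFailures`.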
@@ -7,6 +7,8 @@ public class DatabaseOptions
|
||||
{
|
||||
/// <summary>Connection string for the central configuration SQL Server database.</summary>
|
||||
public string? ConfigurationDb { get; set; }
|
||||
/// <summary>Connection string for the central machine-data SQL Server database.</summary>
|
||||
public string? MachineDataDb { get; set; }
|
||||
/// <summary>File system path to the site-local SQLite database directory.</summary>
|
||||
public string? SiteDbPath { get; set; }
|
||||
}
|
||||
|
||||
@@ -0,0 +1,175 @@
|
||||
using Akka.Actor;
|
||||
using Microsoft.Extensions.DependencyInjection;
|
||||
using Microsoft.Extensions.Diagnostics.HealthChecks;
|
||||
using Microsoft.Extensions.Logging;
|
||||
|
||||
namespace ZB.MOM.WW.ScadaBridge.Host.Health;
|
||||
|
||||
/// <summary>
|
||||
/// M2.14 (#28): readiness check that verifies every <b>required central cluster
|
||||
/// singleton</b> is reachable from this node, satisfying the "required cluster
|
||||
/// singletons running (if applicable)" clause of REQ-HOST-4a. Register it
|
||||
/// <see cref="ZB.MOM.WW.Health.ZbHealthTags.Ready"/>-tagged in the Central-role
|
||||
/// <c>AddHealthChecks()</c> chain only, so it is naturally role-scoped (site nodes
|
||||
/// never register it).
|
||||
/// </summary>
|
||||
/// <remarks>
|
||||
/// <para>
|
||||
/// <b>Probe strategy.</b> Each central singleton has a local
|
||||
/// <c>ClusterSingletonProxy</c> actor (created unconditionally in
|
||||
/// <c>AkkaHostedService.RegisterCentralActors</c>). The proxy actor exists locally
|
||||
/// as soon as it is created, so merely resolving its path proves nothing about the
|
||||
/// singleton itself. Instead we <see cref="ActorRefImplicitSenderExtensions.Ask{T}(ICanTell, object, TimeSpan?)"/>
|
||||
/// the proxy an <see cref="Identify"/> with a short bounded per-singleton timeout and
|
||||
/// expect an <see cref="ActorIdentity"/> whose <see cref="ActorIdentity.Subject"/> is
|
||||
/// non-null. The proxy buffers and forwards to the live singleton, so a non-null
|
||||
/// Subject within the timeout means the singleton is running and reachable; a null
|
||||
/// Subject or a timeout means it is unreachable. Probes run concurrently
|
||||
/// (<see cref="Task.WhenAll(System.Collections.Generic.IEnumerable{Task})"/>) so the
|
||||
/// whole check stays cheap and readiness polling stays fast.
|
||||
/// </para>
|
||||
/// <para>
|
||||
/// <b>Required-always vs if-applicable.</b> All five central singleton proxies are
|
||||
/// created unconditionally on a central node (there is no feature/config gate around
|
||||
/// any of them), so all five are treated as required-always here. If a future
|
||||
/// singleton is created behind a feature flag, it should NOT be added to
|
||||
/// <see cref="RequiredSingletonProxyNames"/> — "if applicable" means skip when its
|
||||
/// feature is off.
|
||||
/// </para>
|
||||
/// <para>
|
||||
/// <b>Failover flakiness.</b> During a brief singleton handover the singleton may be
|
||||
/// momentarily unreachable through the proxy. The bounded per-singleton timeout maps
|
||||
/// that to Unhealthy (we never throw and never retry — retries would make the probe
|
||||
/// slow). Readiness flapping briefly during a failover is acceptable and correct: a
|
||||
/// node mid-handover is legitimately not fully ready. We deliberately accept that
|
||||
/// tradeoff rather than masking it with retries.
|
||||
/// </para>
|
||||
/// <para>
|
||||
/// <b>No leadership requirement.</b> The proxy reaches the singleton from either node
|
||||
/// (active or standby), so a ready standby still reports Healthy here — readiness must
|
||||
/// NOT require cluster leadership (that is the Active tier's job).
|
||||
/// </para>
|
||||
/// <para>
|
||||
/// The <see cref="ActorSystem"/> is resolved lazily from DI per probe, mirroring
|
||||
/// <c>AkkaClusterHealthCheck</c>; if it is not yet available (startup race) the check
|
||||
/// returns Unhealthy rather than throwing.
|
||||
/// </para>
|
||||
/// </remarks>
|
||||
public sealed class RequiredSingletonsHealthCheck : IHealthCheck
|
||||
{
|
||||
/// <summary>
|
||||
/// Local actor names (under <c>/user</c>) of the <c>ClusterSingletonProxy</c>
|
||||
/// actors for the singletons that must always be running on a central node.
|
||||
/// Matches the unconditional proxy registrations in
|
||||
/// <c>AkkaHostedService.RegisterCentralActors</c>.
|
||||
/// </summary>
|
||||
public static readonly IReadOnlyList<string> RequiredSingletonProxyNames = new[]
|
||||
{
|
||||
"notification-outbox-proxy",
|
||||
"audit-log-ingest-proxy",
|
||||
"site-call-audit-proxy",
|
||||
"audit-log-purge-proxy",
|
||||
"site-audit-reconciliation-proxy",
|
||||
};
|
||||
|
||||
// Short, bounded per-singleton timeout. Kept small so readiness polling stays
|
||||
// fast; a singleton in mid-handover that does not answer within this window is
|
||||
// (correctly) treated as momentarily unreachable. Do NOT add retries here.
|
||||
private static readonly TimeSpan ProbeTimeout = TimeSpan.FromSeconds(2);
|
||||
|
||||
private readonly IServiceProvider _serviceProvider;
|
||||
private readonly ILogger<RequiredSingletonsHealthCheck> _logger;
|
||||
|
||||
/// <summary>Initializes a new <see cref="RequiredSingletonsHealthCheck"/>.</summary>
|
||||
/// <param name="serviceProvider">
|
||||
/// Application service provider; the <see cref="ActorSystem"/> is resolved lazily so the
|
||||
/// check is startup-safe (Unhealthy, never throwing, if Akka is not yet up).
|
||||
/// </param>
|
||||
/// <param name="logger">Logger for diagnostic detail on unreachable singletons.</param>
|
||||
public RequiredSingletonsHealthCheck(
|
||||
IServiceProvider serviceProvider,
|
||||
ILogger<RequiredSingletonsHealthCheck> logger)
|
||||
{
|
||||
_serviceProvider = serviceProvider ?? throw new ArgumentNullException(nameof(serviceProvider));
|
||||
_logger = logger ?? throw new ArgumentNullException(nameof(logger));
|
||||
}
|
||||
|
||||
/// <inheritdoc />
|
||||
public async Task<HealthCheckResult> CheckHealthAsync(
|
||||
HealthCheckContext context,
|
||||
CancellationToken cancellationToken = default)
|
||||
{
|
||||
// CheckHealthAsync must NEVER throw — catch everything and map to Unhealthy
|
||||
// with a descriptive message. An escaping exception would be recorded as
|
||||
// Unhealthy anyway, but a thrown exception loses the descriptive message.
|
||||
try
|
||||
{
|
||||
var system = _serviceProvider.GetService<ActorSystem>();
|
||||
if (system is null)
|
||||
return HealthCheckResult.Unhealthy("ActorSystem not yet available.");
|
||||
|
||||
// Probe each required singleton concurrently so the whole check is bounded
|
||||
// by ~ProbeTimeout, not the sum of the per-singleton timeouts.
|
||||
var probes = RequiredSingletonProxyNames
|
||||
.Select(name => ProbeAsync(system, name, cancellationToken))
|
||||
.ToArray();
|
||||
|
||||
var results = await Task.WhenAll(probes).ConfigureAwait(false);
|
||||
|
||||
var unreachable = results
|
||||
.Where(r => !r.Reachable)
|
||||
.Select(r => r.Name)
|
||||
.ToList();
|
||||
|
||||
if (unreachable.Count == 0)
|
||||
return HealthCheckResult.Healthy(
|
||||
$"All {RequiredSingletonProxyNames.Count} required cluster singletons are reachable.");
|
||||
|
||||
var joined = string.Join(", ", unreachable);
|
||||
_logger.LogWarning(
|
||||
"Readiness degraded: required cluster singleton(s) unreachable: {Unreachable}",
|
||||
joined);
|
||||
return HealthCheckResult.Unhealthy(
|
||||
$"Required cluster singleton(s) unreachable: {joined}.");
|
||||
}
|
||||
catch (Exception ex)
|
||||
{
|
||||
// Defensive: any unexpected failure (including OperationCanceledException
|
||||
// on shutdown) degrades readiness rather than escaping the check.
|
||||
return HealthCheckResult.Unhealthy(
|
||||
"Failed to probe required cluster singletons.", ex);
|
||||
}
|
||||
}
|
||||
|
||||
/// <summary>
|
||||
/// Asks the named local proxy an <see cref="Identify"/> with a bounded timeout.
|
||||
/// Reachable iff a non-null <see cref="ActorIdentity.Subject"/> comes back in time.
|
||||
/// A null Subject (path not present) or a timeout/exception → not reachable. This
|
||||
/// method itself never throws.
|
||||
/// </summary>
|
||||
private async Task<(string Name, bool Reachable)> ProbeAsync(
|
||||
ActorSystem system,
|
||||
string proxyName,
|
||||
CancellationToken cancellationToken)
|
||||
{
|
||||
try
|
||||
{
|
||||
// Use an ActorSelection so that a missing path resolves to an ActorIdentity with a
// null Subject (rather than throwing) within the bounded timeout.
|
||||
var selection = system.ActorSelection($"/user/{proxyName}");
|
||||
|
||||
var identity = await selection
|
||||
.Ask<ActorIdentity>(new Identify(proxyName), ProbeTimeout, cancellationToken)
|
||||
.ConfigureAwait(false);
|
||||
|
||||
return (proxyName, identity.Subject is not null);
|
||||
}
|
||||
catch (Exception)
|
||||
{
|
||||
// Timeout / cancellation / any failure → momentarily unreachable. Bounded,
|
||||
// no retry — readiness may briefly flap during a singleton handover, which
|
||||
// is the correct signal for a node mid-handover.
|
||||
return (proxyName, false);
|
||||
}
|
||||
}
|
||||
}
|
||||
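The concurrent, bounded probing used by the health check above can be sketched without Akka. This is a minimal illustration, not the commit's code: `ProbeAsync` here simulates an Identify round-trip with `Task.Delay`, and the target names, latencies, and `FindUnreachableAsync` helper are hypothetical. It shows why the total wait is roughly one timeout rather than the sum of the per-target timeouts.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for a per-singleton Identify probe: completes after
// `latency`, or is abandoned once `timeout` elapses. It never throws.
static async Task<(string Name, bool Reachable)> ProbeAsync(
    string name, TimeSpan latency, TimeSpan timeout, CancellationToken ct)
{
    try
    {
        // Task.WaitAsync (.NET 6+) bounds the wait; on timeout it throws TimeoutException.
        await Task.Delay(latency, ct).WaitAsync(timeout, ct).ConfigureAwait(false);
        return (name, true);
    }
    catch (Exception)
    {
        // Timeout, cancellation, or any other failure maps to "unreachable".
        return (name, false);
    }
}

// Starts every probe before awaiting any of them, then collects the names
// that did not answer in time.
static async Task<string[]> FindUnreachableAsync(
    (string Name, TimeSpan Latency)[] targets, TimeSpan timeout, CancellationToken ct = default)
{
    var probes = targets.Select(t => ProbeAsync(t.Name, t.Latency, timeout, ct)).ToArray();
    var results = await Task.WhenAll(probes).ConfigureAwait(false);
    return results.Where(r => !r.Reachable).Select(r => r.Name).ToArray();
}

var unreachable = await FindUnreachableAsync(
    new[]
    {
        (Name: "fast-singleton", Latency: TimeSpan.FromMilliseconds(10)),
        (Name: "stuck-singleton", Latency: TimeSpan.FromSeconds(30)),
    },
    timeout: TimeSpan.FromMilliseconds(500));

Console.WriteLine(string.Join(", ", unreachable));
```

Because each probe swallows its own failures, `Task.WhenAll` never faults here, which is the same property that lets the real check build one descriptive message from all results.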
@@ -202,6 +202,18 @@ try
|
||||
failureStatus: null,
|
||||
tags: new[] { ZbHealthTags.Ready },
|
||||
args: AkkaClusterStatusPolicy.Default)
|
||||
// M2.14 (#28): readiness ALSO reflects "required cluster singletons running"
|
||||
// (REQ-HOST-4a). Probes each central singleton's local ClusterSingletonProxy
|
||||
// with a bounded Identify and degrades to Unhealthy if any required singleton
|
||||
// is unreachable. Registered inside the Central-role branch (this is it) so the
|
||||
// check is naturally role-scoped — site nodes never run it. It resolves
|
||||
// ActorSystem from DI on each check run, like the akka-cluster check above, and is
|
||||
// leadership-agnostic so a ready standby still reports ready (the proxy reaches
|
||||
// the singleton from either node).
|
||||
.AddTypeActivatedCheck<RequiredSingletonsHealthCheck>(
|
||||
"required-singletons",
|
||||
failureStatus: null,
|
||||
tags: new[] { ZbHealthTags.Ready })
|
||||
.AddTypeActivatedCheck<ActiveNodeHealthCheck>(
|
||||
"active-node",
|
||||
failureStatus: null,
|
||||
|
||||
@@ -58,6 +58,16 @@ public static class SiteServiceRegistration
|
||||
services.AddStoreAndForward();
|
||||
services.AddSiteEventLogging();
|
||||
|
||||
// Site Event Logging (#12) M2.16 (#30) — bridge ISiteEventLogger.FailedWriteCount
|
||||
// into the site health report as a point-in-time SiteEventLogWriteFailures field.
|
||||
// Must come AFTER both AddSiteHealthMonitoring (registers ISiteHealthCollector) and
|
||||
// AddSiteEventLogging (registers ISiteEventLogger). The outer Func<IServiceProvider, …>
|
||||
// is evaluated once at hosted-service resolution time (root IServiceProvider is available);
|
||||
// the inner Func<long> is called on every poll tick; it looks up the ISiteEventLogger
// singleton from that provider and reads its FailedWriteCount.
|
||||
services.AddSiteEventLogHealthMetricsBridge(
|
||||
sp => () => sp.GetRequiredService<ISiteEventLogger>().FailedWriteCount);
|
||||
|
||||
// Audit Log (#23) — site-side hot-path writer + telemetry collaborators.
|
||||
// The SiteAuditTelemetryActor itself is registered by AkkaHostedService
|
||||
// in the site-role block; this call wires every DI dependency it (and
|
||||
@@ -96,6 +106,19 @@ public static class SiteServiceRegistration
|
||||
return new AkkaClusterNodeProvider(akkaService, siteRole);
|
||||
});
|
||||
|
||||
// SiteEventLogging-019 / #29 (M2.15): the EventLogPurgeService runs on every
|
||||
// site host node but consults this optional gate each tick and early-exits on
|
||||
// the standby. Register it to delegate to IClusterNodeProvider.SelfIsPrimary
|
||||
// (the canonical "this node is Up AND cluster leader" check) so purge runs ONLY
|
||||
// on the active node — no duplicated cluster logic. Non-clustered test hosts that
|
||||
// never call SiteServiceRegistration leave it unregistered, so the purge defaults
|
||||
// to always-run (the pre-fix behaviour, preserved).
|
||||
services.AddSingleton<SiteEventLogActiveNodeCheck>(sp =>
|
||||
{
|
||||
var nodeProvider = sp.GetRequiredService<IClusterNodeProvider>();
|
||||
return () => nodeProvider.SelfIsPrimary;
|
||||
});
|
||||
|
||||
// Options binding
|
||||
BindSharedOptions(services, config);
|
||||
services.Configure<SiteRuntimeOptions>(config.GetSection("ScadaBridge:SiteRuntime"));
|
||||
|
||||
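The optional active-node gate registered above can be sketched as a nullable delegate. This is an illustration under assumptions, not the EventLogPurgeService code: `ShouldPurge` is a hypothetical helper, and a plain `Func<bool>` stands in for `SiteEventLogActiveNodeCheck`. It shows the two behaviours the comment describes: no gate means always run, and the gate is re-evaluated on every tick.

```csharp
#nullable enable
using System;

// Hypothetical purge-tick decision: `gate` is the optional active-node check.
// When no gate is registered (null), the purge always runs.
static bool ShouldPurge(Func<bool>? gate) => gate?.Invoke() ?? true;

var isPrimary = false;                    // stand-in for IClusterNodeProvider.SelfIsPrimary
Func<bool> clusterGate = () => isPrimary; // reads the current value on each call

Console.WriteLine(ShouldPurge(null));        // True: no gate registered
Console.WriteLine(ShouldPurge(clusterGate)); // False: this node is the standby
isPrimary = true;                            // simulate this node becoming primary
Console.WriteLine(ShouldPurge(clusterGate)); // True: the gate now lets the purge run
```

Returning a delegate rather than a captured boolean matters because primary status can change after registration; the caller must ask again on each tick.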
@@ -60,6 +60,9 @@ public static class StartupValidator
|
||||
.Require("ScadaBridge:Database:ConfigurationDb",
|
||||
_ => !string.IsNullOrEmpty(configuration.GetSection("ScadaBridge:Database")["ConfigurationDb"]),
|
||||
"connection string required for Central")
|
||||
.Require("ScadaBridge:Database:MachineDataDb",
|
||||
_ => !string.IsNullOrEmpty(configuration.GetSection("ScadaBridge:Database")["MachineDataDb"]),
|
||||
"connection string required for Central")
|
||||
// Task 1.4: the LDAP server key moved into the nested Security:Ldap
|
||||
// sub-section (bound to the shared LdapOptions). Validate the nested key so
|
||||
// the pre-host preflight still fails fast on a missing LDAP server for
|
||||
|
||||
@@ -4,8 +4,23 @@ using ZB.MOM.WW.ScadaBridge.Commons.Types.InboundApi;
|
||||
namespace ZB.MOM.WW.ScadaBridge.InboundAPI;
|
||||
|
||||
/// <summary>
|
||||
/// WP-2: Validates and deserializes JSON request body against method parameter definitions.
|
||||
/// Extended type system: Boolean, Integer, Float, String, Object, List.
|
||||
/// WP-2: Validates and deserializes a JSON request body against a method's
|
||||
/// parameter definitions. Extended type system: Boolean, Integer, Float,
|
||||
/// String, Object, List.
|
||||
///
|
||||
/// <para>
|
||||
/// InboundAPI-M2.6: validation is now RECURSIVE and type-aware for the
|
||||
/// extended <c>Object</c> / <c>List</c> types. Declared object fields are
|
||||
/// validated against their declared (nested) types, list elements against the
|
||||
/// declared element type, and scalars at any depth against the extended type —
|
||||
/// with path-qualified errors (e.g. <c>order.items[2].quantity</c>). The
|
||||
/// definition is read as JSON Schema (the canonical persisted format produced
|
||||
/// by the Central UI / migration); the legacy flat-array form is still
|
||||
/// accepted for transition safety. See
|
||||
/// <see cref="ZB.MOM.WW.ScadaBridge.Commons.Types.InboundApi.InboundApiSchema"/>
|
||||
/// for the shared recursive engine that <see cref="ReturnValueValidator"/>
|
||||
/// also uses.
|
||||
/// </para>
|
||||
/// </summary>
|
||||
public static class ParameterValidator
|
||||
{
|
||||
@@ -14,40 +29,34 @@ public static class ParameterValidator
|
||||
/// Returns deserialized parameters or an error message.
|
||||
/// </summary>
|
||||
/// <param name="body">The parsed JSON request body; null or undefined if no body was supplied.</param>
|
||||
/// <param name="parameterDefinitions">JSON-serialized list of <see cref="ZB.MOM.WW.ScadaBridge.Commons.Types.InboundApi.ParameterDefinition"/>; null or empty means no parameters are defined.</param>
|
||||
/// <param name="parameterDefinitions">JSON Schema describing the method's parameters (an object schema), or null/empty when no parameters are defined. The legacy flat-array form is also accepted.</param>
|
||||
/// <returns>A <see cref="ParameterValidationResult"/> with coerced parameter values on success, or an error message on failure.</returns>
|
||||
public static ParameterValidationResult Validate(
|
||||
JsonElement? body,
|
||||
string? parameterDefinitions)
|
||||
{
|
||||
if (string.IsNullOrEmpty(parameterDefinitions))
|
||||
{
|
||||
// No parameters defined: any body is accepted and yields an empty parameter set.
|
||||
return ParameterValidationResult.Valid(new Dictionary<string, object?>());
|
||||
}
|
||||
|
||||
List<ParameterDefinition> definitions;
|
||||
InboundApiSchema? schema;
|
||||
try
|
||||
{
|
||||
definitions = JsonSerializer.Deserialize<List<ParameterDefinition>>(
|
||||
parameterDefinitions,
|
||||
new JsonSerializerOptions { PropertyNameCaseInsensitive = true })
|
||||
?? [];
|
||||
schema = InboundApiSchema.Parse(parameterDefinitions);
|
||||
}
|
||||
catch (JsonException)
|
||||
{
|
||||
return ParameterValidationResult.Invalid("Invalid parameter definitions in method configuration");
|
||||
}
|
||||
|
||||
if (definitions.Count == 0)
|
||||
// No parameters defined (or an object schema with no declared fields) —
|
||||
// the body is unconstrained and yields an empty parameter set.
|
||||
if (schema is null || schema.Type != "object" || schema.Fields.Count == 0)
|
||||
{
|
||||
return ParameterValidationResult.Valid(new Dictionary<string, object?>());
|
||||
}
|
||||
|
||||
if (body == null || body.Value.ValueKind == JsonValueKind.Null || body.Value.ValueKind == JsonValueKind.Undefined)
|
||||
if (body == null
|
||||
|| body.Value.ValueKind == JsonValueKind.Null
|
||||
|| body.Value.ValueKind == JsonValueKind.Undefined)
|
||||
{
|
||||
// Check if all parameters are optional
|
||||
var required = definitions.Where(d => d.Required).ToList();
|
||||
var required = schema.Fields.Where(f => f.Required).ToList();
|
||||
if (required.Count > 0)
|
||||
{
|
||||
return ParameterValidationResult.Invalid(
|
||||
@@ -62,86 +71,51 @@ public static class ParameterValidator
|
||||
return ParameterValidationResult.Invalid("Request body must be a JSON object");
|
||||
}
|
||||
|
||||
var result = new Dictionary<string, object?>();
|
||||
// Recursively type-check the whole body against the declared object
|
||||
// schema (nested Object fields, List element types, scalars at any
|
||||
// depth, undeclared-field rejection) with path-qualified errors.
|
||||
var errors = new List<string>();
|
||||
|
||||
// InboundAPI-010: report top-level body fields that do not match any defined
|
||||
// parameter, so a caller learns about a typo'd parameter name instead of
|
||||
// having the field silently ignored.
|
||||
var defined = new HashSet<string>(definitions.Select(d => d.Name), StringComparer.Ordinal);
|
||||
var unexpected = body.Value.EnumerateObject()
|
||||
.Select(p => p.Name)
|
||||
.Where(name => !defined.Contains(name))
|
||||
.ToList();
|
||||
if (unexpected.Count > 0)
|
||||
{
|
||||
errors.Add($"Unexpected parameter(s): {string.Join(", ", unexpected)}");
|
||||
}
|
||||
|
||||
foreach (var def in definitions)
|
||||
{
|
||||
if (body.Value.TryGetProperty(def.Name, out var prop))
|
||||
{
|
||||
var (value, error) = CoerceValue(prop, def.Type, def.Name);
|
||||
if (error != null)
|
||||
{
|
||||
errors.Add(error);
|
||||
}
|
||||
else
|
||||
{
|
||||
result[def.Name] = value;
|
||||
}
|
||||
}
|
||||
else if (def.Required)
|
||||
{
|
||||
errors.Add($"Missing required parameter: {def.Name}");
|
||||
}
|
||||
}
|
||||
|
||||
schema.Validate(body.Value, string.Empty, errors);
|
||||
if (errors.Count > 0)
|
||||
{
|
||||
return ParameterValidationResult.Invalid(string.Join("; ", errors));
|
||||
}
|
||||
|
||||
// Materialize the coerced top-level parameter values for the script.
|
||||
var result = new Dictionary<string, object?>();
|
||||
foreach (var field in schema.Fields)
|
||||
{
|
||||
if (body.Value.TryGetProperty(field.Name, out var prop))
|
||||
{
|
||||
result[field.Name] = Materialize(prop, field.Schema);
|
||||
}
|
||||
}
|
||||
|
||||
return ParameterValidationResult.Valid(result);
|
||||
}
|
||||
|
||||
/// <summary>
|
||||
/// Coerces a JSON element to the declared parameter type. InboundAPI-010: the
|
||||
/// <c>Object</c> and <c>List</c> extended types are validated for JSON <em>shape</em>
|
||||
/// only (object vs. array) — there is no field-level or element-level type
|
||||
/// validation. A method script that needs a specific nested structure must
|
||||
/// validate it itself; invalid nested data surfaces as a runtime script error.
|
||||
/// Converts a validated JSON element to the CLR value handed to the script.
|
||||
/// Validation has already passed, so this only shapes the value: scalars to
|
||||
/// their primitive type, objects to <see cref="Dictionary{TKey,TValue}"/>,
|
||||
/// arrays to <see cref="List{T}"/>.
|
||||
/// </summary>
|
||||
private static (object? value, string? error) CoerceValue(JsonElement element, string expectedType, string paramName)
|
||||
private static object? Materialize(JsonElement element, InboundApiSchema schema)
|
||||
{
|
||||
return expectedType.ToLowerInvariant() switch
|
||||
if (element.ValueKind == JsonValueKind.Null)
|
||||
{
|
||||
"boolean" => element.ValueKind == JsonValueKind.True || element.ValueKind == JsonValueKind.False
|
||||
? (element.GetBoolean(), null)
|
||||
: (null, $"Parameter '{paramName}' must be a Boolean"),
|
||||
return null;
|
||||
}
|
||||
|
||||
"integer" => element.ValueKind == JsonValueKind.Number && element.TryGetInt64(out var intVal)
|
||||
? (intVal, null)
|
||||
: (null, $"Parameter '{paramName}' must be an Integer"),
|
||||
|
||||
"float" => element.ValueKind == JsonValueKind.Number
|
||||
? (element.GetDouble(), null)
|
||||
: (null, $"Parameter '{paramName}' must be a Float"),
|
||||
|
||||
"string" => element.ValueKind == JsonValueKind.String
|
||||
? (element.GetString(), null)
|
||||
: (null, $"Parameter '{paramName}' must be a String"),
|
||||
|
||||
"object" => element.ValueKind == JsonValueKind.Object
|
||||
? (JsonSerializer.Deserialize<Dictionary<string, object?>>(element.GetRawText()), null)
|
||||
: (null, $"Parameter '{paramName}' must be an Object"),
|
||||
|
||||
"list" => element.ValueKind == JsonValueKind.Array
|
||||
? (JsonSerializer.Deserialize<List<object?>>(element.GetRawText()), null)
|
||||
: (null, $"Parameter '{paramName}' must be a List"),
|
||||
|
||||
_ => (null, $"Unknown parameter type '{expectedType}' for parameter '{paramName}'")
|
||||
return schema.Type switch
|
||||
{
|
||||
"boolean" => element.GetBoolean(),
|
||||
"integer" => element.GetInt64(),
|
||||
"number" => element.GetDouble(),
|
||||
"string" => element.GetString(),
|
||||
"object" => JsonSerializer.Deserialize<Dictionary<string, object?>>(element.GetRawText()),
|
||||
"array" => JsonSerializer.Deserialize<List<object?>>(element.GetRawText()),
|
||||
_ => JsonSerializer.Deserialize<object?>(element.GetRawText()),
|
||||
};
|
||||
}
|
||||
}
|
||||
|
||||
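The recursive, path-qualified validation described in the ParameterValidator summary can be sketched as follows. This is not the `InboundApiSchema` engine (its source is not part of this hunk); the `Validate` and `Join` helpers, the error wording, and the sample schema are assumptions made for illustration. The sketch covers only `type`, `properties`, `required`, and `items`, and ignores undeclared-field rejection.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Text.Json;

// Illustrative only: walks a small JSON Schema subset and records
// path-qualified errors such as "order.items[2].quantity must be an integer".
static void Validate(JsonElement schema, JsonElement value, string path, List<string> errors)
{
    var type = schema.TryGetProperty("type", out var t) ? t.GetString() : null;
    var label = path.Length == 0 ? "(root)" : path;

    switch (type)
    {
        case "object":
            if (value.ValueKind != JsonValueKind.Object) { errors.Add($"{label} must be an object"); return; }
            if (schema.TryGetProperty("required", out var required))
                foreach (var name in required.EnumerateArray())
                    if (!value.TryGetProperty(name.GetString()!, out _))
                        errors.Add($"{Join(path, name.GetString()!)} is required");
            if (schema.TryGetProperty("properties", out var props))
                foreach (var prop in props.EnumerateObject())
                    if (value.TryGetProperty(prop.Name, out var child))
                        Validate(prop.Value, child, Join(path, prop.Name), errors);
            break;
        case "array":
            if (value.ValueKind != JsonValueKind.Array) { errors.Add($"{label} must be an array"); return; }
            if (schema.TryGetProperty("items", out var items))
            {
                var i = 0;
                foreach (var element in value.EnumerateArray())
                    Validate(items, element, $"{path}[{i++}]", errors);
            }
            break;
        case "integer":
            if (value.ValueKind != JsonValueKind.Number || !value.TryGetInt64(out _))
                errors.Add($"{label} must be an integer");
            break;
        case "number":
            if (value.ValueKind != JsonValueKind.Number) errors.Add($"{label} must be a number");
            break;
        case "string":
            if (value.ValueKind != JsonValueKind.String) errors.Add($"{label} must be a string");
            break;
        case "boolean":
            if (value.ValueKind is not (JsonValueKind.True or JsonValueKind.False))
                errors.Add($"{label} must be a boolean");
            break;
    }
}

// Builds "parent.child" paths, leaving top-level names unqualified.
static string Join(string path, string name) => path.Length == 0 ? name : $"{path}.{name}";

using var orderSchema = JsonDocument.Parse("""
{
  "type": "object",
  "required": ["order"],
  "properties": {
    "order": {
      "type": "object",
      "properties": {
        "items": {
          "type": "array",
          "items": {
            "type": "object",
            "required": ["quantity"],
            "properties": { "quantity": { "type": "integer" } }
          }
        }
      }
    }
  }
}
""");
using var body = JsonDocument.Parse(
    """{"order":{"items":[{"quantity":1},{"quantity":2},{"quantity":"three"}]}}""");

var problems = new List<string>();
Validate(orderSchema.RootElement, body.RootElement, string.Empty, problems);
Console.WriteLine(string.Join("; ", problems));
```

Threading the path through the recursion is what lets a single error list name the exact nested location, instead of reporting only the top-level parameter.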
@@ -1,4 +1,5 @@
|
||||
using System.Text.Json;
|
||||
using ZB.MOM.WW.ScadaBridge.Commons.Types.InboundApi;
|
||||
|
||||
namespace ZB.MOM.WW.ScadaBridge.InboundAPI;
|
||||
|
||||
@@ -10,13 +11,20 @@ namespace ZB.MOM.WW.ScadaBridge.InboundAPI;
|
||||
/// <see cref="ParameterValidator"/>.
|
||||
///
|
||||
/// <para>
|
||||
/// The return definition is a JSON array of <see cref="ReturnFieldDefinition"/>
|
||||
/// (the same <c>{name,type}</c> shape as a parameter definition). A method whose
|
||||
/// <c>ReturnDefinition</c> is null/empty is unconstrained — its return value is
|
||||
/// serialized as-is (backward compatible). Primitive fields (Boolean / Integer /
|
||||
/// Float / String) are type-checked; the extended <c>Object</c>/<c>List</c> types
|
||||
/// are shape-checked only (object vs. array), consistent with how
|
||||
/// <see cref="ParameterValidator"/> treats inbound extended types.
|
||||
/// The return definition is JSON Schema (the canonical persisted format; the
|
||||
/// legacy flat <c>[{name,type}]</c> array is still accepted for transition
|
||||
/// safety). A method whose <c>ReturnDefinition</c> is null/empty is
|
||||
/// unconstrained — its return value is serialized as-is (backward compatible).
|
||||
/// </para>
|
||||
///
|
||||
/// <para>
|
||||
/// InboundAPI-M2.6: validation is RECURSIVE and type-aware — declared object
|
||||
/// fields are validated against their declared (nested) types, list elements
|
||||
/// against the declared element type, and scalars at any depth — with
|
||||
/// path-qualified errors. The recursion is shared with
|
||||
/// <see cref="ParameterValidator"/> via
|
||||
/// <see cref="ZB.MOM.WW.ScadaBridge.Commons.Types.InboundApi.InboundApiSchema"/>,
|
||||
/// so the inbound and outbound type checks cannot drift apart.
|
||||
/// </para>
|
||||
/// </summary>
|
||||
public static class ReturnValueValidator
|
||||
@@ -27,8 +35,8 @@ public static class ReturnValueValidator
|
||||
/// definition is configured or the result conforms to it.
|
||||
/// </summary>
|
||||
/// <param name="resultJson">The JSON-serialized script return value to validate.</param>
|
||||
/// <param name="returnDefinition">JSON-serialized list of <see cref="ReturnFieldDefinition"/> entries, or null/empty to skip validation.</param>
|
||||
/// <returns>A <see cref="ReturnValidationResult"/> indicating success or describing the first validation failure.</returns>
|
||||
/// <param name="returnDefinition">JSON Schema describing the method's return value, or null/empty to skip validation. The legacy flat-array form is also accepted.</param>
|
||||
/// <returns>A <see cref="ReturnValidationResult"/> indicating success or describing the validation failures.</returns>
|
||||
public static ReturnValidationResult Validate(string? resultJson, string? returnDefinition)
|
||||
{
|
||||
if (string.IsNullOrWhiteSpace(returnDefinition))
|
||||
@@ -37,13 +45,10 @@ public static class ReturnValueValidator
|
||||
return ReturnValidationResult.Valid();
|
||||
}
|
||||
|
||||
List<ReturnFieldDefinition> fields;
|
||||
InboundApiSchema? schema;
|
||||
try
|
||||
{
|
||||
fields = JsonSerializer.Deserialize<List<ReturnFieldDefinition>>(
|
||||
returnDefinition,
|
||||
new JsonSerializerOptions { PropertyNameCaseInsensitive = true })
|
||||
?? [];
|
||||
schema = InboundApiSchema.Parse(returnDefinition);
|
||||
}
|
||||
catch (JsonException)
|
||||
{
|
||||
@@ -51,11 +56,25 @@ public static class ReturnValueValidator
|
||||
"Invalid return definition in method configuration");
|
||||
}
|
||||
|
||||
if (fields.Count == 0)
|
||||
// A schema that declares no constraints (e.g. an object schema with no
|
||||
// fields) leaves the return value unconstrained.
|
||||
if (schema is null || (schema.Type == "object" && schema.Fields.Count == 0))
|
||||
{
|
||||
return ReturnValidationResult.Valid();
|
||||
}
|
||||
|
||||
// INTENTIONAL asymmetry with ParameterValidator:
|
||||
//
|
||||
// ParameterValidator has an early-return guard for "schema.Type != object"
|
||||
// because method parameters are ALWAYS a top-level JSON object (flat map of
|
||||
// name→value); a non-object parameter schema is treated as unconstrained.
|
||||
//
|
||||
// ReturnValueValidator does NOT guard on schema.Type here. A method may
|
||||
// declare a scalar return type (e.g. {"type":"string"} or {"type":"integer"})
|
||||
// and the script is expected to return exactly that scalar JSON value.
|
||||
// Guarding on type == "object" would silently bypass validation for scalar
|
||||
// and array return schemas — do NOT add that guard here.
|
||||
|
||||
if (string.IsNullOrWhiteSpace(resultJson))
|
||||
{
|
||||
return ReturnValidationResult.Invalid(
|
||||
@@ -63,75 +82,37 @@ public static class ReturnValueValidator
|
||||
}
|
||||
|
||||
JsonElement root;
|
||||
JsonDocument doc;
|
||||
try
|
||||
{
|
||||
using var doc = JsonDocument.Parse(resultJson);
|
||||
root = doc.RootElement.Clone();
|
||||
doc = JsonDocument.Parse(resultJson);
|
||||
}
|
||||
catch (JsonException)
|
||||
{
|
||||
return ReturnValidationResult.Invalid("Script return value is not valid JSON");
|
||||
}
|
||||
|
||||
if (root.ValueKind != JsonValueKind.Object)
|
||||
using (doc)
|
||||
{
|
||||
return ReturnValidationResult.Invalid(
|
||||
"Method declares a return structure but the script did not return an object");
|
||||
}
|
||||
root = doc.RootElement;
|
||||
|
||||
var errors = new List<string>();
|
||||
foreach (var field in fields)
|
||||
{
|
||||
if (!root.TryGetProperty(field.Name, out var value))
|
||||
// A JSON null result against a declared structure is treated as
|
||||
// "no value returned" (preserves the prior contract).
|
||||
if (root.ValueKind == JsonValueKind.Null)
|
||||
{
|
||||
errors.Add($"missing return field '{field.Name}'");
|
||||
continue;
|
||||
return ReturnValidationResult.Invalid(
|
||||
"Method declares a return structure but the script returned no value");
|
||||
}
|
||||
|
||||
var typeError = CheckFieldType(value, field.Type, field.Name);
|
||||
if (typeError != null)
|
||||
errors.Add(typeError);
|
||||
var errors = new List<string>();
|
||||
schema.Validate(root, string.Empty, errors);
|
||||
|
||||
return errors.Count > 0
|
||||
? ReturnValidationResult.Invalid(
|
||||
$"Return value does not match the declared return definition: {string.Join("; ", errors)}")
|
||||
: ReturnValidationResult.Valid();
|
||||
}
|
||||
|
||||
return errors.Count > 0
|
||||
? ReturnValidationResult.Invalid(
|
||||
$"Return value does not match the declared return definition: {string.Join("; ", errors)}")
|
||||
: ReturnValidationResult.Valid();
|
||||
}
|
||||
|
||||
private static string? CheckFieldType(JsonElement value, string declaredType, string fieldName)
|
||||
{
|
||||
// A null value satisfies any field type — the script may legitimately omit
|
||||
// optional data; only a missing field (handled by the caller) is an error.
|
||||
if (value.ValueKind == JsonValueKind.Null)
|
||||
return null;
|
||||
|
||||
var ok = declaredType.ToLowerInvariant() switch
|
||||
{
|
||||
"boolean" => value.ValueKind is JsonValueKind.True or JsonValueKind.False,
|
||||
"integer" => value.ValueKind == JsonValueKind.Number && value.TryGetInt64(out _),
|
||||
"float" => value.ValueKind == JsonValueKind.Number,
|
||||
"string" => value.ValueKind == JsonValueKind.String,
|
||||
"object" => value.ValueKind == JsonValueKind.Object,
|
||||
"list" => value.ValueKind == JsonValueKind.Array,
|
||||
_ => true, // unknown declared type — do not block the response
|
||||
};
|
||||
|
||||
return ok ? null : $"return field '{fieldName}' must be {declaredType}";
|
||||
}
|
||||
}
|
||||
|
||||
/// <summary>
|
||||
/// InboundAPI-014: one field of a method's declared return structure — the
|
||||
/// deserialized form of an entry in <c>ApiMethod.ReturnDefinition</c>. Defined in
|
||||
/// this module (not Commons) because the inbound API is currently its only consumer.
|
||||
/// </summary>
|
||||
public sealed class ReturnFieldDefinition
|
||||
{
|
||||
/// <summary>Field name as it must appear in the script return object.</summary>
|
||||
public string Name { get; set; } = string.Empty;
|
||||
/// <summary>Expected JSON type of this field (e.g., "string", "integer", "boolean", "object", "list").</summary>
|
||||
public string Type { get; set; } = "String";
|
||||
}
|
||||
|
||||
/// <summary>
|
||||
|
||||
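The ReturnValueValidator hunk above moves from cloning the root element to reading it inside the document's `using` scope. The two lifetime patterns can be sketched side by side; `ParseDetached` and `IsJsonNull` are hypothetical helpers written for this note, and the commit does not state why it switched.

```csharp
using System;
using System.Text.Json;

// Pattern A: copy the root out so it can outlive the disposed document.
static JsonElement ParseDetached(string json)
{
    using var doc = JsonDocument.Parse(json);
    return doc.RootElement.Clone(); // Clone() yields an element that is safe after disposal
}

// Pattern B: keep the document alive for exactly as long as the element is read.
static bool IsJsonNull(string json)
{
    using var doc = JsonDocument.Parse(json);
    return doc.RootElement.ValueKind == JsonValueKind.Null; // read inside the using scope
}

var detached = ParseDetached("{\"status\":\"ok\"}");
Console.WriteLine(detached.GetProperty("status").GetString()); // ok
Console.WriteLine(IsJsonNull("null"));                         // True
```

Pattern B avoids the extra copy, at the cost that every read of the root must stay inside the scope; an element used after its document is disposed throws `ObjectDisposedException`.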
@@ -0,0 +1,231 @@
|
||||
using System.Security.Claims;
|
||||
using Microsoft.Extensions.Logging;
|
||||
using Microsoft.Extensions.Options;
|
||||
using ZB.MOM.WW.Auth.Abstractions.Roles;
|
||||
|
||||
namespace ZB.MOM.WW.ScadaBridge.Security;
|
||||
|
||||
/// <summary>
|
||||
/// The outcome of a single cookie <c>OnValidatePrincipal</c> evaluation. The thin
|
||||
/// <c>OnValidatePrincipal</c> lambda translates this into the matching
|
||||
/// <c>CookieValidatePrincipalContext</c> calls (<c>RejectPrincipal</c> /
|
||||
/// <c>ReplacePrincipal</c> + <c>ShouldRenew</c>); the decision itself is computed by
|
||||
/// <see cref="CookieSessionValidator"/> so it is unit-testable in isolation.
|
||||
/// </summary>
|
||||
/// <param name="Action">What the caller must do with the principal.</param>
|
||||
/// <param name="Principal">The replacement principal when <paramref name="Action"/> is <see cref="SessionValidationAction.Replace"/>; otherwise <c>null</c>.</param>
|
||||
public readonly record struct SessionValidationResult(
|
||||
SessionValidationAction Action,
|
||||
ClaimsPrincipal? Principal)
|
||||
{
|
||||
/// <summary>Keep the existing principal unchanged.</summary>
|
||||
public static SessionValidationResult Keep { get; } = new(SessionValidationAction.Keep, null);
|
||||
|
||||
/// <summary>Reject the principal (idle-timed-out) — the caller signs the user out.</summary>
|
||||
public static SessionValidationResult Reject { get; } = new(SessionValidationAction.Reject, null);
|
||||
|
||||
/// <summary>Replace the principal with a refreshed one and renew the cookie.</summary>
|
||||
/// <param name="principal">The rebuilt principal.</param>
|
||||
/// <returns>A replace result carrying <paramref name="principal"/>.</returns>
|
||||
public static SessionValidationResult Replace(ClaimsPrincipal principal) =>
|
||||
new(SessionValidationAction.Replace, principal);
|
||||
}
|
||||
|
||||
/// <summary>The action a cookie session validation requires of the caller.</summary>
|
||||
public enum SessionValidationAction
|
||||
{
|
||||
/// <summary>Leave the principal as-is (no idle timeout, no refresh due, or a refresh error we swallow).</summary>
|
||||
Keep,
|
||||
|
||||
/// <summary>The session is idle-timed-out; reject + sign out.</summary>
|
||||
Reject,
|
||||
|
||||
/// <summary>The role mapping was refreshed; replace the principal and renew the cookie.</summary>
|
||||
Replace,
|
||||
}
|
||||
|
||||
/// <summary>
|
||||
/// M2.19 (#15): the unit-testable core of the cookie <c>OnValidatePrincipal</c> event.
|
||||
/// Enforces the idle timeout and refreshes the session's role/scope claims from the
|
||||
/// STORED LDAP group claims via the DB-backed <see cref="RoleMapper"/> — <b>without any
|
||||
/// LDAP call</b> — picking up central role-mapping (and scope-rule) changes mid-session.
|
||||
/// </summary>
|
||||
/// <remarks>
|
||||
/// <para>
|
||||
/// <b>Idle timeout</b> (default <see cref="SecurityOptions.IdleTimeoutMinutes"/> = 30):
|
||||
/// computed from the <see cref="JwtTokenService.LastActivityClaimType"/> anchor. This is
|
||||
/// the explicit, deterministic counterpart to the cookie middleware's
|
||||
/// <c>ExpireTimeSpan</c> + <c>SlidingExpiration</c> window — both use the SAME idle
|
||||
/// timeout value, so the explicit check never contradicts the cookie window. The
/// last-activity anchor is advanced to "now" only when the principal is rebuilt on a
/// role refresh; between refreshes the cookie's sliding renewal covers activity.
|
||||
/// </para>
|
||||
/// <para>
|
||||
/// <b>Role refresh</b> (default <see cref="SecurityOptions.RoleRefreshThresholdMinutes"/>
|
||||
/// = 15): when the elapsed time since <see cref="JwtTokenService.LastRoleRefreshClaimType"/>
|
||||
/// exceeds the threshold, the stored groups are re-mapped and the principal is rebuilt via
|
||||
/// <see cref="SessionClaimBuilder"/> (identical shape to <c>/auth/login</c>). If the DB
|
||||
/// mapping revoked the user's roles, the rebuilt principal reflects the loss.
|
||||
/// </para>
|
||||
/// <para>
|
||||
/// <b>Failure policy</b>: a refresh error (e.g. the mapper throws because the DB is
|
||||
/// unreachable) NEVER signs the user out and NEVER throws out of validation — it returns
|
||||
/// <see cref="SessionValidationResult.Keep"/>, mirroring the documented "LDAP failure:
|
||||
/// active sessions continue with current roles" stance. Only the explicit idle-timeout
|
||||
/// path rejects.
|
||||
/// </para>
|
||||
/// </remarks>
|
||||
public sealed class CookieSessionValidator
|
||||
{
|
||||
private readonly IGroupRoleMapper<string> _roleMapper;
|
||||
private readonly SecurityOptions _options;
|
||||
private readonly TimeProvider _timeProvider;
|
||||
private readonly ILogger<CookieSessionValidator> _logger;
|
||||
|
||||
/// <summary>Initializes the validator.</summary>
|
||||
/// <param name="roleMapper">The DB-backed group→role mapping seam (no LDAP) used for the mid-session refresh.</param>
|
||||
/// <param name="options">Security options carrying the idle and role-refresh thresholds.</param>
|
||||
/// <param name="timeProvider">Clock source; injected so tests can advance time deterministically.</param>
|
||||
/// <param name="logger">Logger instance.</param>
|
||||
public CookieSessionValidator(
|
||||
IGroupRoleMapper<string> roleMapper,
|
||||
IOptions<SecurityOptions> options,
|
||||
TimeProvider timeProvider,
|
||||
ILogger<CookieSessionValidator> logger)
|
||||
{
|
||||
_roleMapper = roleMapper ?? throw new ArgumentNullException(nameof(roleMapper));
|
||||
_options = (options ?? throw new ArgumentNullException(nameof(options))).Value;
|
||||
_timeProvider = timeProvider ?? throw new ArgumentNullException(nameof(timeProvider));
|
||||
_logger = logger ?? throw new ArgumentNullException(nameof(logger));
|
||||
}
|
||||
|
||||
/// <summary>
|
||||
/// Evaluates a cookie principal: enforces the idle timeout, then refreshes the
|
||||
/// role/scope claims from the stored LDAP groups when the role-refresh interval has
|
||||
/// elapsed. Never throws.
|
||||
/// </summary>
|
||||
/// <param name="principal">The current cookie principal under validation.</param>
|
||||
/// <param name="ct">Cancellation token (the request-aborted token in the pipeline).</param>
|
||||
/// <returns>The action the caller must take and any replacement principal.</returns>
|
||||
public async Task<SessionValidationResult> ValidateAsync(ClaimsPrincipal? principal, CancellationToken ct = default)
|
||||
{
|
||||
// An unauthenticated / null principal is left to the rest of the pipeline.
|
||||
if (principal?.Identity is not { IsAuthenticated: true })
|
||||
{
|
||||
return SessionValidationResult.Keep;
|
||||
}
|
||||
|
||||
var now = _timeProvider.GetUtcNow();
|
||||
|
||||
// 1) Idle-timeout enforcement — the only path that rejects. A missing/unparsable
|
||||
// last-activity anchor is treated as timed-out (fail-closed): a session we
|
||||
// cannot age must not be kept alive forever.
|
||||
if (IsIdleTimedOut(principal, now))
|
||||
{
|
||||
_logger.LogInformation(
|
||||
"Cookie session for {Username} rejected: past the {IdleTimeout}-minute idle timeout.",
|
||||
principal.FindFirst(JwtTokenService.UsernameClaimType)?.Value ?? "(unknown)",
|
||||
_options.IdleTimeoutMinutes);
|
||||
return SessionValidationResult.Reject;
|
||||
}
|
||||
|
||||
// 2) Role-mapping refresh — best-effort. Any failure keeps the existing session.
|
||||
try
|
||||
{
|
||||
var refreshed = await TryRefreshAsync(principal, now, ct).ConfigureAwait(false);
|
||||
if (refreshed is not null)
|
||||
{
|
||||
return SessionValidationResult.Replace(refreshed);
|
||||
}
|
||||
}
|
||||
catch (Exception ex)
|
||||
{
|
||||
// SECURITY: never broaden access and never sign the user out on a transient
|
||||
// refresh fault — keep the existing principal (current roles) and swallow.
|
||||
_logger.LogWarning(
|
||||
ex,
|
||||
"Mid-session role refresh failed for {Username}; keeping existing session and roles.",
|
||||
principal.FindFirst(JwtTokenService.UsernameClaimType)?.Value ?? "(unknown)");
|
||||
return SessionValidationResult.Keep;
|
||||
}
|
||||
|
||||
return SessionValidationResult.Keep;
|
||||
}
|
||||
|
||||
/// <summary>
|
||||
/// Returns true when the session's last-activity anchor is older than
|
||||
/// <see cref="SecurityOptions.IdleTimeoutMinutes"/>. A missing/unparsable anchor is
|
||||
/// treated as timed-out (fail-closed).
|
||||
/// </summary>
|
||||
/// <param name="principal">The cookie principal.</param>
|
||||
/// <param name="now">The current instant.</param>
|
||||
/// <returns><c>true</c> if the session has exceeded the idle window.</returns>
|
||||
public bool IsIdleTimedOut(ClaimsPrincipal principal, DateTimeOffset now)
|
||||
{
|
||||
var claim = principal.FindFirst(JwtTokenService.LastActivityClaimType);
|
||||
if (claim is null || !DateTimeOffset.TryParse(claim.Value, out var lastActivity))
|
||||
{
|
||||
return true;
|
||||
}
|
||||
|
||||
return (now - lastActivity).TotalMinutes > _options.IdleTimeoutMinutes;
|
||||
}
|
||||
|
||||
// Returns a rebuilt principal when the role-refresh interval has elapsed; null when
// no refresh is due or the principal lacks the claims needed to rebuild it. The
// principal is rebuilt via SessionClaimBuilder so its shape is identical to /auth/login.
|
||||
private async Task<ClaimsPrincipal?> TryRefreshAsync(ClaimsPrincipal principal, DateTimeOffset now, CancellationToken ct)
|
||||
{
|
||||
var roleRefreshDue = IsRoleRefreshDue(principal, now);
|
||||
if (!roleRefreshDue)
|
||||
{
|
||||
// No mapping refresh due. We deliberately do NOT mint a new principal just to
|
||||
// advance LastActivity: the cookie middleware's SlidingExpiration already
|
||||
// renews the cookie window on activity, so the idle anchor only needs
|
||||
// advancing when we are rebuilding the principal anyway (on a role refresh).
|
||||
// This keeps the no-op request path allocation-free and avoids a cookie
|
||||
// re-issue on every request.
|
||||
return null;
|
||||
}
|
||||
|
||||
var username = principal.FindFirst(JwtTokenService.UsernameClaimType)?.Value;
|
||||
var displayName = principal.FindFirst(JwtTokenService.DisplayNameClaimType)?.Value;
|
||||
if (string.IsNullOrEmpty(username) || string.IsNullOrEmpty(displayName))
|
||||
{
|
||||
// Malformed principal — cannot rebuild faithfully. Keep it (do not reject).
|
||||
_logger.LogWarning("Cannot refresh role mapping: principal is missing username/display-name claims.");
|
||||
return null;
|
||||
}
|
||||
|
||||
var groups = SessionClaimBuilder.ReadGroups(principal);
|
||||
|
||||
// Re-run the DB-backed mapping on the STORED groups — NO LDAP call.
|
||||
var mapping = await _roleMapper.MapAsync(groups, ct).ConfigureAwait(false);
|
||||
var scope = mapping.Scope is RoleMappingResult mapped
|
||||
? mapped
|
||||
: new RoleMappingResult(mapping.Roles, [], IsSystemWideDeployment: false);
|
||||
|
||||
// Rebuild identically to /auth/login, advancing BOTH anchors: the role-refresh
|
||||
// anchor (we just refreshed) and the idle anchor (this is a genuine request).
|
||||
return SessionClaimBuilder.Build(username, displayName, groups, scope, now);
|
||||
}
|
||||
|
||||
/// <summary>
|
||||
/// Returns true when the elapsed time since the last role refresh exceeds
|
||||
/// <see cref="SecurityOptions.RoleRefreshThresholdMinutes"/>. A missing/unparsable
|
||||
/// anchor is treated as due (refresh now and re-stamp the anchor).
|
||||
/// </summary>
|
||||
/// <param name="principal">The cookie principal.</param>
|
||||
/// <param name="now">The current instant.</param>
|
||||
/// <returns><c>true</c> if a role-mapping refresh is due.</returns>
|
||||
public bool IsRoleRefreshDue(ClaimsPrincipal principal, DateTimeOffset now)
|
||||
{
|
||||
var claim = principal.FindFirst(JwtTokenService.LastRoleRefreshClaimType);
|
||||
if (claim is null || !DateTimeOffset.TryParse(claim.Value, out var lastRefresh))
|
||||
{
|
||||
return true;
|
||||
}
|
||||
|
||||
return (now - lastRefresh).TotalMinutes > _options.RoleRefreshThresholdMinutes;
|
||||
}
|
||||
}
|
||||
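The two anchor checks above treat a bad anchor differently: a missing or unparsable idle anchor rejects the session (fail-closed), while a missing or unparsable refresh anchor only forces a refresh. The sketch below illustrates that asymmetry in isolation. `AnchorChecks`, `ReadAnchor`, and the default arguments are hypothetical names and values (the 30/15 figures are the `SecurityOptions` defaults quoted in this diff), and the invariant-culture parse is an assumption that is stricter than the culture-sensitive `DateTimeOffset.TryParse` overload the commit uses.

```csharp
using System;
using System.Globalization;
using System.Security.Claims;

// Hypothetical helper, not part of the commit: mirrors the two anchor checks.
public static class AnchorChecks
{
    // Claim type literals as declared on JwtTokenService in this diff.
    public const string LastActivity = "LastActivity";
    public const string LastRoleRefresh = "zb:lastrolerefresh";

    // Reads an ISO-8601 round-trip ("o") timestamp claim; null when missing or unparsable.
    private static DateTimeOffset? ReadAnchor(ClaimsPrincipal principal, string claimType)
    {
        var value = principal.FindFirst(claimType)?.Value;
        return DateTimeOffset.TryParse(
                   value, CultureInfo.InvariantCulture, DateTimeStyles.None, out var parsed)
            ? parsed
            : (DateTimeOffset?)null;
    }

    // Fail-closed: no usable anchor means the session counts as timed out.
    public static bool IsIdleTimedOut(ClaimsPrincipal p, DateTimeOffset now, int idleMinutes = 30)
        => ReadAnchor(p, LastActivity) is not { } last
           || (now - last).TotalMinutes > idleMinutes;

    // Fail-due: no usable anchor means "refresh now", which re-stamps the anchor.
    public static bool IsRoleRefreshDue(ClaimsPrincipal p, DateTimeOffset now, int thresholdMinutes = 15)
        => ReadAnchor(p, LastRoleRefresh) is not { } last
           || (now - last).TotalMinutes > thresholdMinutes;
}
```

A principal carrying no anchor claims at all is therefore both timed out and due for refresh; because the idle check runs first, such a session is rejected rather than refreshed.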
@@ -29,6 +29,22 @@ public class JwtTokenService
|
||||
public const string SiteIdClaimType = ZbClaimTypes.ScopeId;
|
||||
public const string LastActivityClaimType = "LastActivity";
|
||||
|
||||
// M2.19 (#15): the cookie session now stores the user's raw LDAP groups and a
|
||||
// role-mapping refresh anchor so an active interactive session can re-run the
|
||||
// DB-backed RoleMapper (NOT LDAP) mid-session and pick up central role-mapping
|
||||
// changes. These two have no canonical ZbClaimTypes equivalent (the shared
|
||||
// vocabulary covers identity/role/scope, not the ScadaBridge-internal refresh
|
||||
// machinery), so they keep "zb:"-prefixed ScadaBridge-local literals:
|
||||
// - GroupClaimType ("zb:group", one per LDAP group) is the input the
|
||||
// mid-session RoleMapper re-run consumes — the groups are the durable
|
||||
// fact; the roles are the derived projection that can go stale.
|
||||
// - LastRoleRefreshClaimType ("zb:lastrolerefresh", ISO-8601 "o") anchors
|
||||
// the role-mapping refresh interval (SecurityOptions.RoleRefreshThresholdMinutes).
|
||||
// LastActivityClaimType (above) remains the idle-timeout anchor — a separate
|
||||
// clock from the role-refresh anchor.
|
||||
public const string GroupClaimType = "zb:group";
|
||||
public const string LastRoleRefreshClaimType = "zb:lastrolerefresh";
|
||||
|
||||
/// <summary>
|
||||
/// Fixed issuer bound into every token and required on validation. Binding
|
||||
/// issuer/audience is defence-in-depth: even though the HMAC key is shared only
|
||||
|
||||
@@ -1,10 +1,21 @@
|
||||
namespace ZB.MOM.WW.ScadaBridge.Security;
|
||||
|
||||
/// <summary>
|
||||
/// Non-LDAP security configuration: the cookie-embedded JWT signing/lifetime
|
||||
/// settings and the session idle-timeout / cookie-security policy.
|
||||
/// Non-LDAP security configuration for the ScadaBridge Central UI.
|
||||
/// </summary>
|
||||
/// <remarks>
|
||||
/// <para>
|
||||
/// <b>JWT Bearer path (<c>/auth/token</c>)</b>: <see cref="JwtSigningKey"/> and
|
||||
/// <see cref="JwtExpiryMinutes"/> govern the short-lived Bearer token issued to
|
||||
/// the CLI / Inbound API. They have no effect on the Blazor cookie session.
|
||||
/// </para>
|
||||
/// <para>
|
||||
/// <b>Blazor cookie session</b>: <see cref="IdleTimeoutMinutes"/> and
|
||||
/// <see cref="RoleRefreshThresholdMinutes"/> govern the cookie-only session used by
|
||||
/// the Blazor Server UI. There is no embedded JWT in this path — the cookie is
|
||||
/// HttpOnly/Secure and managed entirely by ASP.NET Core cookie authentication.
|
||||
/// </para>
|
||||
/// <para>
|
||||
/// Task 1.2/1.4 cutover: the LDAP connection settings that used to live here as
|
||||
/// flat <c>Ldap*</c> keys (server, port, transport, search base, service account,
|
||||
/// attributes, timeout) moved into a nested <c>ScadaBridge:Security:Ldap</c>
|
||||
@@ -12,6 +23,7 @@ namespace ZB.MOM.WW.ScadaBridge.Security;
|
||||
/// and registered via <c>AddZbLdapAuth</c>. This is a BREAKING config-key change —
|
||||
/// see CHANGELOG. The non-LDAP fields below are unchanged and still bound from
|
||||
/// <c>ScadaBridge:Security</c>.
|
||||
/// </para>
|
||||
/// </remarks>
|
||||
public class SecurityOptions
|
||||
{
|
||||
@@ -27,7 +39,19 @@ public class SecurityOptions
|
||||
public const int MinJwtSigningKeyBytes = 32;
|
||||
/// <summary>Cookie-embedded JWT lifetime in minutes before it must be refreshed.</summary>
|
||||
public int JwtExpiryMinutes { get; set; } = 15;
|
||||
/// <summary>Session idle timeout in minutes; sessions inactive beyond this are expired.</summary>
|
||||
/// <summary>
|
||||
/// Session idle timeout in minutes for the Blazor cookie session; sessions inactive
|
||||
/// beyond this are expired and the user is redirected to <c>/login</c>. Default: <b>30</b>.
|
||||
/// </summary>
|
||||
/// <remarks>
|
||||
/// Because <see cref="RoleRefreshThresholdMinutes"/> is the only operation that advances
|
||||
/// the <c>LastActivity</c> anchor, the effective maximum idle window before a session is
|
||||
/// guaranteed to be rejected is approximately
|
||||
/// <c>IdleTimeoutMinutes + RoleRefreshThresholdMinutes</c> (~45 minutes with defaults).
|
||||
/// This is intentional and mirrors the cookie middleware's own <c>SlidingExpiration</c>
|
||||
/// fuzziness. Must be strictly greater than <see cref="RoleRefreshThresholdMinutes"/>
|
||||
/// (enforced at startup by <see cref="SecurityOptionsValidator"/>).
|
||||
/// </remarks>
|
||||
public int IdleTimeoutMinutes { get; set; } = 30;
|
||||
|
||||
/// <summary>
|
||||
@@ -35,6 +59,28 @@ public class SecurityOptions
|
||||
/// </summary>
|
||||
public int JwtRefreshThresholdMinutes { get; set; } = 5;
|
||||
|
||||
/// <summary>
|
||||
/// M2.19 (#15): how long a cookie session's role-mapping projection may be stale
|
||||
/// before <c>OnValidatePrincipal</c> re-runs the DB-backed <c>RoleMapper</c> on the
|
||||
/// session's stored LDAP group claims and rebuilds the role/scope claims. Default:
|
||||
/// <b>15 minutes</b>, matching the documented sliding-refresh cadence.
|
||||
/// </summary>
|
||||
/// <remarks>
|
||||
/// This is a purely central (database) refresh — it picks up LDAP-group→role mapping
|
||||
/// changes and scope-rule changes WITHOUT contacting LDAP, so revoked roles take effect
|
||||
/// within this window. It does NOT pick up live LDAP group-membership changes (the
|
||||
/// shared LDAP library exposes no passwordless group-search; that remains a
|
||||
/// next-login refresh — see Component-Security.md).
|
||||
/// <para>
|
||||
/// Because a role-refresh is also the only operation that advances the
|
||||
/// <c>LastActivity</c> anchor, the effective maximum idle window is approximately
|
||||
/// <c><see cref="IdleTimeoutMinutes"/> + RoleRefreshThresholdMinutes</c> (~45 minutes
|
||||
/// with defaults). Must be strictly less than <see cref="IdleTimeoutMinutes"/>
|
||||
/// (enforced at startup by <see cref="SecurityOptionsValidator"/>).
|
||||
/// </para>
|
||||
/// </remarks>
|
||||
public int RoleRefreshThresholdMinutes { get; set; } = 15;
|
||||
|
||||
/// <summary>
|
||||
/// When true (default) the authentication cookie is always marked
|
||||
/// <c>Secure</c> (sent only over HTTPS) — the correct production setting,
|
||||
@@ -59,3 +105,38 @@ public class SecurityOptions
|
||||
/// </summary>
|
||||
public string CookieName { get; set; } = DefaultCookieName;
|
||||
}
|
||||
|
||||
/// <summary>
|
||||
/// M2.19 (#15): startup validator for <see cref="SecurityOptions"/>. Fails fast at boot
|
||||
/// on any configuration that would defeat idle-timeout enforcement.
|
||||
/// </summary>
|
||||
/// <remarks>
|
||||
/// Registered with <c>ValidateOnStart()</c> by
|
||||
/// <see cref="ServiceCollectionExtensions.AddSecurity"/> so a misconfigured appsettings
|
||||
/// section is caught at application startup rather than silently misapplied at runtime.
|
||||
/// </remarks>
|
||||
public sealed class SecurityOptionsValidator : Microsoft.Extensions.Options.IValidateOptions<SecurityOptions>
|
||||
{
|
||||
/// <inheritdoc/>
|
||||
public Microsoft.Extensions.Options.ValidateOptionsResult Validate(string? name, SecurityOptions options)
|
||||
{
|
||||
// SECURITY: RoleRefreshThresholdMinutes must be strictly less than IdleTimeoutMinutes.
// The role-refresh cycle is the ONLY operation that advances the LastActivity anchor,
// and the idle check runs before the refresh on every request. If threshold >= idle,
// no refresh can become due before the idle check fires: the anchor stamped at t=0
// never advances, so even a continuously active user is rejected once t > idle and
// the idle timeout degenerates into a fixed session lifetime.
|
||||
if (options.RoleRefreshThresholdMinutes >= options.IdleTimeoutMinutes)
|
||||
{
|
||||
return Microsoft.Extensions.Options.ValidateOptionsResult.Fail(
|
||||
$"{nameof(SecurityOptions.RoleRefreshThresholdMinutes)} " +
|
||||
$"({options.RoleRefreshThresholdMinutes}) must be strictly less than " +
|
||||
$"{nameof(SecurityOptions.IdleTimeoutMinutes)} " +
|
||||
$"({options.IdleTimeoutMinutes}). " +
|
||||
$"A single refresh cycle must not equal or exceed the idle window or idle " +
|
||||
$"enforcement is defeated.");
|
||||
}
|
||||
|
||||
return Microsoft.Extensions.Options.ValidateOptionsResult.Success;
|
||||
}
|
||||
}
|
||||
|
||||
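Why the validator insists on `RoleRefreshThresholdMinutes < IdleTimeoutMinutes` can be seen by replaying a request timeline against the validation order used in this diff (idle check first, then a role refresh that re-stamps both anchors). The simulation below is a hedged sketch rather than the commit's code: `SessionTimeline` and `MinutesUntilRejected` are hypothetical names, time is modelled in whole minutes, and the user is assumed to send one request every `requestEveryMinutes`.

```csharp
using System;

// Hypothetical model of the validation order in this diff; not part of the commit.
public static class SessionTimeline
{
    // Returns the minute at which a steadily active user is rejected,
    // or null if the session survives until horizonMinutes.
    public static int? MinutesUntilRejected(
        int idleMinutes, int thresholdMinutes, int requestEveryMinutes, int horizonMinutes)
    {
        int lastActivity = 0;     // both anchors are seeded at login (t = 0)
        int lastRoleRefresh = 0;

        for (int t = requestEveryMinutes; t <= horizonMinutes; t += requestEveryMinutes)
        {
            // 1) The idle check runs first and is the only sign-out path.
            if (t - lastActivity > idleMinutes)
            {
                return t;
            }

            // 2) A due role refresh rebuilds the principal and advances BOTH anchors.
            if (t - lastRoleRefresh > thresholdMinutes)
            {
                lastActivity = t;
                lastRoleRefresh = t;
            }
        }

        return null;
    }
}
```

With the defaults (idle 30, threshold 15) and one request per minute, `MinutesUntilRejected(30, 15, 1, 240)` returns `null`: a refresh at minute 16 advances the anchors long before the idle check can fire. Raising the threshold to 30 makes the same user hit the idle check at minute 31, because no refresh ever becomes due first.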
@@ -1,5 +1,8 @@
|
||||
using Microsoft.AspNetCore.Authentication;
|
||||
using Microsoft.AspNetCore.Authentication.Cookies;
|
||||
using Microsoft.AspNetCore.Http;
|
||||
using Microsoft.Extensions.DependencyInjection;
|
||||
using Microsoft.Extensions.DependencyInjection.Extensions;
|
||||
using Microsoft.Extensions.Logging;
|
||||
using Microsoft.Extensions.Options;
|
||||
using ZB.MOM.WW.Auth.Abstractions.Roles;
|
||||
@@ -51,6 +54,14 @@ public static class ServiceCollectionExtensions
|
||||
services.AddScoped<JwtTokenService>();
|
||||
services.AddScoped<RoleMapper>();
|
||||
|
||||
// M2.19 (#15): the cookie OnValidatePrincipal core. Scoped to match the
|
||||
// IGroupRoleMapper<string> it depends on (which depends on the Scoped
|
||||
// ISecurityRepository). The clock is injected (TimeProvider) so the idle/refresh
|
||||
// thresholds can be exercised deterministically in tests; the production default
|
||||
// is the wall clock. TryAddSingleton keeps the Host free to register its own.
|
||||
services.TryAddSingleton(TimeProvider.System);
|
||||
services.AddScoped<CookieSessionValidator>();
|
||||
|
||||
// Audit Actor wiring (Phase 3): the user-facing inbound API audit path
|
||||
// sources AuditEvent.Actor from the authenticated principal via this
|
||||
// seam. HttpAuditActorAccessor reads IHttpContextAccessor.HttpContext?.User
|
||||
@@ -71,6 +82,16 @@ public static class ServiceCollectionExtensions
|
||||
// to consume this seam in a later task.
|
||||
services.AddScoped<IGroupRoleMapper<string>, ScadaBridgeGroupRoleMapper>();
|
||||
|
||||
// M2.19 (#15): fail-fast config guard — RoleRefreshThresholdMinutes must be strictly
|
||||
// less than IdleTimeoutMinutes. If they are equal or inverted, a single un-refreshed
|
||||
// cycle can exhaust the entire idle window and idle enforcement is silently defeated.
|
||||
// SecurityOptionsValidator is registered with ValidateOnStart so a misconfigured
|
||||
// appsettings section fails at boot with a clear message rather than behaving subtly
|
||||
// incorrectly at runtime. Config-binding stays with the Host (component library must
|
||||
// not take IConfiguration), so we only register the validator + ValidateOnStart here.
|
||||
services.AddOptions<SecurityOptions>().ValidateOnStart();
|
||||
services.AddSingleton<IValidateOptions<SecurityOptions>, SecurityOptionsValidator>();
|
||||
|
||||
// Note: the old SecurityOptionsValidator (which fail-fast-validated LdapServer +
|
||||
// LdapSearchBase) is gone — those keys moved into the shared LdapOptions, whose
|
||||
// LdapOptionsValidator (registered with ValidateOnStart by AddZbLdapAuth above)
|
||||
@@ -94,6 +115,16 @@ public static class ServiceCollectionExtensions
|
||||
// environments sharing a hostname can be given distinct names. HttpOnly /
|
||||
// SameSite / SecurePolicy / SlidingExpiration / ExpireTimeSpan are likewise
|
||||
// applied there via ZbCookieDefaults.Apply.
|
||||
|
||||
// M2.19 (#15): OnValidatePrincipal enforces the idle timeout and refreshes
|
||||
// the role/scope claims from the session's STORED LDAP groups (DB-backed
|
||||
// RoleMapper, NO LDAP) so central role-mapping changes take effect
|
||||
// mid-session. The lambda is a THIN adapter: it resolves the request-scoped
|
||||
// CookieSessionValidator (which holds all the testable idle/refresh logic)
|
||||
// and translates its decision into the cookie context calls. It NEVER
|
||||
// throws — CookieSessionValidator.ValidateAsync swallows refresh faults and
|
||||
// keeps the session (mirrors "LDAP failure: active sessions continue").
|
||||
options.Events.OnValidatePrincipal = OnValidatePrincipalAsync;
|
||||
});
|
||||
|
||||
// CentralUI-005: configure the cookie session as a sliding window so the
|
||||
@@ -152,6 +183,70 @@ public static class ServiceCollectionExtensions
|
||||
return services;
|
||||
}
|
||||
|
||||
/// <summary>
|
||||
/// M2.19 (#15): the thin <see cref="CookieAuthenticationEvents.OnValidatePrincipal"/>
|
||||
/// adapter. It resolves the request-scoped <see cref="CookieSessionValidator"/>,
|
||||
/// asks it for a decision, and applies it to the cookie context:
|
||||
/// <list type="bullet">
|
||||
/// <item><see cref="SessionValidationAction.Reject"/> → <see cref="CookieValidatePrincipalContext.RejectPrincipal"/> + sign out (idle-timeout — the only sign-out path).</item>
|
||||
/// <item><see cref="SessionValidationAction.Replace"/> → <see cref="CookieValidatePrincipalContext.ReplacePrincipal"/> + <c>ShouldRenew = true</c> (role mapping refreshed).</item>
|
||||
/// <item><see cref="SessionValidationAction.Keep"/> → no-op (no refresh due, or a swallowed refresh fault).</item>
|
||||
/// </list>
|
||||
/// All logic lives in <see cref="CookieSessionValidator.ValidateAsync"/>, which never
|
||||
/// throws, so this adapter cannot bubble an exception out into the request pipeline.
|
||||
/// </summary>
|
||||
/// <param name="context">The cookie validation context supplied by the middleware.</param>
|
||||
/// <returns>A task that completes when the decision has been applied.</returns>
|
||||
internal static async Task OnValidatePrincipalAsync(CookieValidatePrincipalContext context)
|
||||
{
|
||||
var validator = context.HttpContext.RequestServices.GetRequiredService<CookieSessionValidator>();
|
||||
|
||||
var result = await validator
|
||||
.ValidateAsync(context.Principal, context.HttpContext.RequestAborted)
|
||||
.ConfigureAwait(false);
|
||||
|
||||
await ApplyValidationResultAsync(context, result).ConfigureAwait(false);
|
||||
}
|
||||
|
||||
/// <summary>
|
||||
/// Applies a <see cref="SessionValidationResult"/> to a
|
||||
/// <see cref="CookieValidatePrincipalContext"/>: the pure decision-application
|
||||
/// step extracted from <see cref="OnValidatePrincipalAsync"/> so it can be
|
||||
/// exercised in unit tests without a live DI container resolving
|
||||
/// <see cref="CookieSessionValidator"/>.
|
||||
/// </summary>
|
||||
/// <param name="context">The cookie validation context to mutate.</param>
|
||||
/// <param name="result">The decision produced by <see cref="CookieSessionValidator.ValidateAsync"/>.</param>
|
||||
/// <returns>A task that completes when the result has been applied.</returns>
|
||||
internal static async Task ApplyValidationResultAsync(
|
||||
CookieValidatePrincipalContext context,
|
||||
SessionValidationResult result)
|
||||
{
|
||||
switch (result.Action)
|
||||
{
|
||||
case SessionValidationAction.Reject:
|
||||
// Idle-timeout: drop the principal AND clear the cookie so the next
|
||||
// request is treated as anonymous and redirected to /login.
|
||||
context.RejectPrincipal();
|
||||
await context.HttpContext
|
||||
.SignOutAsync(CookieAuthenticationDefaults.AuthenticationScheme)
|
||||
.ConfigureAwait(false);
|
||||
break;
|
||||
|
||||
case SessionValidationAction.Replace when result.Principal is not null:
|
||||
// Role mapping refreshed from stored groups — swap in the rebuilt
|
||||
// principal and re-issue the cookie so the new claims persist.
|
||||
context.ReplacePrincipal(result.Principal);
|
||||
context.ShouldRenew = true;
|
||||
break;
|
||||
|
||||
case SessionValidationAction.Keep:
|
||||
default:
|
||||
// Leave the principal untouched.
|
||||
break;
|
||||
}
|
||||
}
|
||||
|
||||
/// <summary>
|
||||
/// Registers security-related Akka actors (placeholder for future actor registrations).
|
||||
/// </summary>
|
||||
|
||||
@@ -0,0 +1,116 @@
|
||||
using System.Security.Claims;
|
||||
using Microsoft.AspNetCore.Authentication.Cookies;
|
||||
|
||||
namespace ZB.MOM.WW.ScadaBridge.Security;
|
||||
|
||||
/// <summary>
|
||||
/// M2.19 (#15): the single, shared source of truth for the FULL set of claims that
|
||||
/// back an interactive cookie session. BOTH the <c>/auth/login</c> endpoint and the
|
||||
/// <c>OnValidatePrincipal</c> mid-session role-refresh path build their principal
|
||||
/// through <see cref="Build"/>, so the two can never drift — the spec requires the
|
||||
/// refresh to "rebuild claims identically to /auth/login".
|
||||
/// </summary>
|
||||
/// <remarks>
|
||||
/// The claim shape is exactly what the login endpoint historically minted, plus the
|
||||
/// two M2.19 additions:
|
||||
/// <list type="bullet">
|
||||
/// <item><see cref="ClaimTypes.Name"/> — resolves <c>Identity.Name</c>.</item>
|
||||
/// <item><see cref="JwtTokenService.DisplayNameClaimType"/> — human display name.</item>
|
||||
/// <item><see cref="JwtTokenService.UsernameClaimType"/> — canonical username.</item>
|
||||
/// <item><see cref="JwtTokenService.RoleClaimType"/> — one per mapped role.</item>
|
||||
/// <item><see cref="JwtTokenService.SiteIdClaimType"/> — one per permitted site,
|
||||
/// ONLY when the mapping is not system-wide (deny-by-omission preserved).</item>
|
||||
/// <item><see cref="JwtTokenService.GroupClaimType"/> — one per raw LDAP group
|
||||
/// (M2.19): the durable input the mid-session RoleMapper re-run consumes.</item>
|
||||
/// <item><see cref="JwtTokenService.LastRoleRefreshClaimType"/> — the role-mapping
|
||||
/// refresh anchor (M2.19), ISO-8601 round-trippable.</item>
|
||||
/// <item><see cref="JwtTokenService.LastActivityClaimType"/> — the idle-timeout
|
||||
/// anchor; seeded to the refresh timestamp at login so idle-timeout can be
|
||||
/// enforced consistently from the very first request.</item>
|
||||
/// </list>
|
||||
/// The <see cref="ClaimsIdentity"/> is built with <c>nameType = ClaimTypes.Name</c>
|
||||
/// and <c>roleType = RoleClaimType</c> so <c>Identity.Name</c> / <c>IsInRole</c> /
|
||||
/// <c>[Authorize(Roles=…)]</c> resolve against exactly the canonical types minted here.
|
||||
/// </remarks>
|
||||
public static class SessionClaimBuilder
|
||||
{
|
||||
/// <summary>
|
||||
/// Builds the full cookie-session <see cref="ClaimsPrincipal"/> from the resolved
|
||||
/// identity, the raw LDAP groups, the DB-backed role mapping, and the refresh
|
||||
/// timestamp. Used identically by <c>/auth/login</c> and the
|
||||
/// <c>OnValidatePrincipal</c> refresh path so the two cannot diverge.
|
||||
/// </summary>
|
||||
/// <param name="username">The canonical authenticated username (becomes <see cref="ClaimTypes.Name"/> + <see cref="JwtTokenService.UsernameClaimType"/>).</param>
|
||||
/// <param name="displayName">The human-readable display name.</param>
|
||||
/// <param name="groups">The user's raw LDAP groups, stored one per <see cref="JwtTokenService.GroupClaimType"/> claim.</param>
|
||||
/// <param name="mapping">The DB-backed role mapping (roles + permitted sites + system-wide flag).</param>
|
||||
/// <param name="refreshTimestamp">The role-mapping refresh anchor; also seeds the last-activity anchor.</param>
|
||||
/// <param name="authenticationType">The authentication type stamped on the identity (defaults to the cookie scheme).</param>
|
||||
/// <returns>A fully populated cookie <see cref="ClaimsPrincipal"/>.</returns>
|
||||
public static ClaimsPrincipal Build(
|
||||
string username,
|
||||
string displayName,
|
||||
IReadOnlyList<string> groups,
|
||||
RoleMappingResult mapping,
|
||||
DateTimeOffset refreshTimestamp,
|
||||
string authenticationType = CookieAuthenticationDefaults.AuthenticationScheme)
|
||||
{
|
||||
ArgumentNullException.ThrowIfNull(username);
|
||||
ArgumentNullException.ThrowIfNull(displayName);
|
||||
ArgumentNullException.ThrowIfNull(groups);
|
||||
ArgumentNullException.ThrowIfNull(mapping);
|
||||
|
||||
var refreshStamp = refreshTimestamp.ToString("o");
|
||||
|
||||
var claims = new List<Claim>
|
||||
{
|
||||
new(ClaimTypes.Name, username),
|
||||
new(JwtTokenService.DisplayNameClaimType, displayName),
|
||||
new(JwtTokenService.UsernameClaimType, username),
|
||||
// Role-refresh anchor AND idle anchor are seeded from the same instant at
// build time. They stay in lockstep: OnValidatePrincipal never advances
// LastActivity on its own, so both anchors move together only when a role
// refresh rebuilds the principal through this method.
|
||||
new(JwtTokenService.LastRoleRefreshClaimType, refreshStamp),
|
||||
new(JwtTokenService.LastActivityClaimType, refreshStamp),
|
||||
};
|
||||
|
||||
foreach (var role in mapping.Roles)
|
||||
{
|
||||
claims.Add(new Claim(JwtTokenService.RoleClaimType, role));
|
||||
}
|
||||
|
||||
// Deny-by-omission: only stamp SiteId claims for a non-system-wide mapping.
|
||||
if (!mapping.IsSystemWideDeployment)
|
||||
{
|
||||
foreach (var siteId in mapping.PermittedSiteIds)
|
||||
{
|
||||
claims.Add(new Claim(JwtTokenService.SiteIdClaimType, siteId));
|
||||
}
|
||||
}
|
||||
|
||||
// Store the raw LDAP groups so the mid-session refresh can re-run the
|
||||
// DB-backed RoleMapper without any LDAP round-trip.
|
||||
foreach (var group in groups)
|
||||
{
|
||||
claims.Add(new Claim(JwtTokenService.GroupClaimType, group));
|
||||
}
|
||||
|
||||
var identity = new ClaimsIdentity(
|
||||
claims,
|
||||
authenticationType: authenticationType,
|
||||
nameType: ClaimTypes.Name,
|
||||
roleType: JwtTokenService.RoleClaimType);
|
||||
|
||||
return new ClaimsPrincipal(identity);
|
||||
}
|
||||
|
||||
/// <summary>Reads the stored LDAP group claims (<see cref="JwtTokenService.GroupClaimType"/>) off a principal.</summary>
|
||||
/// <param name="principal">The cookie principal to read from.</param>
|
||||
/// <returns>The stored LDAP group names; empty if none were stored.</returns>
|
||||
public static IReadOnlyList<string> ReadGroups(ClaimsPrincipal principal)
|
||||
{
|
||||
ArgumentNullException.ThrowIfNull(principal);
|
||||
return principal.FindAll(JwtTokenService.GroupClaimType).Select(c => c.Value).ToList();
|
||||
}
|
||||
}
|
||||
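Passing `nameType` and `roleType` to the `ClaimsIdentity` constructor is what lets `Identity.Name` and `IsInRole` resolve against the claims minted by the builder. The following is a minimal sketch of that mechanism only. `ClaimShapeDemo` is a hypothetical class, and `"example:role"` is a placeholder because the actual value of `JwtTokenService.RoleClaimType` does not appear in this diff.

```csharp
using System;
using System.Collections.Generic;
using System.Security.Claims;

// Hypothetical demo, not part of the commit.
public static class ClaimShapeDemo
{
    // Placeholder claim type; the real RoleClaimType value is not shown in this diff.
    private const string RoleClaimType = "example:role";

    public static ClaimsPrincipal Build(string username, IEnumerable<string> roles)
    {
        var claims = new List<Claim> { new(ClaimTypes.Name, username) };
        foreach (var role in roles)
        {
            claims.Add(new Claim(RoleClaimType, role));
        }

        // nameType/roleType tell the identity which claim types back
        // Identity.Name and IsInRole respectively.
        var identity = new ClaimsIdentity(
            claims,
            authenticationType: "Cookies",
            nameType: ClaimTypes.Name,
            roleType: RoleClaimType);

        return new ClaimsPrincipal(identity);
    }
}
```

For a principal built with `Build("alice", new[] { "Operator" })`, `Identity.Name` is `"alice"` and `IsInRole("Operator")` is true, while a role that was never stamped as a claim is denied.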
@@ -35,4 +35,10 @@
|
||||
<ProjectReference Include="../ZB.MOM.WW.ScadaBridge.Commons/ZB.MOM.WW.ScadaBridge.Commons.csproj" />
|
||||
</ItemGroup>
|
||||
|
||||
<ItemGroup>
|
||||
<!-- M2.19 (#15): expose internal members (OnValidatePrincipalAsync adapter) to the
|
||||
Security test project so the adapter translation can be exercised in isolation. -->
|
||||
<InternalsVisibleTo Include="ZB.MOM.WW.ScadaBridge.Security.Tests" />
|
||||
</ItemGroup>
|
||||
|
||||
</Project>
|
||||
|
||||
@@ -32,10 +32,9 @@ public interface ISiteEventLogger
|
||||
/// <summary>
|
||||
/// SiteEventLogging-018: total number of event writes that have failed
|
||||
/// (SQLite error, disk full, bounded-queue overflow drop, etc.) since this
|
||||
/// logger was created. Available for future Health Monitoring integration —
|
||||
/// promoted onto the interface so a Health consumer can read it without a
|
||||
/// concrete-type downcast. Not yet polled by Health Monitoring; the wiring
|
||||
/// is tracked separately.
|
||||
/// logger was created. Polled by <c>SiteEventLogFailureCountReporter</c>
|
||||
/// (HealthMonitoring — M2.16 / #30) every 30 s and surfaced on the site
|
||||
/// health report as <c>SiteHealthReport.SiteEventLogWriteFailures</c>.
|
||||
/// </summary>
|
||||
long FailedWriteCount { get; }
|
||||
}
|
||||
|
||||
@@ -72,6 +72,15 @@ public class AlarmActor : ReceiveActor
|
||||
private readonly string? _onTriggerScriptName;
|
||||
private readonly Script<object?>? _onTriggerCompiledScript;
|
||||
|
||||
/// <summary>
|
||||
/// M2.5 (#9): the on-trigger script's per-script execution timeout in seconds,
|
||||
/// or null to use the global default. Forwarded to each spawned
|
||||
/// <see cref="AlarmExecutionActor"/>, which applies <c>perScript ?? global</c>
|
||||
/// (treating ≤ 0 as "use global"). The value comes from the referenced
|
||||
/// on-trigger script's <see cref="ResolvedScript.ExecutionTimeoutSeconds"/>.
|
||||
/// </summary>
|
||||
private readonly int? _onTriggerExecutionTimeoutSeconds;
|
||||
|
||||
// Expression trigger: compiled expression + the attribute snapshot it
|
||||
// evaluates against. This field is the single home for the compiled
|
||||
// expression on the hot path.
|
||||
@@ -107,6 +116,9 @@ public class AlarmActor : ReceiveActor
|
||||
/// <param name="serviceProvider">Optional DI service provider used to resolve the optional
|
||||
/// <see cref="ISiteEventLogger"/> for M1.5 <c>alarm</c> operational events. Fire-and-forget;
|
||||
/// a logging failure never affects alarm evaluation.</param>
|
||||
/// <param name="onTriggerExecutionTimeoutSeconds">M2.5 (#9): the on-trigger script's per-script
|
||||
/// execution timeout in seconds (from its <see cref="ResolvedScript.ExecutionTimeoutSeconds"/>),
|
||||
/// or null/non-positive to use the global default.</param>
|
||||
public AlarmActor(
|
||||
string alarmName,
|
||||
string instanceName,
|
||||
@@ -119,7 +131,9 @@ public class AlarmActor : ReceiveActor
|
||||
Script<object?>? compiledTriggerExpression = null,
|
||||
IReadOnlyDictionary<string, object?>? initialAttributes = null,
|
||||
ISiteHealthCollector? healthCollector = null,
|
||||
IServiceProvider? serviceProvider = null)
|
||||
IServiceProvider? serviceProvider = null,
|
||||
// M2.5 (#9): per-script timeout for the on-trigger script (null = global).
|
||||
int? onTriggerExecutionTimeoutSeconds = null)
|
||||
{
|
||||
_alarmName = alarmName;
|
||||
_instanceName = instanceName;
|
||||
@@ -135,6 +149,7 @@ public class AlarmActor : ReceiveActor
|
||||
_priority = alarmConfig.PriorityLevel;
|
||||
_onTriggerScriptName = alarmConfig.OnTriggerScriptCanonicalName;
|
||||
_onTriggerCompiledScript = onTriggerCompiledScript;
|
||||
_onTriggerExecutionTimeoutSeconds = onTriggerExecutionTimeoutSeconds;
|
||||
_compiledTriggerExpression = compiledTriggerExpression;
|
||||
|
||||
// Seed the trigger-expression attribute snapshot from the instance's
|
||||
@@ -574,7 +589,9 @@ public class AlarmActor : ReceiveActor
|
||||
_instanceActor,
|
||||
_sharedScriptLibrary,
|
||||
_options,
|
||||
_logger));
|
||||
_logger,
|
||||
// M2.5 (#9): per-script timeout from the on-trigger script (null = global).
|
||||
_onTriggerExecutionTimeoutSeconds));
|
||||
|
||||
Context.ActorOf(props, executionId);
|
||||
}
|
||||
|
||||
@@ -28,6 +28,7 @@ public class AlarmExecutionActor : ReceiveActor
|
||||
/// <param name="sharedScriptLibrary">Shared script library providing common utilities.</param>
|
||||
/// <param name="options">Site runtime configuration options, including the execution timeout.</param>
|
||||
/// <param name="logger">Logger for execution diagnostics.</param>
|
||||
/// <param name="executionTimeoutSeconds">M2.5 (#9): the on-trigger script's per-script execution timeout in seconds. Null or non-positive falls back to the global <see cref="SiteRuntimeOptions.ScriptExecutionTimeoutSeconds"/>.</param>
|
||||
public AlarmExecutionActor(
|
||||
string alarmName,
|
||||
string instanceName,
|
||||
@@ -38,7 +39,10 @@ public class AlarmExecutionActor : ReceiveActor
|
||||
IActorRef instanceActor,
|
||||
SharedScriptLibrary sharedScriptLibrary,
|
||||
SiteRuntimeOptions options,
|
||||
ILogger logger)
|
||||
ILogger logger,
|
||||
// M2.5 (#9): per-script execution timeout override (seconds) for the
|
||||
// alarm on-trigger script. Null or non-positive falls back to the global.
|
||||
int? executionTimeoutSeconds = null)
|
||||
{
|
||||
var self = Self;
|
||||
var parent = Context.Parent;
|
||||
@@ -46,7 +50,8 @@ public class AlarmExecutionActor : ReceiveActor
|
||||
ExecuteAlarmScript(
|
||||
alarmName, instanceName, level, priority, message,
|
||||
compiledScript, instanceActor,
|
||||
sharedScriptLibrary, options, self, parent, logger);
|
||||
sharedScriptLibrary, options, self, parent, logger,
|
||||
executionTimeoutSeconds);
|
||||
}
|
||||
|
||||
private static void ExecuteAlarmScript(
|
||||
@@ -61,9 +66,15 @@ public class AlarmExecutionActor : ReceiveActor
|
||||
SiteRuntimeOptions options,
|
||||
IActorRef self,
|
||||
IActorRef parent,
|
||||
ILogger logger)
|
||||
ILogger logger,
|
||||
int? executionTimeoutSeconds)
|
||||
{
|
||||
var timeout = TimeSpan.FromSeconds(options.ScriptExecutionTimeoutSeconds);
|
||||
// M2.5 (#9): per-script timeout overrides the global default. A null or
|
||||
// non-positive per-script value (≤ 0) falls back to the global.
|
||||
var timeout = TimeSpan.FromSeconds(
|
||||
executionTimeoutSeconds is { } perScript && perScript > 0
|
||||
? perScript
|
||||
: options.ScriptExecutionTimeoutSeconds);
|
||||
|
||||
// SiteRuntime-009: run the alarm on-trigger body on the dedicated
|
||||
// script-execution scheduler, not the shared .NET thread pool.
|
||||
|
||||
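The fallback rule in the hunk above (a per-script timeout wins only when it is a positive number of seconds) can be isolated into a small pure function for illustration. `TimeoutResolution.Resolve` is a hypothetical helper, not something the commit adds; it assumes the same "null or non-positive means global" rule described in the M2.5 comments.

```csharp
using System;

// Hypothetical helper, not part of the commit.
public static class TimeoutResolution
{
    // Per-script value wins only when it is a positive number of seconds;
    // null, zero, or negative falls back to the global default.
    public static TimeSpan Resolve(int? perScriptSeconds, int globalSeconds)
        => TimeSpan.FromSeconds(perScriptSeconds is { } s && s > 0 ? s : globalSeconds);
}
```

Note that this is deliberately not a plain `perScript ?? global`: a stored value of `0` is non-null but still means "use the global default".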
@@ -895,11 +895,14 @@ public class DeploymentManagerActor : ReceiveActor, IWithTimers
|
||||
}
|
||||
else
|
||||
{
|
||||
// M2.11: set InstanceNotFound=true so the caller can distinguish
|
||||
// "not deployed on this site" from a deployed-but-empty instance.
|
||||
_logger.LogWarning(
|
||||
"Debug view subscribe for unknown instance {Instance}", request.InstanceUniqueName);
|
||||
Sender.Tell(new DebugViewSnapshot(
|
||||
request.InstanceUniqueName, Array.Empty<Commons.Messages.Streaming.AttributeValueChanged>(),
|
||||
Array.Empty<Commons.Messages.Streaming.AlarmStateChanged>(), DateTimeOffset.UtcNow));
|
||||
Array.Empty<Commons.Messages.Streaming.AlarmStateChanged>(), DateTimeOffset.UtcNow,
|
||||
InstanceNotFound: true));
|
||||
}
|
||||
}
|
||||
|
||||
@@ -919,11 +922,14 @@ public class DeploymentManagerActor : ReceiveActor, IWithTimers
|
||||
}
|
||||
else
|
||||
{
|
||||
// M2.11: set InstanceNotFound=true so the caller can distinguish
|
||||
// "not deployed on this site" from a deployed-but-empty instance.
|
||||
_logger.LogWarning(
|
||||
"Debug snapshot for unknown instance {Instance}", request.InstanceUniqueName);
|
||||
Sender.Tell(new DebugViewSnapshot(
|
||||
request.InstanceUniqueName, Array.Empty<Commons.Messages.Streaming.AttributeValueChanged>(),
|
||||
Array.Empty<Commons.Messages.Streaming.AlarmStateChanged>(), DateTimeOffset.UtcNow));
|
||||
Array.Empty<Commons.Messages.Streaming.AlarmStateChanged>(), DateTimeOffset.UtcNow,
|
||||
InstanceNotFound: true));
|
||||
}
|
||||
}
|
||||
|
||||
|
||||
@@ -754,6 +754,10 @@ public class InstanceActor : ReceiveActor
|
||||
foreach (var alarm in _configuration.Alarms)
|
||||
{
|
||||
Script<object?>? onTriggerScript = null;
|
||||
// M2.5 (#9): the on-trigger script's per-script execution timeout,
|
||||
// captured from its ResolvedScript so the AlarmExecutionActor can
|
||||
// apply the per-script value when positive, else the global. Null when there is no on-trigger script.
|
||||
int? onTriggerTimeoutSeconds = null;
|
||||
|
||||
// Compile on-trigger script if defined
|
||||
if (!string.IsNullOrEmpty(alarm.OnTriggerScriptCanonicalName))
|
||||
@@ -763,6 +767,7 @@ public class InstanceActor : ReceiveActor
|
||||
|
||||
if (triggerScriptDef != null)
|
||||
{
|
||||
onTriggerTimeoutSeconds = triggerScriptDef.ExecutionTimeoutSeconds;
|
||||
var result = _compilationService.Compile(
|
||||
$"alarm-trigger-{alarm.CanonicalName}", triggerScriptDef.Code);
|
||||
if (result.IsSuccess)
|
||||
@@ -794,7 +799,9 @@ public class InstanceActor : ReceiveActor
|
||||
triggerExpression,
|
||||
attributeSnapshot,
|
||||
_healthCollector,
|
||||
_serviceProvider));
|
||||
_serviceProvider,
|
||||
// M2.5 (#9): per-script timeout for the alarm on-trigger script.
|
||||
onTriggerTimeoutSeconds));
|
||||
|
||||
var actorRef = Context.ActorOf(props, $"alarm-{alarm.CanonicalName}");
|
||||
_alarmActors[alarm.CanonicalName] = actorRef;
|
||||
|
||||
@@ -43,6 +43,13 @@ public class ScriptActor : ReceiveActor, IWithTimers
|
||||
private Script<object?>? _compiledScript;
|
||||
private ScriptTriggerConfig? _triggerConfig;
|
||||
private TimeSpan? _minTimeBetweenRuns;
|
||||
|
||||
/// <summary>
|
||||
/// M2.5 (#9): the per-script execution timeout in seconds, or null to use the
|
||||
/// global default. Threaded down to each spawned <see cref="ScriptExecutionActor"/>,
|
||||
/// which applies <c>perScript ?? global</c> (and treats ≤ 0 as "use global").
|
||||
/// </summary>
|
||||
private readonly int? _executionTimeoutSeconds;
|
||||
private DateTimeOffset _lastExecutionTime = DateTimeOffset.MinValue;
|
||||
private int _executionCounter;
|
||||
private readonly Commons.Types.Scripts.ScriptScope _scope;
|
||||
@@ -112,6 +119,7 @@ public class ScriptActor : ReceiveActor, IWithTimers
|
||||
_healthCollector = healthCollector;
|
||||
_serviceProvider = serviceProvider;
|
||||
_minTimeBetweenRuns = scriptConfig.MinTimeBetweenRuns;
|
||||
_executionTimeoutSeconds = scriptConfig.ExecutionTimeoutSeconds;
|
||||
_scope = scriptConfig.Scope;
|
||||
_compiledTriggerExpression = compiledTriggerExpression;
|
||||
|
||||
@@ -426,7 +434,9 @@ public class ScriptActor : ReceiveActor, IWithTimers
|
||||
_serviceProvider,
|
||||
// Audit Log #23 (ParentExecutionId): null for trigger-driven runs;
|
||||
// an inbound-API-routed call supplies the inbound request's id.
|
||||
parentExecutionId));
|
||||
parentExecutionId,
|
||||
// M2.5 (#9): per-script timeout override (null = use global).
|
||||
_executionTimeoutSeconds));
|
||||
|
||||
Context.ActorOf(props, executionId);
|
||||
}
|
||||
|
||||
@@ -47,6 +47,7 @@ public class ScriptExecutionActor : ReceiveActor
|
||||
/// <param name="healthCollector">Optional health collector for recording execution metrics.</param>
|
||||
/// <param name="serviceProvider">Optional DI service provider for script execution services.</param>
|
||||
/// <param name="parentExecutionId">ExecutionId of the spawning inbound-API execution for audit correlation; null for normal runs.</param>
|
||||
/// <param name="executionTimeoutSeconds">M2.5 (#9): per-script execution timeout in seconds. Null or non-positive falls back to the global <see cref="SiteRuntimeOptions.ScriptExecutionTimeoutSeconds"/>.</param>
|
||||
public ScriptExecutionActor(
|
||||
string scriptName,
|
||||
string instanceName,
|
||||
@@ -65,7 +66,10 @@ public class ScriptExecutionActor : ReceiveActor
|
||||
// Audit Log #23 (ParentExecutionId): the spawning execution's
|
||||
// ExecutionId for an inbound-API-routed call. Null for normal
|
||||
// (tag-change / timer) runs and nested Script.Call invocations.
|
||||
Guid? parentExecutionId = null)
|
||||
Guid? parentExecutionId = null,
|
||||
// M2.5 (#9): per-script execution timeout override (seconds). Null or
|
||||
// non-positive falls back to the global ScriptExecutionTimeoutSeconds.
|
||||
int? executionTimeoutSeconds = null)
|
||||
{
|
||||
// Immediately begin execution
|
||||
var self = Self;
|
||||
@@ -75,7 +79,7 @@ public class ScriptExecutionActor : ReceiveActor
|
||||
scriptName, instanceName, compiledScript, parameters, callDepth,
|
||||
instanceActor, sharedScriptLibrary, options, replyTo, correlationId,
|
||||
self, parent, logger, scope, healthCollector, serviceProvider,
|
||||
parentExecutionId);
|
||||
parentExecutionId, executionTimeoutSeconds);
|
||||
}
|
||||
|
||||
private static void ExecuteScript(
|
||||
@@ -95,9 +99,15 @@ public class ScriptExecutionActor : ReceiveActor
|
||||
Commons.Types.Scripts.ScriptScope scope,
|
||||
ISiteHealthCollector? healthCollector,
|
||||
IServiceProvider? serviceProvider,
|
||||
Guid? parentExecutionId)
|
||||
Guid? parentExecutionId,
|
||||
int? executionTimeoutSeconds)
|
||||
{
|
||||
var timeout = TimeSpan.FromSeconds(options.ScriptExecutionTimeoutSeconds);
|
||||
// M2.5 (#9): per-script timeout overrides the global default. A null or
|
||||
// non-positive per-script value (≤ 0) falls back to the global.
|
||||
var timeout = TimeSpan.FromSeconds(
|
||||
executionTimeoutSeconds is { } perScript && perScript > 0
|
||||
? perScript
|
||||
: options.ScriptExecutionTimeoutSeconds);
|
||||
|
||||
// SiteRuntime-009: run the script body on the dedicated script-execution
|
||||
// scheduler, not the shared .NET thread pool, so blocking script I/O cannot
|
||||
@@ -207,7 +217,11 @@ public class ScriptExecutionActor : ReceiveActor
|
||||
// and the four cached-call telemetry constructors can stamp
|
||||
// it onto NotificationSubmit.SourceNode and
|
||||
// SiteCallOperational.SourceNode respectively.
|
||||
sourceNode: sourceNode);
|
||||
sourceNode: sourceNode,
|
||||
// M2.12 (#25): thread the singleton site event logger so
|
||||
// recursion-limit violations at CallScript/CallShared emit a
|
||||
// script Error site event in addition to ILogger.LogError.
|
||||
siteEventLogger: siteEventLogger);
|
||||
|
||||
var globals = new ScriptGlobals
|
||||
{
|
||||
|
||||
@@ -13,6 +13,7 @@ using ZB.MOM.WW.ScadaBridge.Commons.Types;
|
||||
using ZB.MOM.WW.ScadaBridge.Commons.Types.Audit;
|
||||
using ZB.MOM.WW.ScadaBridge.Commons.Types.Enums;
|
||||
using AuditEvent = ZB.MOM.WW.Audit.AuditEvent;
|
||||
using ZB.MOM.WW.ScadaBridge.SiteEventLogging;
|
||||
using ZB.MOM.WW.ScadaBridge.StoreAndForward;
|
||||
|
||||
namespace ZB.MOM.WW.ScadaBridge.SiteRuntime.Scripts;
|
||||
@@ -94,6 +95,13 @@ public class ScriptRuntimeContext
|
||||
/// </summary>
|
||||
private readonly string? _sourceScript;
|
||||
|
||||
/// <summary>
|
||||
/// M2.12 (#25): site event logger for recording recursion-limit violations
|
||||
/// to the local SQLite event log. Optional — when null the emission is
|
||||
/// skipped; the existing <c>_logger.LogError</c> + throw path is unchanged.
|
||||
/// </summary>
|
||||
private readonly ISiteEventLogger? _siteEventLogger;
|
||||
|
||||
/// <summary>
|
||||
/// Audit Log #23: best-effort emitter for boundary-crossing actions executed
|
||||
/// by the script. Optional — when null the helpers degrade to a no-op audit
|
||||
@@ -179,6 +187,13 @@ public class ScriptRuntimeContext
|
||||
/// <paramref name="executionId"/>; this only records the spawner.
|
||||
/// </param>
|
||||
/// <param name="sourceNode">Optional cluster node identifier (node-a/node-b) for audit trail stamping.</param>
|
||||
/// <param name="siteEventLogger">
|
||||
/// M2.12 (#25): optional site event logger. When supplied, recursion-limit
|
||||
/// violations at <c>CallScript</c> and <c>CallShared</c> emit a
|
||||
/// <c>script</c> Error event in addition to the existing
|
||||
/// <c>ILogger.LogError</c> + throw. When null the existing behaviour is
|
||||
/// unchanged; all existing callers and tests remain source-compatible.
|
||||
/// </param>
|
||||
public ScriptRuntimeContext(
|
||||
IActorRef instanceActor,
|
||||
IActorRef self,
|
||||
@@ -199,7 +214,8 @@ public class ScriptRuntimeContext
|
||||
ICachedCallTelemetryForwarder? cachedForwarder = null,
|
||||
Guid? executionId = null,
|
||||
Guid? parentExecutionId = null,
|
||||
string? sourceNode = null)
|
||||
string? sourceNode = null,
|
||||
ISiteEventLogger? siteEventLogger = null)
|
||||
{
|
||||
_instanceActor = instanceActor;
|
||||
_self = self;
|
||||
@@ -227,6 +243,44 @@ public class ScriptRuntimeContext
|
||||
// Audit Log #23 (ParentExecutionId): stored verbatim — no `?? NewGuid()`
|
||||
// fallback. A non-routed run legitimately has no parent and stays null.
|
||||
_parentExecutionId = parentExecutionId;
|
||||
// M2.12 (#25): optional — null when not wired (tests / AlarmExecutionActor).
|
||||
_siteEventLogger = siteEventLogger;
|
||||
}
|
||||
|
||||
/// <summary>
|
||||
/// M2.12 (#25): fire-and-forget emission of a <c>script</c> Error site event
|
||||
/// for a recursion-limit violation. Mirrors the call shape used by
|
||||
/// <c>ScriptExecutionActor</c>'s catch blocks (WP-32 / M1.8). A fault from
|
||||
/// the site-event logger is observed-and-dropped (best-effort) via
|
||||
/// <c>ContinueWith(OnlyOnFaulted)</c> — it never blocks or faults the
|
||||
/// <c>_logger.LogError</c> + throw path that follows. A null logger is a no-op.
|
||||
/// </summary>
|
||||
private void EmitRecursionLimitEventAsync(string msg)
|
||||
{
|
||||
if (_siteEventLogger == null)
|
||||
return;
|
||||
|
||||
var source = string.IsNullOrEmpty(_instanceName)
|
||||
? "recursion-guard"
|
||||
: $"InstanceScript:{_instanceName}";
|
||||
|
||||
var logTask = _siteEventLogger.LogEventAsync("script", "Error", _instanceName, source, msg);
|
||||
if (!logTask.IsCompleted)
|
||||
{
|
||||
logTask.ContinueWith(
|
||||
t => _logger.LogWarning(t.Exception,
|
||||
"Site event log write failed for recursion-limit violation on instance '{Instance}'",
|
||||
_instanceName),
|
||||
CancellationToken.None,
|
||||
TaskContinuationOptions.OnlyOnFaulted | TaskContinuationOptions.ExecuteSynchronously,
|
||||
TaskScheduler.Default);
|
||||
}
|
||||
else if (logTask.IsFaulted)
|
||||
{
|
||||
_logger.LogWarning(logTask.Exception,
|
||||
"Site event log write failed for recursion-limit violation on instance '{Instance}'",
|
||||
_instanceName);
|
||||
}
|
||||
}
|
||||
|
||||
/// <summary>
|
||||
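The observe-and-drop pattern used by `EmitRecursionLimitEventAsync` can be shown on its own. The sketch below is an assumption-laden illustration, not the commit's code: `ObserveFault` is a hypothetical helper, and the `onFault` callback stands in for the `_logger.LogWarning` call.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;
#nullable enable

// Hypothetical helper (not part of the commit): observe a fault on a
// fire-and-forget task without awaiting it or blocking the caller.
static void ObserveFault(Task task, Action<Exception?> onFault)
{
    if (!task.IsCompleted)
    {
        // Still running: attach a continuation that runs only if it faults.
        task.ContinueWith(
            t => onFault(t.Exception),
            CancellationToken.None,
            TaskContinuationOptions.OnlyOnFaulted | TaskContinuationOptions.ExecuteSynchronously,
            TaskScheduler.Default);
    }
    else if (task.IsFaulted)
    {
        // Already finished with an error: report it inline.
        onFault(task.Exception);
    }
}

// Usage: an already-faulted task is reported synchronously.
string? seen = null;
ObserveFault(
    Task.FromException(new InvalidOperationException("disk full")),
    ex => seen = ex?.InnerException?.Message);
Console.WriteLine(seen);
```

Handling the already-completed case separately matters because a logger that fails synchronously returns a task that is faulted before any continuation could be attached conditionally on `IsCompleted`.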
@@ -302,6 +356,8 @@ public class ScriptRuntimeContext
|
||||
var msg = $"Script call depth exceeded maximum of {_maxCallDepth}. " +
|
||||
$"CallScript('{scriptName}') rejected at depth {nextDepth}.";
|
||||
_logger.LogError(msg);
|
||||
// M2.12 (#25): emit to site event log in addition to ILogger; fire-and-forget.
|
||||
EmitRecursionLimitEventAsync(msg);
|
||||
throw new InvalidOperationException(msg);
|
||||
}
|
||||
|
||||
@@ -464,6 +520,9 @@ public class ScriptRuntimeContext
|
||||
var msg = $"Script call depth exceeded maximum of {_maxCallDepth}. " +
|
||||
$"CallShared('{scriptName}') rejected at depth {nextDepth}.";
|
||||
_logger.LogError(msg);
|
||||
// M2.12 (#25): emit to site event log via the parent context's
|
||||
// helper — single emission path, fire-and-forget.
|
||||
_context.EmitRecursionLimitEventAsync(msg);
|
||||
throw new InvalidOperationException(msg);
|
||||
}
|
||||
|
||||
@@ -1326,9 +1385,20 @@ public class ScriptRuntimeContext
|
||||
name, trackedId, target, occurredAtUtc, cancellationToken)
|
||||
.ConfigureAwait(false);
|
||||
|
||||
// M2.3 (#7): the gateway now attempts the write immediately and
|
||||
// classifies the outcome (mirroring ExternalSystem.CachedCall). The
|
||||
// result is retained because the immediate paths (WasBuffered=false —
|
||||
// immediate success OR a synchronous permanent failure) bypass the
|
||||
// S&F retry loop entirely, so no retry-loop telemetry ever fires.
|
||||
// This helper must emit the Attempted + CachedResolve terminal rows
|
||||
// itself, otherwise Tracking.Status(id) would stay Submitted forever
|
||||
// and the audit log would be missing the terminal lifecycle. The
|
||||
// WasBuffered=true path is unaffected — the S&F retry loop owns the
|
||||
// Attempted + Resolve emissions there.
|
||||
ExternalCallResult? result;
|
||||
try
|
||||
{
|
||||
await _gateway.CachedWriteAsync(
|
||||
result = await _gateway.CachedWriteAsync(
|
||||
name, sql, parameters, _instanceName, cancellationToken, trackedId,
|
||||
// Audit Log #23 (ExecutionId Task 4): thread the script
|
||||
// execution's ExecutionId + SourceScript so a buffered
|
||||
@@ -1350,9 +1420,148 @@ public class ScriptRuntimeContext
|
||||
throw;
|
||||
}
|
||||
|
||||
// M2.3 (#7): immediate-completion lifecycle — emit the missing
|
||||
// Attempted + CachedResolve rows when the underlying write resolved
|
||||
// without engaging the store-and-forward retry loop (immediate
|
||||
// success or a synchronous permanent failure).
|
||||
if (result is { WasBuffered: false })
|
||||
{
|
||||
await EmitImmediateDbTerminalTelemetryAsync(
|
||||
name, target, trackedId, result, cancellationToken)
|
||||
.ConfigureAwait(false);
|
||||
}
|
||||
|
||||
return trackedId;
|
||||
}
|
||||
|
||||
/// <summary>
|
||||
/// M2.3 (#7): best-effort emission of the immediate-completion lifecycle
|
||||
/// for a <c>Database.CachedWrite</c> that resolved without the S&F
|
||||
/// retry loop — emits an <c>Attempted</c> row then a terminal
|
||||
/// <c>CachedResolve</c> row (<c>Delivered</c> on success, <c>Failed</c> on
|
||||
/// a synchronous permanent SQL error). The DB parallel of
|
||||
/// <see cref="EmitImmediateTerminalTelemetryAsync"/>. Any forwarder
|
||||
/// failure is logged and swallowed (alog.md §7).
|
||||
/// </summary>
|
||||
private async Task EmitImmediateDbTerminalTelemetryAsync(
|
||||
string connectionName,
|
||||
string target,
|
||||
TrackedOperationId trackedId,
|
||||
ExternalCallResult result,
|
||||
CancellationToken cancellationToken)
|
||||
{
|
||||
if (_cachedForwarder == null)
|
||||
{
|
||||
return;
|
||||
}
|
||||
|
||||
var occurredAtUtc = DateTime.UtcNow;
|
||||
|
||||
// Status mapping mirrors the API path: success -> Delivered, a
|
||||
// synchronous permanent failure -> Failed. A transient failure never
|
||||
// reaches here (WasBuffered=true), so "the immediate attempt failed
|
||||
// and the operation is done" always means a permanent failure.
|
||||
var auditTerminalStatus = result.Success ? AuditStatus.Delivered : AuditStatus.Failed;
|
||||
var operationalTerminalStatus = result.Success ? "Delivered" : "Failed";
|
||||
|
||||
// --- Attempted row -------------------------------------------------
|
||||
CachedCallTelemetry? attempted = TryBuildDbTerminalTelemetry(
|
||||
connectionName, target, trackedId, occurredAtUtc,
|
||||
AuditKind.DbWriteCached, AuditStatus.Attempted, "Attempted",
|
||||
result, isTerminal: false);
|
||||
|
||||
if (attempted is not null)
|
||||
{
|
||||
try
|
||||
{
|
||||
await _cachedForwarder.ForwardAsync(attempted, cancellationToken)
|
||||
.ConfigureAwait(false);
|
||||
}
|
||||
catch (Exception ex)
|
||||
{
|
||||
_logger.LogWarning(ex,
|
||||
"Immediate-Attempted telemetry forward failed for Database.CachedWrite {Connection} (TrackedOperationId {Id})",
|
||||
connectionName, trackedId);
|
||||
}
|
||||
}
|
||||
|
||||
// --- CachedResolve row --------------------------------------------
|
||||
CachedCallTelemetry? resolve = TryBuildDbTerminalTelemetry(
|
||||
connectionName, target, trackedId, occurredAtUtc,
|
||||
AuditKind.CachedResolve, auditTerminalStatus, operationalTerminalStatus,
|
||||
result, isTerminal: true);
|
||||
|
||||
if (resolve is not null)
|
||||
{
|
||||
try
|
||||
{
|
||||
await _cachedForwarder.ForwardAsync(resolve, cancellationToken)
|
||||
.ConfigureAwait(false);
|
||||
}
|
||||
catch (Exception ex)
|
||||
{
|
||||
_logger.LogWarning(ex,
|
||||
"Immediate-CachedResolve telemetry forward failed for Database.CachedWrite {Connection} (TrackedOperationId {Id})",
|
||||
connectionName, trackedId);
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
/// <summary>
|
||||
/// Builds one immediate-completion <c>DbOutbound</c> telemetry packet, or
|
||||
/// returns <c>null</c> (and logs) when construction throws — so a build
|
||||
/// failure skips emission rather than aborting the script.
|
||||
/// </summary>
|
||||
private CachedCallTelemetry? TryBuildDbTerminalTelemetry(
|
||||
string connectionName,
|
||||
string target,
|
||||
TrackedOperationId trackedId,
|
||||
DateTime occurredAtUtc,
|
||||
AuditKind kind,
|
||||
AuditStatus auditStatus,
|
||||
string operationalStatus,
|
||||
ExternalCallResult result,
|
||||
bool isTerminal)
|
||||
{
|
||||
try
|
||||
{
|
||||
return new CachedCallTelemetry(
|
||||
Audit: ScadaBridgeAuditEventFactory.Create(
|
||||
channel: AuditChannel.DbOutbound,
|
||||
kind: kind,
|
||||
status: auditStatus,
|
||||
occurredAtUtc: DateTime.SpecifyKind(occurredAtUtc, DateTimeKind.Utc),
|
||||
target: target,
|
||||
correlationId: trackedId.Value,
|
||||
executionId: _executionId,
|
||||
parentExecutionId: _parentExecutionId,
|
||||
sourceSiteId: string.IsNullOrEmpty(_siteId) ? null : _siteId,
|
||||
sourceInstanceId: _instanceName,
|
||||
sourceScript: _sourceScript,
|
||||
errorMessage: result.Success ? null : result.ErrorMessage),
|
||||
Operational: new SiteCallOperational(
|
||||
TrackedOperationId: trackedId,
|
||||
Channel: "DbOutbound",
|
||||
Target: target,
|
||||
SourceSite: _siteId,
|
||||
SourceNode: _sourceNode,
|
||||
Status: operationalStatus,
|
||||
RetryCount: 0,
|
||||
LastError: result.Success ? null : result.ErrorMessage,
|
||||
HttpStatus: null,
|
||||
CreatedAtUtc: occurredAtUtc,
|
||||
UpdatedAtUtc: occurredAtUtc,
|
||||
TerminalAtUtc: isTerminal ? occurredAtUtc : null));
|
||||
}
|
||||
catch (Exception buildEx)
|
||||
{
|
||||
_logger.LogWarning(buildEx,
|
||||
"Failed to build immediate-{Kind} telemetry for Database.CachedWrite {Connection} (TrackedOperationId {Id}) — skipping emission",
|
||||
kind, connectionName, trackedId);
|
||||
return null;
|
||||
}
|
||||
}
|
||||
|
||||
private async Task EmitCachedDbSubmitTelemetryAsync(
|
||||
string connectionName,
|
||||
TrackedOperationId trackedId,
|
||||
|
||||
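The immediate-completion lifecycle added for `Database.CachedWrite` reduces to a small decision table. The following sketch assumes a simplified row shape (a tuple of kind, status, and terminal flag); `ImmediateLifecycle` is a hypothetical function, and the real code builds `CachedCallTelemetry` packets instead.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical model (not part of the commit) of which rows the helper
// emits for a given write outcome.
static List<(string Kind, string Status, bool IsTerminal)> ImmediateLifecycle(
    bool wasBuffered, bool success)
{
    var rows = new List<(string Kind, string Status, bool IsTerminal)>();

    // Buffered writes are owned by the store-and-forward retry loop,
    // which emits its own Attempted + Resolve rows.
    if (wasBuffered)
        return rows;

    // Immediate paths: one Attempted row, then one terminal CachedResolve row.
    rows.Add(("DbWriteCached", "Attempted", false));
    rows.Add(("CachedResolve", success ? "Delivered" : "Failed", true));
    return rows;
}

foreach (var row in ImmediateLifecycle(wasBuffered: false, success: false))
    Console.WriteLine(row);
```

Because a transient failure is always buffered, an unbuffered failure can only be a permanent one, which is why `Failed` is a safe terminal status here.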
@@ -42,6 +42,13 @@ public class DiffService
|
||||
s => s.CanonicalName,
|
||||
ScriptsEqual);
|
||||
|
||||
// TemplateEngine-018: surface standalone connection endpoint/protocol/
|
||||
// failover drift. Per-attribute binding changes already show up under
|
||||
// AttributeChanges, but a connection's own ConfigurationJson /
|
||||
// BackupConfigurationJson / Protocol / FailoverRetryCount edits do not —
|
||||
// those only appear here.
|
||||
var connectionChanges = ComputeConnectionsDiff(oldConfig, newConfig);
|
||||
|
||||
return new ConfigurationDiff
|
||||
{
|
||||
InstanceUniqueName = newConfig.InstanceUniqueName,
|
||||
@@ -49,7 +56,8 @@ public class DiffService
|
||||
NewRevisionHash = newRevisionHash,
|
||||
AttributeChanges = attributeChanges,
|
||||
AlarmChanges = alarmChanges,
|
||||
ScriptChanges = scriptChanges
|
||||
ScriptChanges = scriptChanges,
|
||||
ConnectionChanges = connectionChanges
|
||||
};
|
||||
}
|
||||
|
||||
@@ -133,7 +141,8 @@ public class DiffService
|
||||
a.TriggerConfiguration == b.TriggerConfiguration &&
|
||||
a.ParameterDefinitions == b.ParameterDefinitions &&
|
||||
a.ReturnDefinition == b.ReturnDefinition &&
|
||||
a.MinTimeBetweenRuns == b.MinTimeBetweenRuns;
|
||||
a.MinTimeBetweenRuns == b.MinTimeBetweenRuns &&
|
||||
a.ExecutionTimeoutSeconds == b.ExecutionTimeoutSeconds;
|
||||
|
||||
/// <summary>
|
||||
/// Compares two <see cref="ConnectionConfig"/> instances for equality across
|
||||
@@ -159,11 +168,10 @@ public class DiffService
|
||||
/// TemplateEngine-018: produces a per-connection diff between two flattened
|
||||
/// configurations, emitting Added / Removed / Changed entries keyed by the
|
||||
/// connection name. Mirrors the existing <see cref="ComputeEntityDiff{T}"/>
|
||||
/// shape used for attributes / alarms / scripts but is exposed as a separate
|
||||
/// method because <see cref="ConfigurationDiff"/> in
|
||||
/// <c>ZB.MOM.WW.ScadaBridge.Commons</c> does not yet carry a <c>ConnectionChanges</c>
|
||||
/// slot — the public diff record will be extended in a paired Commons change
|
||||
/// (this file is the only one in this fix's scope). A null
|
||||
/// shape used for attributes / alarms / scripts. Called by
|
||||
/// <see cref="ComputeDiff"/> to populate
|
||||
/// <see cref="ConfigurationDiff.ConnectionChanges"/>, and exposed publicly so
|
||||
/// callers can compute connection drift in isolation. A null
|
||||
/// <c>Connections</c> dictionary on either side is treated as the empty map.
|
||||
/// </summary>
|
||||
/// <param name="oldConfig">The previously deployed configuration, or null
|
||||
|
||||
@@ -830,6 +830,10 @@ public class FlatteningService
|
||||
ParameterDefinitions = script.ParameterDefinitions,
|
||||
ReturnDefinition = script.ReturnDefinition,
|
||||
MinTimeBetweenRuns = script.MinTimeBetweenRuns,
|
||||
// M2.5 (#9): per-script timeout rides along on the winning row.
|
||||
// Scripts inherit/override at whole-row granularity (no per-field
|
||||
// merge), so this follows the same rule as the script body/MinTime.
|
||||
ExecutionTimeoutSeconds = script.ExecutionTimeoutSeconds,
|
||||
Source = source
|
||||
};
|
||||
idByName[script.Name] = script.Id;
|
||||
|
||||
@@ -83,7 +83,10 @@ public class RevisionHashService
|
||||
TriggerConfiguration = s.TriggerConfiguration,
|
||||
ParameterDefinitions = s.ParameterDefinitions,
|
||||
ReturnDefinition = s.ReturnDefinition,
|
||||
MinTimeBetweenRunsTicks = s.MinTimeBetweenRuns?.Ticks
|
||||
MinTimeBetweenRunsTicks = s.MinTimeBetweenRuns?.Ticks,
|
||||
// M2.5 (#9): include the per-script timeout so a change to it
|
||||
// is detected as a configuration change (staleness/redeploy).
|
||||
ExecutionTimeoutSeconds = s.ExecutionTimeoutSeconds
|
||||
})
|
||||
.ToList(),
|
||||
Connections = configuration.Connections is { Count: > 0 }
|
||||
@@ -244,6 +247,10 @@ public class RevisionHashService
|
||||
/// </summary>
|
||||
public string Code { get; init; } = string.Empty;
|
||||
/// <summary>
|
||||
/// M2.5 (#9): the per-script execution timeout in seconds (null = global).
|
||||
/// </summary>
|
||||
public int? ExecutionTimeoutSeconds { get; init; }
|
||||
/// <summary>
|
||||
/// Whether the script is locked.
|
||||
/// </summary>
|
||||
public bool IsLocked { get; init; }
|
||||
|
||||
@@ -17,7 +17,7 @@ namespace ZB.MOM.WW.ScadaBridge.TemplateEngine;
|
||||
/// Override granularity:
|
||||
/// - Attributes: Value and Description overridable; DataType and DataSourceReference fixed.
|
||||
/// - Alarms: Priority, TriggerConfiguration, Description, OnTriggerScript overridable; Name and TriggerType fixed.
|
||||
/// - Scripts: Code, TriggerConfiguration, MinTimeBetweenRuns, params/return overridable; Name fixed.
|
||||
/// - Scripts: Code, TriggerConfiguration, MinTimeBetweenRuns, ExecutionTimeoutSeconds, params/return overridable; Name fixed.
|
||||
/// - Lock flag applies to the entire member (attribute/alarm/script).
|
||||
/// </summary>
|
||||
public static class LockEnforcer
|
||||
|
||||
@@ -687,6 +687,8 @@ public class TemplateService
|
||||
existing.TriggerType = proposed.TriggerType;
|
||||
existing.TriggerConfiguration = proposed.TriggerConfiguration;
|
||||
existing.MinTimeBetweenRuns = proposed.MinTimeBetweenRuns;
|
||||
// M2.5 (#9): per-script execution timeout is an overridable field.
|
||||
existing.ExecutionTimeoutSeconds = proposed.ExecutionTimeoutSeconds;
|
||||
existing.ParameterDefinitions = proposed.ParameterDefinitions;
|
||||
existing.ReturnDefinition = proposed.ReturnDefinition;
|
||||
existing.IsLocked = proposed.IsLocked;
|
||||
@@ -1013,6 +1015,7 @@ public class TemplateService
|
||||
ParameterDefinitions = script.ParameterDefinitions,
|
||||
ReturnDefinition = script.ReturnDefinition,
|
||||
MinTimeBetweenRuns = script.MinTimeBetweenRuns,
|
||||
ExecutionTimeoutSeconds = script.ExecutionTimeoutSeconds,
|
||||
IsInherited = true,
|
||||
LockedInDerived = false,
|
||||
});
|
||||
|
||||
@@ -80,6 +80,7 @@ public class SemanticValidator
|
||||
else
|
||||
{
|
||||
ValidateCallParameters(script.CanonicalName, call, sharedParamMap, errors);
|
||||
ValidateCallReturnType(script.CanonicalName, call, sharedReturnMap, errors);
|
||||
}
|
||||
}
|
||||
else
|
||||
@@ -94,6 +95,7 @@ public class SemanticValidator
|
||||
else
|
||||
{
|
||||
ValidateCallParameters(script.CanonicalName, call, scriptParamMap, errors);
|
||||
ValidateCallReturnType(script.CanonicalName, call, scriptReturnMap, errors);
|
||||
|
||||
// Instance scripts cannot call alarm on-trigger scripts
|
||||
if (alarmOnTriggerScripts.Contains(call.TargetName))
|
||||
@@ -262,6 +264,109 @@ public class SemanticValidator
|
||||
errors.Add(ValidationEntry.Error(ValidationCategory.ParameterMismatch,
|
||||
$"Script '{callerName}' calls '{call.TargetName}' with {call.ArgumentCount} arguments but {expectedParams.Count} are expected.",
|
||||
callerName));
|
||||
// Count mismatch already reported — positional type matching below
|
||||
// would be misaligned, so don't compound the noise.
|
||||
return;
|
||||
}
|
||||
|
||||
ValidateArgumentTypes(callerName, call, expectedParams, errors);
|
||||
}
|
||||
|
||||
/// <summary>
|
||||
/// #21 — Argument-type validation. Compares each positionally-matched call
|
||||
/// argument expression against the target's declared parameter type and
|
||||
/// flags only CLEAR cross-category mismatches.
|
||||
///
|
||||
/// Conservatism (false-positive avoidance) — a parameter is checked only
|
||||
/// when BOTH sides are confidently known:
|
||||
/// <list type="bullet">
|
||||
/// <item>Declared type must normalize to a known primitive (String, Integer,
|
||||
/// Float, Boolean). <c>Object</c>/<c>List</c>/unknown declarations accept
|
||||
/// anything — never flagged.</item>
|
||||
/// <item>Argument expression type must be inferable from a literal
|
||||
/// (string/char, integer, decimal, <c>true</c>/<c>false</c>). Variables,
|
||||
/// member access, method/await chains, <c>null</c>, casts, object/array
|
||||
/// initializers, and anything else infer to Unknown and are never flagged.</item>
|
||||
/// <item>Integer⇄Float is treated as compatible (numeric widening) — never
|
||||
/// flagged.</item>
|
||||
/// </list>
|
||||
/// </summary>
|
||||
private static void ValidateArgumentTypes(
|
||||
string callerName,
|
||||
CallTarget call,
|
||||
List<string> expectedParams,
|
||||
List<ValidationEntry> errors)
|
||||
{
|
||||
// Argument expressions are aligned 1:1 with parameters here (count was
|
||||
// verified equal by the caller). If the argument text couldn't be split
|
||||
// (e.g. it wasn't captured), skip silently.
|
||||
if (call.ArgumentExpressions.Count != expectedParams.Count)
|
||||
return;
|
||||
|
||||
for (var i = 0; i < expectedParams.Count; i++)
|
||||
{
|
||||
var declared = NormalizeType(expectedParams[i]);
|
||||
if (declared is null)
|
||||
continue; // Object/List/unknown declaration accepts anything.
|
||||
|
||||
var actual = InferLiteralType(call.ArgumentExpressions[i]);
|
||||
if (actual is null)
|
||||
continue; // Can't confidently infer the argument's type.
|
||||
|
||||
if (!IsAssignable(actual.Value, declared.Value))
|
||||
{
|
||||
errors.Add(ValidationEntry.Error(ValidationCategory.ParameterMismatch,
|
||||
$"Script '{callerName}' calls '{call.TargetName}' argument {i + 1} with type '{actual}' but parameter '{expectedParams[i]}' expects '{declared}'.",
|
||||
callerName));
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
/// <summary>
|
||||
/// #20 — Return-type validation. When a call result is assigned directly
|
||||
/// into a typed local declaration (<c>int x = CallScript(...)</c>,
|
||||
/// <c>bool b = await CallShared(...)</c>), compares the LHS declared type
|
||||
/// against the target's declared return type and flags clear mismatches.
|
||||
///
|
||||
/// Conservatism (false-positive avoidance) — flagged only when ALL hold:
|
||||
/// <list type="bullet">
|
||||
/// <item>The call result is captured by a typed local whose type is a known
|
||||
/// primitive (so <c>var</c>, <c>object</c>, <c>dynamic</c>, and untyped
|
||||
/// reuse are never flagged).</item>
|
||||
/// <item>The call is the WHOLE initializer (optionally preceded by
|
||||
/// <c>await</c>). If the result feeds an expression / method chain
|
||||
/// (e.g. <c>(int)(await CallScript(...))</c>, <c>CallScript(...).X</c>)
|
||||
/// the assigned-type is not captured and nothing is flagged.</item>
|
||||
/// <item>The target declares a known-primitive return type. Missing/Object/
|
||||
/// List/unknown returns are never flagged.</item>
|
||||
/// <item>Integer⇄Float is compatible (numeric widening) — never flagged.</item>
|
||||
/// </list>
|
||||
/// </summary>
|
||||
private static void ValidateCallReturnType(
|
||||
string callerName,
|
||||
CallTarget call,
|
||||
Dictionary<string, string?> returnMap,
|
||||
List<ValidationEntry> errors)
|
||||
{
|
||||
if (call.AssignedToType is null)
|
||||
return; // Result not captured by a typed local (var/untyped/unused).
|
||||
|
||||
var expected = NormalizeType(call.AssignedToType);
|
||||
if (expected is null)
|
||||
return; // LHS isn't a known primitive — don't guess.
|
||||
|
||||
if (!returnMap.TryGetValue(call.TargetName, out var returnDef))
|
||||
return;
|
||||
|
||||
var actual = NormalizeType(ParseReturnDefinitionType(returnDef));
|
||||
if (actual is null)
|
||||
return; // Target's return type unknown/non-primitive.
|
||||
|
||||
if (!IsAssignable(actual.Value, expected.Value))
|
||||
{
|
||||
errors.Add(ValidationEntry.Error(ValidationCategory.ReturnTypeMismatch,
|
||||
$"Script '{callerName}' assigns the '{actual}' return value of '{call.TargetName}' to a '{expected}' variable.",
|
||||
callerName));
|
||||
}
|
||||
}
|
||||
|
||||
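The conservative matching rules described in the doc comments for #20 and #21 can be sketched as follows. The bodies of `InferLiteralType` and `IsAssignable` are not shown in this part of the diff, so the versions below are assumptions written to match the documented behaviour; they use plain strings for the type categories and cover only simple literals.

```csharp
using System;
using System.Globalization;
#nullable enable

// Assumed sketch: infer a type category from a literal, or null (Unknown)
// for anything that is not an obvious literal.
static string? InferLiteralType(string expr)
{
    var text = expr.Trim();
    if (text is "true" or "false") return "Boolean";
    // A single quoted string with no inner quote characters.
    if (text.Length >= 2 && text[0] == '"' && text.IndexOf('"', 1) == text.Length - 1)
        return "String";
    if (long.TryParse(text, NumberStyles.Integer, CultureInfo.InvariantCulture, out _))
        return "Integer";
    if (text.Contains('.') &&
        double.TryParse(text, NumberStyles.Float, CultureInfo.InvariantCulture, out _))
        return "Float";
    return null; // variables, calls, casts, null, etc. are never flagged
}

// Assumed sketch: same category, or Integer/Float in either direction.
static bool IsAssignable(string actual, string declared) =>
    actual == declared
    || (actual == "Integer" && declared == "Float")
    || (actual == "Float" && declared == "Integer");

Console.WriteLine(InferLiteralType("someVariable") is null);          // Unknown: skipped
Console.WriteLine(IsAssignable(InferLiteralType("42")!, "Float"));    // numeric widening
Console.WriteLine(IsAssignable(InferLiteralType("\"42\"")!, "Integer")); // clear mismatch
```

Returning null for anything uncertain keeps the validator biased toward false negatives, which matches the stated goal of flagging only clear cross-category mismatches.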
@@ -270,12 +375,90 @@ public class SemanticValidator
|
||||
var result = new Dictionary<string, List<string>>(StringComparer.Ordinal);
|
||||
foreach (var script in scripts)
|
||||
{
|
||||
var parameters = ParseParameterDefinitions(script.ParameterDefinitions);
|
||||
// Per-parameter declared TYPE in declared order (raw type strings).
|
||||
// One entry per parameter, so the existing count check is preserved
|
||||
// while #21 also has the types it needs for positional matching.
|
||||
var parameters = ParseParameterTypes(script.ParameterDefinitions);
|
||||
result[script.CanonicalName] = parameters;
|
||||
}
|
||||
return result;
|
||||
}
|
||||
|
||||
/// <summary>
|
||||
/// Parses a parameter definitions JSON string (JSON Schema or legacy flat
|
||||
/// array) and returns the declared parameter TYPE for each parameter, in
|
||||
/// declared order. Names are not needed for positional call validation; the
|
||||
/// returned count equals the parameter count (preserving the count check).
|
||||
/// </summary>
|
||||
/// <param name="parameterDefinitionsJson">JSON Schema or legacy flat-array string; null/empty returns an empty list.</param>
|
||||
/// <returns>The per-parameter raw type strings (e.g. "Int32", "string", "List").</returns>
|
||||
internal static List<string> ParseParameterTypes(string? parameterDefinitionsJson)
|
||||
{
|
||||
if (string.IsNullOrWhiteSpace(parameterDefinitionsJson))
|
||||
return [];
|
||||
|
||||
try
|
||||
{
|
||||
using var doc = JsonDocument.Parse(parameterDefinitionsJson);
|
||||
// JSON Schema: { type:"object", properties:{ name:{ type:"integer" }, ... } }
|
||||
if (doc.RootElement.ValueKind == JsonValueKind.Object)
|
||||
{
|
||||
if (doc.RootElement.TryGetProperty("properties", out var props)
|
||||
&& props.ValueKind == JsonValueKind.Object)
|
||||
{
|
||||
return props.EnumerateObject()
|
||||
.Select(p => p.Value.ValueKind == JsonValueKind.Object
|
||||
&& p.Value.TryGetProperty("type", out var t)
|
||||
&& t.ValueKind == JsonValueKind.String
|
||||
? t.GetString() ?? "unknown"
|
||||
: "unknown")
|
||||
.ToList();
|
||||
}
|
||||
}
|
||||
// Legacy flat form: [{ name, type, required? }]
|
||||
else if (doc.RootElement.ValueKind == JsonValueKind.Array)
|
||||
{
|
||||
return doc.RootElement.EnumerateArray()
|
||||
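// Guard the element and property kinds: TryGetProperty/GetString throw
// InvalidOperationException (not JsonException) on the wrong ValueKind.
.Select(e => e.ValueKind == JsonValueKind.Object
&& e.TryGetProperty("type", out var t)
&& t.ValueKind == JsonValueKind.String
? t.GetString() ?? "unknown"
: "unknown")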
|
||||
.ToList();
|
||||
}
|
||||
}
|
||||
catch (JsonException)
|
||||
{
|
||||
}
|
||||
|
||||
return [];
|
||||
}
|
||||
|
||||
/// <summary>
|
||||
/// Extracts the declared return type from a ReturnDefinition JSON string
|
||||
/// (an object with a top-level <c>type</c> property, i.e. <c>{type:"..."}</c>, for both the JSON Schema and legacy forms). Returns
|
||||
/// null when absent or unparseable.
|
||||
/// </summary>
|
||||
/// <param name="returnDefinitionJson">JSON return definition; null/empty returns null.</param>
|
||||
/// <returns>The raw return type string (e.g. "boolean", "Int32"), or null.</returns>
|
||||
internal static string? ParseReturnDefinitionType(string? returnDefinitionJson)
|
||||
{
|
||||
if (string.IsNullOrWhiteSpace(returnDefinitionJson))
|
||||
return null;
|
||||
|
||||
try
|
||||
{
|
||||
using var doc = JsonDocument.Parse(returnDefinitionJson);
|
||||
if (doc.RootElement.ValueKind == JsonValueKind.Object
|
||||
&& doc.RootElement.TryGetProperty("type", out var t)
|
||||
&& t.ValueKind == JsonValueKind.String)
|
||||
{
|
||||
return t.GetString();
|
||||
}
|
||||
}
|
||||
catch (JsonException)
|
||||
{
|
||||
}
|
||||
|
||||
return null;
|
||||
}
|
||||
|
||||
private static Dictionary<string, string?> BuildReturnMap(IReadOnlyList<ResolvedScript> scripts)
|
||||
{
|
||||
var result = new Dictionary<string, string?>(StringComparer.Ordinal);
|
||||
@@ -353,12 +536,22 @@ public class SemanticValidator
|
||||
var target = ExtractStringArgument(code, argsStart);
|
||||
if (target != null)
|
||||
{
|
||||
var argCount = CountArguments(code, argsStart);
|
||||
// First argument is the script name; the rest are the call's
|
||||
// positional arguments.
|
||||
var args = SplitCallArguments(code, argsStart);
|
||||
var argExpressions = args.Count > 1
|
||||
? args.GetRange(1, args.Count - 1)
|
||||
: new List<string>();
|
||||
|
||||
results.Add(new CallTarget
|
||||
{
|
||||
TargetName = target,
|
||||
IsShared = isShared,
|
||||
ArgumentCount = Math.Max(0, argCount - 1) // First arg is the name, rest are parameters
|
||||
ArgumentCount = argExpressions.Count,
|
||||
ArgumentExpressions = argExpressions,
|
||||
// #20: the declared type the result is assigned into, if the
|
||||
// call is the whole initializer of a typed local declaration.
|
||||
AssignedToType = ExtractAssignedToType(code, idx)
|
||||
});
|
||||
}
|
||||
|
||||
@@ -366,6 +559,372 @@ public class SemanticValidator
|
||||
}
|
||||
}
|
||||
|
||||
/// <summary>
|
||||
/// Splits a call's argument list (starting just after the opening paren)
|
||||
/// into top-level argument expressions, trimmed. Tracks parenthesis, brace,
|
||||
/// and bracket nesting plus string/char literals so object initializers,
|
||||
/// nested calls, collection expressions, and commas inside literals don't
|
||||
/// produce spurious splits. Element 0 is the script-name argument.
|
||||
/// </summary>
|
||||
private static List<string> SplitCallArguments(string code, int startPos)
|
||||
{
|
||||
var args = new List<string>();
|
||||
var depthParen = 1; // we start inside the call's own '('
|
||||
var depthBraceBracket = 0;
|
||||
var pos = startPos;
|
||||
var argStart = startPos;
|
||||
|
||||
while (pos < code.Length)
|
||||
{
|
||||
var c = code[pos];
|
||||
switch (c)
|
||||
{
|
||||
case '(':
|
||||
depthParen++;
|
||||
break;
|
||||
case ')':
|
||||
depthParen--;
|
||||
if (depthParen == 0)
|
||||
{
|
||||
AddArg(code, argStart, pos, args);
|
||||
return args;
|
||||
}
|
||||
break;
|
||||
case '{':
|
||||
case '[':
|
||||
depthBraceBracket++;
|
||||
break;
|
||||
case '}':
|
||||
case ']':
|
||||
if (depthBraceBracket > 0) depthBraceBracket--;
|
||||
break;
|
||||
case ',' when depthParen == 1 && depthBraceBracket == 0:
|
||||
AddArg(code, argStart, pos, args);
|
||||
argStart = pos + 1;
|
||||
break;
|
||||
case '"':
|
||||
case '\'':
|
||||
// Skip the literal body so its delimiters/commas are ignored.
|
||||
pos++;
|
||||
while (pos < code.Length && code[pos] != c)
|
||||
{
|
||||
if (code[pos] == '\\') pos++; // skip escaped char
|
||||
pos++;
|
||||
}
|
||||
break;
|
||||
case '/':
|
||||
// Skip C# line and block comments so commas inside them are ignored.
|
||||
// A `/` inside a string literal is already consumed above, so we only
|
||||
// reach here for real `/` tokens in code.
|
||||
if (pos + 1 < code.Length)
|
||||
{
|
||||
if (code[pos + 1] == '/')
|
||||
{
|
||||
// Line comment: skip to end-of-line.
|
||||
pos += 2;
|
||||
while (pos < code.Length && code[pos] != '\n') pos++;
|
||||
}
|
||||
else if (code[pos + 1] == '*')
|
||||
{
|
||||
// Block comment: skip to closing `*/`.
|
||||
pos += 2;
|
||||
while (pos + 1 < code.Length && !(code[pos] == '*' && code[pos + 1] == '/'))
|
||||
pos++;
|
||||
if (pos + 1 < code.Length) pos++; // advance from `*` to the closing `/`; the loop's pos++ then steps past it
|
||||
}
|
||||
}
|
||||
break;
|
||||
}
|
||||
pos++;
|
||||
}
|
||||
|
||||
// Unterminated call (shouldn't happen for compilable code) — best effort.
|
||||
AddArg(code, argStart, code.Length, args);
|
||||
return args;
|
||||
|
||||
static void AddArg(string code, int start, int end, List<string> acc)
|
||||
{
|
||||
var text = code[start..end].Trim();
|
||||
// Empty slices are dropped once the name has been captured (e.g. the
// trailing slice in ("foo",)); the first slice is always kept, so ("foo")
// yields just the name.
|
||||
if (text.Length > 0 || acc.Count == 0)
|
||||
acc.Add(text);
|
||||
}
|
||||
}
|
||||
|
||||
/// <summary>
|
||||
/// #20 inference — looks backwards from the call's start index for a typed
|
||||
/// local declaration whose initializer is exactly this call (optionally
|
||||
/// preceded by <c>await</c>). The call may be qualified by a simple receiver
|
||||
/// (<c>Instance.</c>, <c>Scripts.</c>, <c>Parent.</c>,
|
||||
/// <c>Children["x"].</c>) which is skipped. Returns the declared LHS type
|
||||
/// token, or null when the result isn't captured by a simple typed local
|
||||
/// (e.g. <c>var</c>, no assignment, reassignment to an existing variable, or
|
||||
/// the call is part of a larger expression such as a cast or longer
|
||||
/// member-access chain).
|
||||
/// </summary>
|
||||
private static string? ExtractAssignedToType(string code, int callIndex)
|
||||
{
|
||||
// Walk back over a simple dotted receiver immediately before the call —
|
||||
// e.g. the "Instance." / "Scripts." / "Children[\"x\"]." prefix on a
|
||||
// qualified call. Only identifier chars, '.', and bracketed indexers
|
||||
// (with string/identifier contents) are skipped; anything else (a ')',
|
||||
// an operator, another call's '(') means the call is embedded in a
|
||||
// larger expression and we must not infer.
|
||||
var receiverStart = SkipReceiverBackwards(code, callIndex);
|
||||
|
||||
// Walk back over whitespace immediately before the receiver/call.
|
||||
var i = receiverStart - 1;
|
||||
while (i >= 0 && char.IsWhiteSpace(code[i])) i--;
|
||||
if (i < 0) return null;
|
||||
|
||||
// The call must be the entire RHS: the char before it (after optional
|
||||
// 'await') must be '='. Anything else (')', '.', '(', operators) means
|
||||
// the result is consumed by a larger expression — don't infer.
|
||||
var beforeCall = code[..(i + 1)];
|
||||
|
||||
// Strip a trailing 'await' so "= await CallScript(...)" is handled.
|
||||
var awaitTrimmed = beforeCall.TrimEnd();
|
||||
if (awaitTrimmed.EndsWith("await", StringComparison.Ordinal)
|
||||
&& (awaitTrimmed.Length == 5 || !IsIdentifierChar(awaitTrimmed[^6])))
|
||||
{
|
||||
beforeCall = awaitTrimmed[..^5];
|
||||
}
|
||||
|
||||
beforeCall = beforeCall.TrimEnd();
|
||||
if (!beforeCall.EndsWith('=')) return null;
|
||||
// Exclude '==', '!=', '<=', '>=' (comparisons) and compound assignments such as '+=' or '|='; neither is a plain assignment.
|
||||
if (beforeCall.Length >= 2)
|
||||
{
|
||||
var prev = beforeCall[^2];
|
||||
if (prev is '=' or '!' or '<' or '>' or '+' or '-' or '*' or '/' or '%' or '&' or '|' or '^')
|
||||
return null;
|
||||
}
|
||||
|
||||
// Now parse the "<type> <name>" declaration that precedes the '='.
|
||||
var decl = beforeCall[..^1].TrimEnd();
|
||||
|
||||
// Identifier (the variable name).
|
||||
var end = decl.Length;
|
||||
var nameEnd = end;
|
||||
while (nameEnd > 0 && IsIdentifierChar(decl[nameEnd - 1])) nameEnd--;
|
||||
if (nameEnd == end) return null; // no identifier
|
||||
var nameStart = nameEnd;
|
||||
|
||||
// Whitespace between type and name.
|
||||
var ws = nameStart;
|
||||
while (ws > 0 && char.IsWhiteSpace(decl[ws - 1])) ws--;
|
||||
if (ws == nameStart) return null; // need separating whitespace → "type name"
|
||||
|
||||
// The type token (single identifier/keyword — no generics/arrays here;
|
||||
// those normalize to unknown anyway and stay unflagged).
|
||||
var typeEnd = ws;
|
||||
var typeStart = typeEnd;
|
||||
while (typeStart > 0 && IsIdentifierChar(decl[typeStart - 1])) typeStart--;
|
||||
if (typeStart == typeEnd) return null;
|
||||
|
||||
// Guard against picking up a keyword that isn't a type in this position
|
||||
// (e.g. "return x = ..."). A real declaration's type token is preceded
|
||||
// by a statement boundary or open brace, not by another identifier.
|
||||
if (typeStart > 0)
|
||||
{
|
||||
var b = typeStart - 1;
|
||||
while (b >= 0 && char.IsWhiteSpace(decl[b])) b--;
|
||||
if (b >= 0 && IsIdentifierChar(decl[b]))
|
||||
return null; // preceded by another word → not a clean declaration
|
||||
}
|
||||
|
||||
return decl[typeStart..typeEnd];
|
||||
}
|
||||
|
||||
private static bool IsIdentifierChar(char c) => char.IsLetterOrDigit(c) || c == '_';
|
||||
|
||||
/// <summary>
|
||||
/// Given the index of a <c>CallScript</c>/<c>CallShared</c> token, walks
|
||||
/// backwards over a leading receiver expression composed only of identifier
|
||||
/// chars, whitespace, '.', and bracketed indexers (<c>["x"]</c>), and returns the index
|
||||
/// where that receiver begins. If there is no '.' immediately before the
|
||||
/// token (an unqualified call) the original index is returned unchanged.
|
||||
/// Stops at the first character that can't be part of such a simple
|
||||
/// receiver, so casts/parenthesised/chained-method receivers aren't
|
||||
/// mistaken for a clean assignment target.
|
||||
/// </summary>
|
||||
private static int SkipReceiverBackwards(string code, int callIndex)
|
||||
{
|
||||
var i = callIndex - 1;
|
||||
// Optional whitespace then must be a '.' for there to be a receiver.
|
||||
while (i >= 0 && char.IsWhiteSpace(code[i])) i--;
|
||||
if (i < 0 || code[i] != '.') return callIndex;
|
||||
|
||||
var start = callIndex;
|
||||
while (i >= 0)
|
||||
{
|
||||
var c = code[i];
|
||||
if (c == '.' || IsIdentifierChar(c) || char.IsWhiteSpace(c))
|
||||
{
|
||||
start = i;
|
||||
i--;
|
||||
continue;
|
||||
}
|
||||
if (c == ']')
|
||||
{
|
||||
// Skip a single (non-nested) indexer "[ ... ]" with string or
|
||||
// identifier contents — e.g. Children["pump"].
|
||||
var j = i - 1;
|
||||
while (j >= 0 && code[j] != '[' && code[j] != '(' && code[j] != ')')
|
||||
j--;
|
||||
if (j < 0 || code[j] != '[') return start;
|
||||
start = j;
|
||||
i = j - 1;
|
||||
continue;
|
||||
}
|
||||
break;
|
||||
}
|
||||
return start;
|
||||
}
|
||||
|
||||
// ── Script-level type vocabulary (#20/#21) ──────────────────────────────
|
||||
//
|
||||
// The template scripting "type system" exposed in ParameterDefinitions /
|
||||
// ReturnDefinition is a small set: String, Integer, Float, Boolean, plus
|
||||
// Object / List (and arbitrary unrecognised names). Only the four scalar
|
||||
// primitives below are matched; everything else maps to null ("unknown"),
|
||||
// which the validators treat as "accept anything / don't flag".
|
||||
|
||||
private enum ScriptType { String, Integer, Float, Boolean }
|
||||
|
||||
/// <summary>
|
||||
/// Maps a declared type token (JSON-Schema name, legacy name, or a C# type
|
||||
/// keyword used on a call-site LHS) onto a <see cref="ScriptType"/>, or null
|
||||
/// when the type isn't one of the confidently-checkable primitives.
|
||||
/// </summary>
|
||||
private static ScriptType? NormalizeType(string? raw)
|
||||
{
|
||||
if (string.IsNullOrWhiteSpace(raw)) return null;
|
||||
return raw.Trim().ToLowerInvariant() switch
|
||||
{
|
||||
"string" or "datetime" => ScriptType.String,
|
||||
"integer" or "int" or "int32" or "int64" or "long" or "short" or "byte" => ScriptType.Integer,
|
||||
"float" or "double" or "decimal" or "number" or "single" => ScriptType.Float,
|
||||
"boolean" or "bool" => ScriptType.Boolean,
|
||||
// Object, List, array, var, dynamic, and anything else → unknown.
|
||||
_ => null,
|
||||
};
|
||||
}
|
||||
|
||||
/// <summary>
|
||||
/// Infers the <see cref="ScriptType"/> of a call-site argument expression,
|
||||
/// but ONLY for unambiguous literals. Returns null for variables, member
|
||||
/// access, method/await chains, <c>null</c>, casts, parenthesised/compound
|
||||
/// expressions, and object/array/collection initializers — those can't be
|
||||
/// statically typed here and must never be flagged.
|
||||
/// </summary>
|
||||
private static ScriptType? InferLiteralType(string expr)
|
||||
{
|
||||
expr = expr.Trim();
|
||||
if (expr.Length == 0) return null;
|
||||
|
||||
// String / char literal — but only if the WHOLE expression is the
|
||||
// literal (so "a" + x or x + "b" stays unknown).
|
||||
if ((expr[0] == '"' || expr[0] == '\'') && IsWholeStringLiteral(expr))
|
||||
return ScriptType.String;
|
||||
if (expr.StartsWith('@') && expr.Length > 1 && expr[1] == '"' && IsWholeStringLiteral(expr[1..]))
|
||||
return ScriptType.String;
|
||||
if (expr.StartsWith('$'))
|
||||
return null; // interpolated string — string-ish, but be conservative.
|
||||
|
||||
if (expr is "true" or "false")
|
||||
return ScriptType.Boolean;
|
||||
|
||||
// Numeric literal (optionally signed). Float if it has a '.', 'e'/'E'
|
||||
// exponent, or a float/double/decimal suffix; otherwise Integer.
|
||||
if (IsNumericLiteral(expr, out var isFloat))
|
||||
return isFloat ? ScriptType.Float : ScriptType.Integer;
|
||||
|
||||
return null; // Not a literal we can confidently classify.
|
||||
}
|
||||
|
||||
private static bool IsWholeStringLiteral(string expr)
|
||||
{
|
||||
if (expr.Length < 2) return false;
|
||||
var quote = expr[0];
|
||||
if (quote != '"' && quote != '\'') return false;
|
||||
var i = 1;
|
||||
while (i < expr.Length)
|
||||
{
|
||||
if (expr[i] == '\\') { i += 2; continue; }
|
||||
if (expr[i] == quote) return i == expr.Length - 1; // closing quote must be last char
|
||||
i++;
|
||||
}
|
||||
return false;
|
||||
}
|
||||
|
||||
private static bool IsNumericLiteral(string expr, out bool isFloat)
|
||||
{
|
||||
isFloat = false;
|
||||
var i = 0;
|
||||
if (expr.Length == 0) return false;
|
||||
if (expr[0] == '+' || expr[0] == '-') i++;
|
||||
|
||||
// A genuine numeric literal must start with a digit or a `.` followed by a
|
||||
// digit. Identifiers that start with `_` or a letter (e.g. `_2`, `count`)
|
||||
// are explicitly rejected here so they are inferred as Unknown, not Integer.
|
||||
if (i >= expr.Length) return false;
|
||||
var first = expr[i];
|
||||
if (first == '.')
|
||||
{
|
||||
if (i + 1 >= expr.Length || !char.IsDigit(expr[i + 1])) return false;
|
||||
}
|
||||
else if (!char.IsDigit(first))
|
||||
{
|
||||
return false; // starts with `_`, letter, or anything else → not a literal
|
||||
}
|
||||
|
||||
var sawDigit = false;
|
||||
var sawDot = false;
|
||||
var sawExp = false;
|
||||
for (; i < expr.Length; i++)
|
||||
{
|
||||
var c = expr[i];
|
||||
if (char.IsDigit(c)) { sawDigit = true; continue; }
|
||||
if (c == '_' && sawDigit) continue; // digit separator — only valid between digits
|
||||
if (c == '.' && !sawDot && !sawExp) { sawDot = true; isFloat = true; continue; }
|
||||
if ((c == 'e' || c == 'E') && !sawExp && sawDigit)
|
||||
{
|
||||
sawExp = true; isFloat = true;
|
||||
if (i + 1 < expr.Length && (expr[i + 1] == '+' || expr[i + 1] == '-')) i++;
|
||||
continue;
|
||||
}
|
||||
// Numeric suffix terminates the literal.
|
||||
if (i == expr.Length - 1 || i == expr.Length - 2)
|
||||
{
|
||||
var suffix = expr[i..].ToLowerInvariant();
|
||||
switch (suffix)
|
||||
{
|
||||
case "f": case "d": case "m": isFloat = true; return sawDigit;
|
||||
case "l": case "u": case "ul": case "lu": return sawDigit; // integer suffixes
|
||||
}
|
||||
}
|
||||
return false; // any other char → not a plain numeric literal
|
||||
}
|
||||
return sawDigit;
|
||||
}
|
||||
|
||||
/// <summary>
|
||||
/// Whether an argument/return of <paramref name="actual"/> type is
|
||||
/// acceptable where <paramref name="expected"/> is declared. Exact match, or
|
||||
/// Integer⇄Float numeric conversion in either direction. All other cross-category pairings
|
||||
/// (String↔number, String↔Boolean, Boolean↔number) are mismatches.
|
||||
/// </summary>
|
||||
private static bool IsAssignable(ScriptType actual, ScriptType expected)
|
||||
{
|
||||
if (actual == expected) return true;
|
||||
// Numeric widening / narrowing between Integer and Float is tolerated —
|
||||
// the scripting runtime coerces these and flagging them is noisy.
|
||||
return (actual == ScriptType.Integer && expected == ScriptType.Float)
|
||||
|| (actual == ScriptType.Float && expected == ScriptType.Integer);
|
||||
}
|
||||
|
||||
private static string? ExtractStringArgument(string code, int startPos)
|
||||
{
|
||||
// Skip whitespace
|
||||
@@ -387,43 +946,6 @@ public class SemanticValidator
|
||||
return code[nameStart..pos];
|
||||
}
|
||||
|
||||
private static int CountArguments(string code, int startPos)
|
||||
{
|
||||
var depth = 1;
|
||||
var count = 1; // At least one argument (the name)
|
||||
var pos = startPos;
|
||||
|
||||
while (pos < code.Length && depth > 0)
|
||||
{
|
||||
switch (code[pos])
|
||||
{
|
||||
case '(':
|
||||
depth++;
|
||||
break;
|
||||
case ')':
|
||||
depth--;
|
||||
break;
|
||||
case ',' when depth == 1:
|
||||
count++;
|
||||
break;
|
||||
case '"':
|
||||
case '\'':
|
||||
// Skip string literals
|
||||
var quote = code[pos];
|
||||
pos++;
|
||||
while (pos < code.Length && code[pos] != quote)
|
||||
{
|
||||
if (code[pos] == '\\') pos++; // Skip escaped chars
|
||||
pos++;
|
||||
}
|
||||
break;
|
||||
}
|
||||
pos++;
|
||||
}
|
||||
|
||||
return count;
|
||||
}
|
||||
|
||||
internal record CallTarget
|
||||
{
|
||||
/// <summary>Name of the script being called.</summary>
|
||||
@@ -432,5 +954,13 @@ public class SemanticValidator
|
||||
public bool IsShared { get; init; }
|
||||
/// <summary>Number of non-name arguments passed to the call.</summary>
|
||||
public int ArgumentCount { get; init; }
|
||||
/// <summary>The trimmed text of each non-name positional argument expression, in order.</summary>
|
||||
public IReadOnlyList<string> ArgumentExpressions { get; init; } = [];
|
||||
/// <summary>
|
||||
/// The declared type token the call result is assigned into, when the
|
||||
/// call is the whole initializer of a typed local declaration; otherwise
|
||||
/// null (var/untyped/unused/expression-embedded). Used by #20.
|
||||
/// </summary>
|
||||
public string? AssignedToType { get; init; }
|
||||
}
|
||||
}
|
||||
|
||||
@@ -14,7 +14,10 @@ namespace ZB.MOM.WW.ScadaBridge.TemplateEngine.Validation;
|
||||
/// 4. Alarm trigger references exist (referenced attributes must be in the flattened config)
|
||||
/// 5. Script trigger references exist (referenced attributes must be in the flattened config)
|
||||
/// 6. Expression triggers — blank check, syntax check, and attribute-reference scan
|
||||
/// 7. Connection binding completeness (all data-sourced attributes must have a binding)
|
||||
/// 7. Connection binding completeness — every data-sourced attribute must have a binding,
|
||||
/// and (on the deploy path) the bound connection must exist on the target site.
|
||||
/// Severity is context-dependent: a non-blocking Warning at template design time
|
||||
/// (bindings are set later) and a deploy-gating Error when enforced (M2.8 / #23).
|
||||
/// 8. Does NOT verify tag path resolution on devices
|
||||
/// </summary>
|
||||
public class ValidationService
|
||||
@@ -45,8 +48,44 @@ public class ValidationService
|
||||
/// </summary>
|
||||
/// <param name="configuration">The flattened configuration to validate.</param>
|
||||
/// <param name="sharedScripts">Optional list of shared scripts for validation context.</param>
|
||||
/// <param name="alarmCapableConnectionNames">
|
||||
/// Optional set of site data-connection names whose protocol resolves to an
|
||||
/// alarm-capable adapter (see
|
||||
/// <see cref="Commons.Interfaces.Protocol.AlarmCapableProtocols"/>). When supplied,
|
||||
/// the semantic validator gates every native-alarm-source binding against it.
|
||||
/// <c>null</c> skips the capability check (its absence makes the check inert).
|
||||
/// </param>
|
||||
/// <param name="enforceConnectionBindings">
|
||||
/// M2.8 (#23): controls the severity of the connection-binding-completeness check.
|
||||
/// <para>
|
||||
/// <c>false</c> (default) — template DESIGN-TIME: a data-sourced attribute that is
|
||||
/// not yet bound produces only a non-blocking <c>Warning</c>. Bindings are set later,
|
||||
/// at instance/deploy time, so an unbound data-sourced template attribute is legitimate
|
||||
/// here (see <see cref="ManagementService"/>'s ValidateTemplate path, which builds a
|
||||
/// config straight from raw template members with no bindings).
|
||||
/// </para>
|
||||
/// <para>
|
||||
/// <c>true</c> — DEPLOY path (<see cref="DeploymentManager"/>'s FlatteningPipeline):
|
||||
/// an unbound data-sourced attribute becomes a deploy-gating <c>Error</c> (IsValid false),
|
||||
/// and — when <paramref name="siteConnectionNames"/> is supplied — a binding pointing at a
|
||||
/// connection that does not exist on the target site is also an <c>Error</c>.
|
||||
/// </para>
|
||||
/// </param>
|
||||
/// <param name="siteConnectionNames">
|
||||
/// M2.8 (#23): optional set of the data-connection names that actually exist on the
|
||||
/// target site (computed by the deploy pipeline from the site's loaded connections,
|
||||
/// mirroring <paramref name="alarmCapableConnectionNames"/>). When supplied (and
|
||||
/// <paramref name="enforceConnectionBindings"/> is <c>true</c>), every bound
|
||||
/// connection is checked against this set so a binding to a phantom/stale connection
|
||||
/// is caught. <c>null</c> skips the "exists at site" half (it stays inert).
|
||||
/// </param>
|
||||
/// <returns>A merged <see cref="ValidationResult"/> aggregating all pipeline stage outcomes.</returns>
|
||||
public ValidationResult Validate(FlattenedConfiguration configuration, IReadOnlyList<ResolvedScript>? sharedScripts = null)
|
||||
public ValidationResult Validate(
|
||||
FlattenedConfiguration configuration,
|
||||
IReadOnlyList<ResolvedScript>? sharedScripts = null,
|
||||
IReadOnlySet<string>? alarmCapableConnectionNames = null,
|
||||
bool enforceConnectionBindings = false,
|
||||
IReadOnlySet<string>? siteConnectionNames = null)
|
||||
{
|
||||
ArgumentNullException.ThrowIfNull(configuration);
|
||||
|
||||
@@ -58,8 +97,8 @@ public class ValidationService
|
||||
ValidateAlarmTriggerReferences(configuration),
|
||||
ValidateScriptTriggerReferences(configuration),
|
||||
ValidateExpressionTriggers(configuration),
|
||||
ValidateConnectionBindingCompleteness(configuration),
|
||||
_semanticValidator.Validate(configuration, sharedScripts)
|
||||
ValidateConnectionBindingCompleteness(configuration, enforceConnectionBindings, siteConnectionNames),
|
||||
_semanticValidator.Validate(configuration, sharedScripts, alarmCapableConnectionNames)
|
||||
};
|
||||
|
||||
return ValidationResult.Merge(results.ToArray());
|
||||
@@ -497,21 +536,88 @@ public class ValidationService
|
||||
}
|
||||
|
||||
/// <summary>
|
||||
/// Validates that all data-sourced attributes have connection bindings.
|
||||
/// Validates connection bindings on data-sourced attributes. Only DATA-SOURCED
|
||||
/// attributes (<see cref="ResolvedAttribute.DataSourceReference"/> != <c>null</c>)
|
||||
/// require a binding; static attributes are never flagged.
|
||||
///
|
||||
/// M2.8 (#23): the severity is context-dependent (see <paramref name="enforce"/>).
|
||||
/// At template design time (<c>enforce == false</c>) an unbound data-sourced
|
||||
/// attribute is legitimate (bindings are set later) so it is only a non-blocking
|
||||
/// <c>Warning</c>. On the deploy path (<c>enforce == true</c>) an unbound
|
||||
/// data-sourced attribute is a deploy-gating <c>Error</c>, and — when
|
||||
/// <paramref name="siteConnectionNames"/> is supplied — a binding to a connection
|
||||
/// that does not exist on the target site is also an <c>Error</c>.
|
||||
/// </summary>
|
||||
/// <param name="configuration">The flattened configuration to validate.</param>
|
||||
/// <returns>A <see cref="ValidationResult"/> with warnings for each data-sourced attribute that lacks a connection binding.</returns>
|
||||
public static ValidationResult ValidateConnectionBindingCompleteness(FlattenedConfiguration configuration)
|
||||
/// <param name="enforce">
|
||||
/// <c>true</c> on the deploy path (unbound → Error + "exists at site" check);
|
||||
/// <c>false</c> at design time (unbound → Warning only). Defaults to <c>false</c>
|
||||
/// so design-time validation stays non-blocking.
|
||||
/// </param>
|
||||
/// <param name="siteConnectionNames">
|
||||
/// Optional set of data-connection names that actually exist on the target site.
|
||||
/// When non-<c>null</c> and <paramref name="enforce"/> is <c>true</c>, every bound
|
||||
/// connection name is checked against this set. <c>null</c> skips the "exists at
|
||||
/// site" check.
|
||||
/// </param>
|
||||
/// <returns>A <see cref="ValidationResult"/> with the binding findings at the appropriate severity.</returns>
|
||||
public static ValidationResult ValidateConnectionBindingCompleteness(
|
||||
FlattenedConfiguration configuration,
|
||||
bool enforce = false,
|
||||
IReadOnlySet<string>? siteConnectionNames = null)
|
||||
{
|
||||
var errors = new List<ValidationEntry>();
|
||||
var warnings = new List<ValidationEntry>();
|
||||
|
||||
foreach (var attr in configuration.Attributes)
|
||||
{
|
||||
if (attr.DataSourceReference != null && attr.BoundDataConnectionId == null)
|
||||
// Only data-sourced attributes participate in binding validation.
|
||||
if (attr.DataSourceReference == null)
|
||||
continue;
|
||||
|
||||
if (attr.BoundDataConnectionId == null)
|
||||
{
|
||||
warnings.Add(ValidationEntry.Warning(ValidationCategory.ConnectionBinding,
|
||||
$"Attribute '{attr.CanonicalName}' has a data source reference but no connection binding.",
|
||||
// Unbound data-sourced attribute. At deploy time this gates the
|
||||
// deployment; at design time the binding is set later, so it is
|
||||
// only advisory.
|
||||
//
|
||||
// NOTE: this branch fires for TWO distinct cases that are
|
||||
// indistinguishable post-flattening:
|
||||
// 1. The user genuinely never set a binding.
|
||||
// 2. The user set a binding, but FlatteningService.ApplyConnectionBindings
|
||||
// silently dropped it because the stored DataConnectionId no longer
|
||||
// resolves to any loaded site DataConnection (i.e. the connection was
|
||||
// deleted after the binding was created). In that case the flattener
|
||||
// leaves BoundDataConnectionId == null, and the attribute falls into
|
||||
// this same "unbound → Error" path.
|
||||
// The error message covers both cases; no behavioral change is needed.
|
||||
if (enforce)
|
||||
{
|
||||
errors.Add(ValidationEntry.Error(ValidationCategory.ConnectionBinding,
|
||||
$"Attribute '{attr.CanonicalName}' has a data source reference but no connection binding.",
|
||||
attr.CanonicalName));
|
||||
}
|
||||
else
|
||||
{
|
||||
warnings.Add(ValidationEntry.Warning(ValidationCategory.ConnectionBinding,
|
||||
$"Attribute '{attr.CanonicalName}' has a data source reference but no connection binding.",
|
||||
attr.CanonicalName));
|
||||
}
|
||||
// Skip the "exists at site" check below — it only applies to bound attributes.
|
||||
continue;
|
||||
}
|
||||
|
||||
// The attribute IS bound. On the deploy path, verify the bound connection
|
||||
// actually exists on the target site (resolve against the site's connection
|
||||
// set, not just name presence in the config). A binding pointing at a
|
||||
// non-existent/stale site connection is a deploy-gating Error.
|
||||
if (enforce && siteConnectionNames != null &&
|
||||
attr.BoundDataConnectionName != null &&
|
||||
!siteConnectionNames.Contains(attr.BoundDataConnectionName))
|
||||
{
|
||||
errors.Add(ValidationEntry.Error(ValidationCategory.ConnectionBinding,
|
||||
$"Attribute '{attr.CanonicalName}' is bound to data connection '{attr.BoundDataConnectionName}' " +
|
||||
"which does not exist on the target site.",
|
||||
attr.CanonicalName));
|
||||
}
|
||||
}
|
||||
|
||||
@@ -2339,6 +2339,7 @@ public sealed class BundleImporter : IBundleImporter
|
||||
ParameterDefinitions = s.ParameterDefinitions,
|
||||
ReturnDefinition = s.ReturnDefinition,
|
||||
MinTimeBetweenRuns = s.MinTimeBetweenRuns,
|
||||
ExecutionTimeoutSeconds = s.ExecutionTimeoutSeconds,
|
||||
Source = "Template",
|
||||
});
|
||||
}
|
||||
|
||||
@@ -99,7 +99,10 @@ public sealed record TemplateScriptDto(
|
||||
string? ParameterDefinitions,
|
||||
string? ReturnDefinition,
|
||||
bool IsLocked,
|
||||
TimeSpan? MinTimeBetweenRuns);
|
||||
TimeSpan? MinTimeBetweenRuns,
|
||||
// M2.5 (#9): per-script execution timeout (seconds). Additive trailing field;
|
||||
// null on bundles written before this field existed.
|
||||
int? ExecutionTimeoutSeconds = null);
|
||||
|
||||
public sealed record TemplateCompositionDto(
|
||||
string InstanceName,
|
||||
|
||||
@@ -74,7 +74,8 @@ public sealed class EntitySerializer
|
||||
ParameterDefinitions: s.ParameterDefinitions,
|
||||
ReturnDefinition: s.ReturnDefinition,
|
||||
IsLocked: s.IsLocked,
|
||||
- MinTimeBetweenRuns: s.MinTimeBetweenRuns)).ToList(),
+ MinTimeBetweenRuns: s.MinTimeBetweenRuns,
+ ExecutionTimeoutSeconds: s.ExecutionTimeoutSeconds)).ToList(),
|
||||
Compositions: t.Compositions.Select(c => new TemplateCompositionDto(
|
||||
InstanceName: c.InstanceName,
|
||||
ComposedTemplateName: templateNameById.TryGetValue(c.ComposedTemplateId, out var cn) ? cn : string.Empty)).ToList());
|
||||
@@ -227,6 +228,7 @@ public sealed class EntitySerializer
|
||||
ReturnDefinition = s.ReturnDefinition,
|
||||
IsLocked = s.IsLocked,
|
||||
MinTimeBetweenRuns = s.MinTimeBetweenRuns,
|
||||
ExecutionTimeoutSeconds = s.ExecutionTimeoutSeconds,
|
||||
});
|
||||
}
|
||||
return t;
|
||||
|
||||
@@ -1,4 +1,5 @@
|
||||
using System.Security.Claims;
|
||||
using System.Text.Json;
|
||||
using ZB.MOM.WW.ScadaBridge.Security;
|
||||
using Bunit;
|
||||
using Microsoft.AspNetCore.Components.Authorization;
|
||||
@@ -12,7 +13,10 @@ using ZB.MOM.WW.ScadaBridge.Commons.Entities.Sites;
|
||||
using ZB.MOM.WW.ScadaBridge.Commons.Entities.Templates;
|
||||
using ZB.MOM.WW.ScadaBridge.Commons.Interfaces.Repositories;
|
||||
using ZB.MOM.WW.ScadaBridge.Commons.Interfaces.Services;
|
||||
using ZB.MOM.WW.ScadaBridge.Commons.Types;
|
||||
using ZB.MOM.WW.ScadaBridge.Commons.Types.Enums;
|
||||
using ZB.MOM.WW.ScadaBridge.Commons.Types.Flattening;
|
||||
using ZB.MOM.WW.ScadaBridge.Commons.Entities.Deployment;
|
||||
using ZB.MOM.WW.ScadaBridge.Communication;
|
||||
using ZB.MOM.WW.ScadaBridge.DeploymentManager;
|
||||
using ZB.MOM.WW.ScadaBridge.CentralUI.Components.Shared;
|
||||
@@ -292,6 +296,90 @@ public class TopologyPageTests : BunitContext
|
||||
Assert.Throws<Bunit.MissingEventHandlerException>(() => instanceLabel.DoubleClick());
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public void Diff_ConnectionEndpointChange_RendersConnectionSection()
|
||||
{
|
||||
// TemplateEngine-018 / DeploymentManager-018: a standalone connection
|
||||
// endpoint edit (no per-attribute binding change) must surface in the
|
||||
// deployment-diff modal. Before ConnectionChanges was wired through
|
||||
// ComputeDiff + the UI, this redeploy showed only the stale-hash badge
|
||||
// with no indication that the connection endpoint had moved.
|
||||
// The DiffDialog body-scroll lock + focus call out to JS interop on
|
||||
// open; loose mode no-ops the handlers we don't explicitly set up.
|
||||
JSInterop.Mode = JSRuntimeMode.Loose;
|
||||
|
||||
var areasBySite = new Dictionary<int, IReadOnlyList<Area>>
|
||||
{
|
||||
[1] = new List<Area> { new("Line-1") { Id = 10, SiteId = 1 } }
|
||||
};
|
||||
SeedRepos(
|
||||
sites: new[] { new Site("Plant-A", "plant-a") { Id = 1 } },
|
||||
instances: new[]
|
||||
{
|
||||
new Instance("Pump-001") { Id = 100, SiteId = 1, AreaId = 10, State = InstanceState.Enabled }
|
||||
},
|
||||
areasBySite: areasBySite);
|
||||
|
||||
// Deployed snapshot: connection "plc1" points at host-a.
|
||||
var deployedConfig = new FlattenedConfiguration
|
||||
{
|
||||
InstanceUniqueName = "Pump-001",
|
||||
Connections = new Dictionary<string, ConnectionConfig>
|
||||
{
|
||||
["plc1"] = new ConnectionConfig
|
||||
{
|
||||
Protocol = "OpcUa",
|
||||
ConfigurationJson = "{\"endpoint\":\"opc.tcp://host-a:4840\"}",
|
||||
FailoverRetryCount = 3,
|
||||
}
|
||||
}
|
||||
};
|
||||
_deployRepo.GetDeployedSnapshotByInstanceIdAsync(100, Arg.Any<CancellationToken>())
|
||||
.Returns(Task.FromResult<DeployedConfigSnapshot?>(
|
||||
new DeployedConfigSnapshot("dep-1", "hash-old",
|
||||
JsonSerializer.Serialize(deployedConfig))));
|
||||
|
||||
// Current template-derived config: same connection now points at host-b.
|
||||
var currentConfig = new FlattenedConfiguration
|
||||
{
|
||||
InstanceUniqueName = "Pump-001",
|
||||
Connections = new Dictionary<string, ConnectionConfig>
|
||||
{
|
||||
["plc1"] = new ConnectionConfig
|
||||
{
|
||||
Protocol = "OpcUa",
|
||||
ConfigurationJson = "{\"endpoint\":\"opc.tcp://host-b:4840\"}",
|
||||
FailoverRetryCount = 3,
|
||||
}
|
||||
}
|
||||
};
|
||||
_pipeline.FlattenAndValidateAsync(100, Arg.Any<CancellationToken>())
|
||||
.Returns(Task.FromResult(Result<FlatteningPipelineResult>.Success(
|
||||
new FlatteningPipelineResult(currentConfig, "hash-new", ValidationResult.Success()))));
|
||||
|
||||
var cut = Render<TopologyPage>();
|
||||
FindToggleForLabel(cut, "Plant-A")!.Click();
|
||||
FindToggleForLabel(cut, "Line-1")!.Click();
|
||||
|
||||
// The per-node action menu only renders after a context-menu (right
|
||||
// click) on the instance row, so open it first, then click "Diff".
|
||||
var instanceRow = cut.FindAll(".tv-row")
|
||||
.First(row => row.QuerySelector(".tv-label")?.TextContent == "Pump-001");
|
||||
instanceRow.ContextMenu();
|
||||
|
||||
var diffButton = cut.FindAll("button.dropdown-item")
|
||||
.First(b => b.TextContent.Trim() == "Diff");
|
||||
diffButton.Click();
|
||||
|
||||
var markup = cut.Markup;
|
||||
Assert.Contains("Connections", markup);
|
||||
Assert.Contains("plc1", markup);
|
||||
Assert.Contains("host-a", markup);
|
||||
Assert.Contains("host-b", markup);
|
||||
// The change is a modification, so the row carries the "Changed" badge.
|
||||
Assert.Contains("Changed", markup);
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public void LegacyInstancesRoute_IsDeclaredOnTopologyPage()
|
||||
{
|
||||
|
||||
+439
-10
@@ -60,6 +60,50 @@ public class DebugStreamBridgeActorTests : TestKit
|
||||
return new TestContext(actor, commProbe, mockClient, events, terminated);
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public void On_InstanceNotFound_Snapshot_Forwards_To_OnEvent_Tears_Down_Stream_And_Terminates()
|
||||
{
|
||||
// M2.11 (revised for M2.18 stream-first): the gRPC subscription is now opened
|
||||
// up-front in PreStart, so when the site reports InstanceNotFound=true the
|
||||
// bridge actor must
|
||||
// (a) forward the not-found snapshot to _onEvent so DebugStreamService's TCS
|
||||
// resolves and the caller can inspect the flag,
|
||||
// (b) tear DOWN the already-opened gRPC stream (Unsubscribe the just-opened
|
||||
// correlation) rather than enter pass-through, and
|
||||
// (c) stop itself cleanly.
|
||||
var ctx = CreateBridgeActor();
|
||||
ctx.CommProbe.ExpectMsg<SiteEnvelope>(); // initial subscribe envelope
|
||||
|
||||
// Stream-first: the gRPC subscription is opened before the snapshot arrives.
|
||||
AwaitCondition(() => ctx.MockGrpcClient.SubscribeCalls.Count == 1, TimeSpan.FromSeconds(3));
|
||||
|
||||
var notFoundSnapshot = new DebugViewSnapshot(
|
||||
InstanceName,
|
||||
new List<AttributeValueChanged>(),
|
||||
new List<AlarmStateChanged>(),
|
||||
DateTimeOffset.UtcNow,
|
||||
InstanceNotFound: true);
|
||||
|
||||
Watch(ctx.BridgeActor);
|
||||
ctx.BridgeActor.Tell(notFoundSnapshot);
|
||||
|
||||
// (a) _onEvent must receive the not-found snapshot
|
||||
AwaitCondition(() => { lock (ctx.ReceivedEvents) { return ctx.ReceivedEvents.Count == 1; } },
|
||||
TimeSpan.FromSeconds(3));
|
||||
lock (ctx.ReceivedEvents)
|
||||
{
|
||||
var received = Assert.IsType<DebugViewSnapshot>(ctx.ReceivedEvents[0]);
|
||||
Assert.True(received.InstanceNotFound);
|
||||
}
|
||||
|
||||
// (b) the just-opened gRPC stream is torn down (not left running / no pass-through)
|
||||
AwaitCondition(() => ctx.MockGrpcClient.UnsubscribedCorrelationIds.Contains("corr-1"),
|
||||
TimeSpan.FromSeconds(3));
|
||||
|
||||
// (c) actor terminates cleanly
|
||||
ExpectTerminated(ctx.BridgeActor, TimeSpan.FromSeconds(3));
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public void PreStart_Sends_SubscribeDebugViewRequest_Via_ClusterClient()
|
||||
{
|
||||
@@ -94,11 +138,18 @@ public class DebugStreamBridgeActorTests : TestKit
|
||||
}
|
||||
|
||||
[Fact]
|
||||
- public void On_Snapshot_Opens_GrpcStream()
+ public void On_Snapshot_Does_Not_Open_Additional_GrpcStream()
|
||||
{
|
||||
// M2.18 stream-first: the gRPC subscription is opened in PreStart, BEFORE the
|
||||
// snapshot arrives. After the snapshot is delivered the actor switches to
|
||||
// pass-through — it must NOT open a second subscription. Exactly ONE subscribe
|
||||
// call should have been made (the PreStart one).
|
||||
var ctx = CreateBridgeActor();
|
||||
ctx.CommProbe.ExpectMsg<SiteEnvelope>();
|
||||
|
||||
// Verify the stream is already open before the snapshot.
|
||||
AwaitCondition(() => ctx.MockGrpcClient.SubscribeCalls.Count == 1, TimeSpan.FromSeconds(3));
|
||||
|
||||
var snapshot = new DebugViewSnapshot(
|
||||
InstanceName,
|
||||
new List<AttributeValueChanged>(),
|
||||
@@ -107,11 +158,12 @@ public class DebugStreamBridgeActorTests : TestKit
|
||||
|
||||
ctx.BridgeActor.Tell(snapshot);
|
||||
|
||||
- AwaitCondition(() => ctx.MockGrpcClient.SubscribeCalls.Count == 1, TimeSpan.FromSeconds(3));
-
- var call = ctx.MockGrpcClient.SubscribeCalls[0];
- Assert.Equal("corr-1", call.CorrelationId);
- Assert.Equal(InstanceName, call.InstanceUniqueName);
+ // After snapshot delivery, still exactly ONE subscribe — no additional stream opened.
+ AwaitCondition(() => { lock (ctx.ReceivedEvents) { return ctx.ReceivedEvents.Count == 1; } },
+ TimeSpan.FromSeconds(3));
+ var singleCall = Assert.Single(ctx.MockGrpcClient.SubscribeCalls);
+ Assert.Equal("corr-1", singleCall.CorrelationId);
+ Assert.Equal(InstanceName, singleCall.InstanceUniqueName);
|
||||
}
|
||||
|
||||
[Fact]
|
||||
@@ -348,6 +400,369 @@ public class DebugStreamBridgeActorTests : TestKit
|
||||
Assert.Equal("corr-1", factory.ClientFor(GrpcNodeB).SubscribeCalls[0].CorrelationId);
|
||||
}
|
||||
|
||||
// ---------------------------------------------------------------------
|
||||
// M2.18 (#26) — stream-first + replay/dedup
|
||||
// ---------------------------------------------------------------------
|
||||
|
||||
[Fact]
|
||||
public void PreStart_Opens_GrpcStream_Before_Snapshot_Arrives()
|
||||
{
|
||||
// M2.18: the gRPC subscription must be opened in PreStart (stream-first),
|
||||
// BEFORE the snapshot is delivered, so live events start flowing during the
|
||||
// snapshot-build + network-transit window. The old lifecycle opened the
|
||||
// stream only after the snapshot arrived, losing gap-window events.
|
||||
var ctx = CreateBridgeActor();
|
||||
ctx.CommProbe.ExpectMsg<SiteEnvelope>(); // initial subscribe envelope
|
||||
|
||||
// No snapshot sent yet — the stream must already be open.
|
||||
AwaitCondition(() => ctx.MockGrpcClient.SubscribeCalls.Count == 1, TimeSpan.FromSeconds(3));
|
||||
Assert.Equal("corr-1", ctx.MockGrpcClient.SubscribeCalls[0].CorrelationId);
|
||||
Assert.Equal(InstanceName, ctx.MockGrpcClient.SubscribeCalls[0].InstanceUniqueName);
|
||||
|
||||
// _onEvent must NOT have fired — buffering, not delivering.
|
||||
lock (ctx.ReceivedEvents) { Assert.Empty(ctx.ReceivedEvents); }
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public void GapWindow_Event_Buffered_Before_Snapshot_Is_Delivered_Exactly_Once_After_Snapshot()
|
||||
{
|
||||
// M2.18: an event arriving DURING the snapshot window (before the snapshot
|
||||
// is delivered) is buffered, then flushed exactly once AFTER the snapshot.
|
||||
var ctx = CreateBridgeActor();
|
||||
ctx.CommProbe.ExpectMsg<SiteEnvelope>();
|
||||
AwaitCondition(() => ctx.MockGrpcClient.SubscribeCalls.Count == 1, TimeSpan.FromSeconds(3));
|
||||
|
||||
// Live event arrives BEFORE the snapshot — its entity is NOT in the snapshot,
|
||||
// so it is a genuine gap-window event that must survive.
|
||||
var gapEvent = new AttributeValueChanged(InstanceName, "IO", "Pressure", 99.9, "Good",
|
||||
DateTimeOffset.UtcNow);
|
||||
ctx.MockGrpcClient.SubscribeCalls[0].OnEvent(gapEvent);
|
||||
|
||||
// While buffering, _onEvent has not fired.
|
||||
lock (ctx.ReceivedEvents) { Assert.Empty(ctx.ReceivedEvents); }
|
||||
|
||||
var snapshot = new DebugViewSnapshot(
|
||||
InstanceName,
|
||||
new List<AttributeValueChanged>(),
|
||||
new List<AlarmStateChanged>(),
|
||||
DateTimeOffset.UtcNow);
|
||||
ctx.BridgeActor.Tell(snapshot);
|
||||
|
||||
// snapshot then the buffered gap-window event, exactly once, in that order.
|
||||
AwaitCondition(() => { lock (ctx.ReceivedEvents) { return ctx.ReceivedEvents.Count == 2; } },
|
||||
TimeSpan.FromSeconds(3));
|
||||
lock (ctx.ReceivedEvents)
|
||||
{
|
||||
Assert.IsType<DebugViewSnapshot>(ctx.ReceivedEvents[0]);
|
||||
var flushed = Assert.IsType<AttributeValueChanged>(ctx.ReceivedEvents[1]);
|
||||
Assert.Equal("Pressure", flushed.AttributeName);
|
||||
}
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public void Buffered_Event_Already_Reflected_In_Snapshot_Is_Dropped()
|
||||
{
|
||||
// M2.18 dedup: a buffered event whose entity is in the snapshot with an equal
|
||||
// or newer snapshot timestamp (buffered.Timestamp <= snapshot.Timestamp) is
|
||||
// already reflected and must be DROPPED.
|
||||
var ctx = CreateBridgeActor();
|
||||
ctx.CommProbe.ExpectMsg<SiteEnvelope>();
|
||||
AwaitCondition(() => ctx.MockGrpcClient.SubscribeCalls.Count == 1, TimeSpan.FromSeconds(3));
|
||||
|
||||
var t0 = DateTimeOffset.UtcNow;
|
||||
|
||||
// Buffered event for "Temp" at t0.
|
||||
var buffered = new AttributeValueChanged(InstanceName, "IO", "Temp", 42.5, "Good", t0);
|
||||
ctx.MockGrpcClient.SubscribeCalls[0].OnEvent(buffered);
|
||||
|
||||
// Snapshot already contains "Temp" at the SAME timestamp t0 → buffered is a dup.
|
||||
var snapAttr = new AttributeValueChanged(InstanceName, "IO", "Temp", 42.5, "Good", t0);
|
||||
var snapshot = new DebugViewSnapshot(
|
||||
InstanceName,
|
||||
new List<AttributeValueChanged> { snapAttr },
|
||||
new List<AlarmStateChanged>(),
|
||||
t0);
|
||||
ctx.BridgeActor.Tell(snapshot);
|
||||
|
||||
// Only the snapshot is delivered; the buffered duplicate is dropped.
|
||||
AwaitCondition(() => { lock (ctx.ReceivedEvents) { return ctx.ReceivedEvents.Count == 1; } },
|
||||
TimeSpan.FromSeconds(3));
|
||||
// Give a beat to ensure no extra (dropped) event sneaks through.
|
||||
Thread.Sleep(200);
|
||||
lock (ctx.ReceivedEvents)
|
||||
{
|
||||
Assert.Single(ctx.ReceivedEvents);
|
||||
Assert.IsType<DebugViewSnapshot>(ctx.ReceivedEvents[0]);
|
||||
}
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public void Buffered_Event_Strictly_Newer_Than_Snapshot_Entity_Is_Delivered()
|
||||
{
|
||||
// M2.18 dedup: a buffered event strictly newer than the snapshot's entry for
|
||||
// the same entity (buffered.Timestamp > snapshot.Timestamp) is NOT a dup and
|
||||
// must be DELIVERED after the snapshot.
|
||||
var ctx = CreateBridgeActor();
|
||||
ctx.CommProbe.ExpectMsg<SiteEnvelope>();
|
||||
AwaitCondition(() => ctx.MockGrpcClient.SubscribeCalls.Count == 1, TimeSpan.FromSeconds(3));
|
||||
|
||||
var snapTime = DateTimeOffset.UtcNow;
|
||||
var newerTime = snapTime.AddMilliseconds(1);
|
||||
|
||||
// Buffered event for "Temp" strictly NEWER than the snapshot's "Temp".
|
||||
var buffered = new AttributeValueChanged(InstanceName, "IO", "Temp", 50.0, "Good", newerTime);
|
||||
ctx.MockGrpcClient.SubscribeCalls[0].OnEvent(buffered);
|
||||
|
||||
var snapAttr = new AttributeValueChanged(InstanceName, "IO", "Temp", 42.5, "Good", snapTime);
|
||||
var snapshot = new DebugViewSnapshot(
|
||||
InstanceName,
|
||||
new List<AttributeValueChanged> { snapAttr },
|
||||
new List<AlarmStateChanged>(),
|
||||
snapTime);
|
||||
ctx.BridgeActor.Tell(snapshot);
|
||||
|
||||
// snapshot then the strictly-newer buffered event.
|
||||
AwaitCondition(() => { lock (ctx.ReceivedEvents) { return ctx.ReceivedEvents.Count == 2; } },
|
||||
TimeSpan.FromSeconds(3));
|
||||
lock (ctx.ReceivedEvents)
|
||||
{
|
||||
Assert.IsType<DebugViewSnapshot>(ctx.ReceivedEvents[0]);
|
||||
var flushed = Assert.IsType<AttributeValueChanged>(ctx.ReceivedEvents[1]);
|
||||
Assert.Equal(50.0, flushed.Value);
|
||||
Assert.Equal(newerTime, flushed.Timestamp);
|
||||
}
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public void Buffered_Alarm_Dedup_Uses_AlarmIdentity_And_Timestamp()
|
||||
{
|
||||
// M2.18 dedup for alarms: identity = (instance, alarm name, source reference).
|
||||
// A buffered alarm older-or-equal to the snapshot's same-identity alarm is
|
||||
// dropped; a strictly-newer one is delivered.
|
||||
var ctx = CreateBridgeActor();
|
||||
ctx.CommProbe.ExpectMsg<SiteEnvelope>();
|
||||
AwaitCondition(() => ctx.MockGrpcClient.SubscribeCalls.Count == 1, TimeSpan.FromSeconds(3));
|
||||
|
||||
var t0 = DateTimeOffset.UtcNow;
|
||||
|
||||
// Buffered: "PumpFault" at t0 (dup) and "Overheat" at t0+1ms (newer, deliver).
|
||||
var dupAlarm = new AlarmStateChanged(InstanceName, "PumpFault",
|
||||
ZB.MOM.WW.ScadaBridge.Commons.Types.Enums.AlarmState.Active, 500, t0);
|
||||
var newerAlarm = new AlarmStateChanged(InstanceName, "Overheat",
|
||||
ZB.MOM.WW.ScadaBridge.Commons.Types.Enums.AlarmState.Active, 700, t0.AddMilliseconds(1));
|
||||
ctx.MockGrpcClient.SubscribeCalls[0].OnEvent(dupAlarm);
|
||||
ctx.MockGrpcClient.SubscribeCalls[0].OnEvent(newerAlarm);
|
||||
|
||||
// Snapshot contains BOTH "PumpFault" and "Overheat" at t0.
|
||||
var snapPumpFault = new AlarmStateChanged(InstanceName, "PumpFault",
|
||||
ZB.MOM.WW.ScadaBridge.Commons.Types.Enums.AlarmState.Active, 500, t0);
|
||||
var snapOverheat = new AlarmStateChanged(InstanceName, "Overheat",
|
||||
ZB.MOM.WW.ScadaBridge.Commons.Types.Enums.AlarmState.Normal, 0, t0);
|
||||
var snapshot = new DebugViewSnapshot(
|
||||
InstanceName,
|
||||
new List<AttributeValueChanged>(),
|
||||
new List<AlarmStateChanged> { snapPumpFault, snapOverheat },
|
||||
t0);
|
||||
ctx.BridgeActor.Tell(snapshot);
|
||||
|
||||
// snapshot + only the strictly-newer "Overheat" alarm (PumpFault dropped).
|
||||
AwaitCondition(() => { lock (ctx.ReceivedEvents) { return ctx.ReceivedEvents.Count == 2; } },
|
||||
TimeSpan.FromSeconds(3));
|
||||
Thread.Sleep(200);
|
||||
lock (ctx.ReceivedEvents)
|
||||
{
|
||||
Assert.Equal(2, ctx.ReceivedEvents.Count);
|
||||
Assert.IsType<DebugViewSnapshot>(ctx.ReceivedEvents[0]);
|
||||
var flushed = Assert.IsType<AlarmStateChanged>(ctx.ReceivedEvents[1]);
|
||||
Assert.Equal("Overheat", flushed.AlarmName);
|
||||
Assert.Equal(700, flushed.Priority);
|
||||
}
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public void Buffered_Events_Flushed_In_Arrival_Order()
|
||||
{
|
||||
// M2.18: ordering preserved across multiple buffered events (none are dups —
|
||||
// their entities are absent from the snapshot).
|
||||
var ctx = CreateBridgeActor();
|
||||
ctx.CommProbe.ExpectMsg<SiteEnvelope>();
|
||||
AwaitCondition(() => ctx.MockGrpcClient.SubscribeCalls.Count == 1, TimeSpan.FromSeconds(3));
|
||||
|
||||
var baseTime = DateTimeOffset.UtcNow;
|
||||
var sub = ctx.MockGrpcClient.SubscribeCalls[0];
|
||||
sub.OnEvent(new AttributeValueChanged(InstanceName, "IO", "A", 1, "Good", baseTime));
|
||||
sub.OnEvent(new AlarmStateChanged(InstanceName, "AlarmX",
|
||||
ZB.MOM.WW.ScadaBridge.Commons.Types.Enums.AlarmState.Active, 100, baseTime));
|
||||
sub.OnEvent(new AttributeValueChanged(InstanceName, "IO", "B", 2, "Good", baseTime));
|
||||
|
||||
var snapshot = new DebugViewSnapshot(
|
||||
InstanceName,
|
||||
new List<AttributeValueChanged>(),
|
||||
new List<AlarmStateChanged>(),
|
||||
baseTime);
|
||||
ctx.BridgeActor.Tell(snapshot);
|
||||
|
||||
AwaitCondition(() => { lock (ctx.ReceivedEvents) { return ctx.ReceivedEvents.Count == 4; } },
|
||||
TimeSpan.FromSeconds(3));
|
||||
lock (ctx.ReceivedEvents)
|
||||
{
|
||||
Assert.IsType<DebugViewSnapshot>(ctx.ReceivedEvents[0]);
|
||||
Assert.Equal("A", Assert.IsType<AttributeValueChanged>(ctx.ReceivedEvents[1]).AttributeName);
|
||||
Assert.Equal("AlarmX", Assert.IsType<AlarmStateChanged>(ctx.ReceivedEvents[2]).AlarmName);
|
||||
Assert.Equal("B", Assert.IsType<AttributeValueChanged>(ctx.ReceivedEvents[3]).AttributeName);
|
||||
}
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public void PassThrough_After_Flush_Delivers_Subsequent_Events_Immediately()
|
||||
{
|
||||
// M2.18: after the snapshot+flush the actor switches to pass-through — later
|
||||
// events go straight to _onEvent (no buffering, no dup).
|
||||
var ctx = CreateBridgeActor();
|
||||
ctx.CommProbe.ExpectMsg<SiteEnvelope>();
|
||||
AwaitCondition(() => ctx.MockGrpcClient.SubscribeCalls.Count == 1, TimeSpan.FromSeconds(3));
|
||||
|
||||
var snapshot = new DebugViewSnapshot(
|
||||
InstanceName,
|
||||
new List<AttributeValueChanged>(),
|
||||
new List<AlarmStateChanged>(),
|
||||
DateTimeOffset.UtcNow);
|
||||
ctx.BridgeActor.Tell(snapshot);
|
||||
|
||||
AwaitCondition(() => { lock (ctx.ReceivedEvents) { return ctx.ReceivedEvents.Count == 1; } },
|
||||
TimeSpan.FromSeconds(3));
|
||||
|
||||
// Post-snapshot event — must be delivered immediately, exactly once.
|
||||
var postEvent = new AttributeValueChanged(InstanceName, "IO", "Temp", 42.5, "Good",
|
||||
DateTimeOffset.UtcNow);
|
||||
ctx.MockGrpcClient.SubscribeCalls[0].OnEvent(postEvent);
|
||||
|
||||
AwaitCondition(() => { lock (ctx.ReceivedEvents) { return ctx.ReceivedEvents.Count == 2; } },
|
||||
TimeSpan.FromSeconds(3));
|
||||
lock (ctx.ReceivedEvents)
|
||||
{
|
||||
Assert.IsType<AttributeValueChanged>(ctx.ReceivedEvents[1]);
|
||||
}
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public void InstanceNotFound_After_StreamFirst_Tears_Down_Stream_And_Does_Not_PassThrough()
|
||||
{
|
||||
// M2.18 + M2.11: stream-first means the gRPC subscription is already open
|
||||
// when an InstanceNotFound snapshot arrives. The bridge must tear that stream
|
||||
// down (Unsubscribe the just-opened correlation), deliver the not-found
|
||||
// snapshot, NOT enter pass-through, and stop cleanly.
|
||||
var ctx = CreateBridgeActor();
|
||||
ctx.CommProbe.ExpectMsg<SiteEnvelope>();
|
||||
|
||||
// Stream opened up-front (stream-first).
|
||||
AwaitCondition(() => ctx.MockGrpcClient.SubscribeCalls.Count == 1, TimeSpan.FromSeconds(3));
|
||||
|
||||
var notFoundSnapshot = new DebugViewSnapshot(
|
||||
InstanceName,
|
||||
new List<AttributeValueChanged>(),
|
||||
new List<AlarmStateChanged>(),
|
||||
DateTimeOffset.UtcNow,
|
||||
InstanceNotFound: true);
|
||||
|
||||
Watch(ctx.BridgeActor);
|
||||
ctx.BridgeActor.Tell(notFoundSnapshot);
|
||||
|
||||
// Not-found snapshot delivered.
|
||||
AwaitCondition(() => { lock (ctx.ReceivedEvents) { return ctx.ReceivedEvents.Count == 1; } },
|
||||
TimeSpan.FromSeconds(3));
|
||||
lock (ctx.ReceivedEvents)
|
||||
{
|
||||
Assert.True(Assert.IsType<DebugViewSnapshot>(ctx.ReceivedEvents[0]).InstanceNotFound);
|
||||
}
|
||||
|
||||
// The just-opened stream must be torn down.
|
||||
AwaitCondition(() => ctx.MockGrpcClient.UnsubscribedCorrelationIds.Contains("corr-1"),
|
||||
TimeSpan.FromSeconds(3));
|
||||
|
||||
// Stops cleanly.
|
||||
ExpectTerminated(ctx.BridgeActor, TimeSpan.FromSeconds(3));
|
||||
|
||||
// No pass-through: an event arriving after the stop is not delivered.
|
||||
var late = new AttributeValueChanged(InstanceName, "IO", "Temp", 1, "Good", DateTimeOffset.UtcNow);
|
||||
ctx.MockGrpcClient.SubscribeCalls[0].OnEvent(late);
|
||||
Thread.Sleep(200);
|
||||
lock (ctx.ReceivedEvents) { Assert.Single(ctx.ReceivedEvents); }
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public void Reconnect_During_Buffering_Phase_Keeps_Buffering_Until_Snapshot()
|
||||
{
|
||||
// M2.18: a gRPC error/reconnect BEFORE the snapshot arrives must remain in the
|
||||
// buffering phase — events on the new stream are still buffered, then flushed
|
||||
// when the snapshot finally arrives.
|
||||
var ctx = CreateBridgeActor();
|
||||
ctx.CommProbe.ExpectMsg<SiteEnvelope>();
|
||||
AwaitCondition(() => ctx.MockGrpcClient.SubscribeCalls.Count == 1, TimeSpan.FromSeconds(3));
|
||||
|
||||
// Error before snapshot → reconnect (still buffering).
|
||||
ctx.MockGrpcClient.SubscribeCalls[0].OnError(new Exception("pre-snapshot blip"));
|
||||
AwaitCondition(() => ctx.MockGrpcClient.SubscribeCalls.Count == 2, TimeSpan.FromSeconds(5));
|
||||
|
||||
// Event on the reconnected stream — still buffered (snapshot not yet delivered).
|
||||
var gapEvent = new AttributeValueChanged(InstanceName, "IO", "Late", 7, "Good",
|
||||
DateTimeOffset.UtcNow);
|
||||
ctx.MockGrpcClient.SubscribeCalls[1].OnEvent(gapEvent);
|
||||
lock (ctx.ReceivedEvents) { Assert.Empty(ctx.ReceivedEvents); }
|
||||
|
||||
var snapshot = new DebugViewSnapshot(
|
||||
InstanceName,
|
||||
new List<AttributeValueChanged>(),
|
||||
new List<AlarmStateChanged>(),
|
||||
DateTimeOffset.UtcNow);
|
||||
ctx.BridgeActor.Tell(snapshot);
|
||||
|
||||
// snapshot + the event buffered across the reconnect.
|
||||
AwaitCondition(() => { lock (ctx.ReceivedEvents) { return ctx.ReceivedEvents.Count == 2; } },
|
||||
TimeSpan.FromSeconds(3));
|
||||
lock (ctx.ReceivedEvents)
|
||||
{
|
||||
Assert.IsType<DebugViewSnapshot>(ctx.ReceivedEvents[0]);
|
||||
Assert.Equal("Late", Assert.IsType<AttributeValueChanged>(ctx.ReceivedEvents[1]).AttributeName);
|
||||
}
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public void Reconnect_After_Snapshot_Resumes_PassThrough_Not_Buffering()
|
||||
{
|
||||
// M2.18: a mid-session reconnect (after the snapshot was already delivered)
|
||||
// must resume pass-through — the snapshot is a one-time thing and events on
|
||||
// the reconnected stream are delivered immediately, not re-buffered.
|
||||
var ctx = CreateBridgeActor();
|
||||
ctx.CommProbe.ExpectMsg<SiteEnvelope>();
|
||||
AwaitCondition(() => ctx.MockGrpcClient.SubscribeCalls.Count == 1, TimeSpan.FromSeconds(3));
|
||||
|
||||
var snapshot = new DebugViewSnapshot(
|
||||
InstanceName,
|
||||
new List<AttributeValueChanged>(),
|
||||
new List<AlarmStateChanged>(),
|
||||
DateTimeOffset.UtcNow);
|
||||
ctx.BridgeActor.Tell(snapshot);
|
||||
AwaitCondition(() => { lock (ctx.ReceivedEvents) { return ctx.ReceivedEvents.Count == 1; } },
|
||||
TimeSpan.FromSeconds(3));
|
||||
|
||||
// Mid-session reconnect.
|
||||
ctx.MockGrpcClient.SubscribeCalls[0].OnError(new Exception("mid-session blip"));
|
||||
AwaitCondition(() => ctx.MockGrpcClient.SubscribeCalls.Count == 2, TimeSpan.FromSeconds(5));
|
||||
|
||||
// Event on the reconnected stream — delivered immediately (pass-through).
|
||||
var postEvent = new AttributeValueChanged(InstanceName, "IO", "Temp", 9, "Good",
|
||||
DateTimeOffset.UtcNow);
|
||||
ctx.MockGrpcClient.SubscribeCalls[1].OnEvent(postEvent);
|
||||
|
||||
AwaitCondition(() => { lock (ctx.ReceivedEvents) { return ctx.ReceivedEvents.Count == 2; } },
|
||||
TimeSpan.FromSeconds(3));
|
||||
lock (ctx.ReceivedEvents)
|
||||
{
|
||||
Assert.Equal("Temp", Assert.IsType<AttributeValueChanged>(ctx.ReceivedEvents[1]).AttributeName);
|
||||
}
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public void RetryCount_RecoveredOnlyAfterStreamStaysStableForStabilityWindow()
|
||||
{
|
||||
@@ -394,11 +809,25 @@ public class DebugStreamBridgeActorTests : TestKit
|
||||
|
||||
/// <summary>
|
||||
/// Mock gRPC client that records SubscribeAsync and Unsubscribe calls.
|
||||
/// <para>
|
||||
/// <b>Thread safety:</b> <see cref="SubscribeCalls"/> and
|
||||
/// <see cref="UnsubscribedCorrelationIds"/> are written from the actor/background thread
|
||||
/// (via <see cref="SubscribeAsync"/> and <see cref="Unsubscribe"/>) and read from the test
|
||||
/// thread (via <c>AwaitCondition</c> / assertions). All access goes through a shared lock
|
||||
/// to match the <c>lock (events)</c> pattern used for <c>ctx.ReceivedEvents</c>.
|
||||
/// </para>
|
||||
/// </summary>
|
||||
internal class MockSiteStreamGrpcClient : SiteStreamGrpcClient
|
||||
{
|
||||
- public List<MockSubscription> SubscribeCalls { get; } = new();
- public List<string> UnsubscribedCorrelationIds { get; } = new();
+ private readonly object _lock = new();
+ private readonly List<MockSubscription> _subscribeCalls = new();
+ private readonly List<string> _unsubscribedCorrelationIds = new();
+
+ /// <summary>Returns a snapshot of subscribe calls, taken under the internal lock.</summary>
+ public List<MockSubscription> SubscribeCalls { get { lock (_lock) { return _subscribeCalls.ToList(); } } }
+
+ /// <summary>Returns a snapshot of unsubscribed correlation IDs, taken under the internal lock.</summary>
+ public List<string> UnsubscribedCorrelationIds { get { lock (_lock) { return _unsubscribedCorrelationIds.ToList(); } } }
|
||||
|
||||
private MockSiteStreamGrpcClient(bool _) : base() { }
|
||||
|
||||
@@ -414,7 +843,7 @@ internal class MockSiteStreamGrpcClient : SiteStreamGrpcClient
|
||||
CancellationToken ct)
|
||||
{
|
||||
var subscription = new MockSubscription(correlationId, instanceUniqueName, onEvent, onError, ct);
|
||||
- SubscribeCalls.Add(subscription);
+ lock (_lock) { _subscribeCalls.Add(subscription); }
|
||||
|
||||
// Return a task that completes when cancelled (simulates long-running stream)
|
||||
var tcs = new TaskCompletionSource();
|
||||
@@ -424,7 +853,7 @@ internal class MockSiteStreamGrpcClient : SiteStreamGrpcClient
|
||||
|
||||
public override void Unsubscribe(string correlationId)
|
||||
{
|
||||
- UnsubscribedCorrelationIds.Add(correlationId);
+ lock (_lock) { _unsubscribedCorrelationIds.Add(correlationId); }
|
||||
}
|
||||
}
|
||||
|
||||
|
||||
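Review note: the M2.18 tests above pin down one dedup rule for events buffered before the snapshot arrives. A buffered event is dropped when the snapshot holds the same entity with an equal or newer timestamp, and every other buffered event is flushed after the snapshot in arrival order. The sketch below restates that rule in isolation. It is not the actor's implementation: `BufferedEvent`, `SnapshotDedup`, and the string identity key are hypothetical names introduced for illustration, and the real identity is composite (the alarm test uses instance, alarm name, and source reference).

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for AttributeValueChanged / AlarmStateChanged:
// only the entity identity and the timestamp matter for dedup.
public sealed record BufferedEvent(string Identity, DateTimeOffset Timestamp);

public static class SnapshotDedup
{
    // Returns the buffered events that should still be delivered after the
    // snapshot, preserving arrival order.
    public static List<BufferedEvent> Flush(
        IReadOnlyList<BufferedEvent> buffered,
        IReadOnlyDictionary<string, DateTimeOffset> snapshotTimestamps)
    {
        var survivors = new List<BufferedEvent>();
        foreach (var e in buffered)
        {
            // Already reflected: the snapshot has this entity at an equal or
            // newer timestamp, so the buffered copy is a duplicate.
            if (snapshotTimestamps.TryGetValue(e.Identity, out var snapTs)
                && e.Timestamp <= snapTs)
            {
                continue;
            }

            // Either the entity is absent from the snapshot (a genuine
            // gap-window event) or the buffered event is strictly newer.
            survivors.Add(e);
        }
        return survivors;
    }
}
```

With a snapshot containing only `Temp` at `t0`, a buffered `Temp` at `t0` is dropped, while a buffered `Temp` at `t0 + 1 ms` and a buffered `Pressure` at `t0` both survive, in that order. This matches the drop, deliver, and gap-window cases the tests assert.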
+318
@@ -0,0 +1,318 @@
|
||||
using System.Text.RegularExpressions;
|
||||
|
||||
namespace ZB.MOM.WW.ScadaBridge.ConfigurationDatabase.Tests;
|
||||
|
||||
/// <summary>
|
||||
/// Code-level guard for the AuditLog append-only invariant (task M2.10, #18).
|
||||
///
|
||||
/// The DB-role control (DENY UPDATE / DENY DELETE on dbo.AuditLog in migration
|
||||
/// 20260602174346_CollapseAuditLogToCanonical) is the runtime enforcement layer.
|
||||
/// This test is the compile-time / test-time backstop: it fails the test run if
|
||||
/// any C# source file in the ConfigurationDatabase project contains an UPDATE or
|
||||
/// DELETE statement that targets the AuditLog table.
|
||||
///
|
||||
/// <b>Matching rule (see <c>ContainsAuditLogMutation</c> for full detail)</b>
|
||||
/// A line is flagged as a violation iff it matches the DML-syntax pattern:
|
||||
/// • <c>UPDATE\s+(?:\[?dbo\]?\.)?(?:\[?AuditLog\]?)\b</c> — UPDATE targeting AuditLog
|
||||
/// • <c>DELETE\s+(?:FROM\s+)?(?:\[?dbo\]?\.)?(?:\[?AuditLog\]?)\b</c> — DELETE targeting AuditLog
|
||||
///
|
||||
/// These tight DML-syntax patterns naturally exclude false positives:
|
||||
/// - DENY UPDATE ON dbo.AuditLog … → "DENY" comes before UPDATE; the regex
|
||||
/// requires UPDATE to be immediately followed by (optional schema.) AuditLog,
|
||||
/// so "UPDATE ON" does NOT match "UPDATE AuditLog".
|
||||
/// - ALTER TABLE dbo.AuditLog SWITCH … → ALTER TABLE precedes the table name;
|
||||
/// no UPDATE/DELETE keyword present.
|
||||
/// - Comments like "// AuditLog … UPDATE …" → UPDATE is not immediately followed
|
||||
/// by AuditLog (there are intervening words).
|
||||
/// - DELETE FROM Notifications … → AuditLog not present.
|
||||
///
|
||||
/// <b>Known limitations:</b> This guard scans only raw SQL strings — EF Core methods
|
||||
/// such as <c>ExecuteDeleteAsync</c>, <c>ExecuteUpdateAsync</c>, and <c>RemoveRange</c>
|
||||
/// targeting the AuditLog entity are NOT covered and must never be introduced.
|
||||
/// Additionally, the scan is line-oriented: DML where the keyword and table name appear
|
||||
/// on separate lines is an accepted, undetected edge case.
|
||||
/// </summary>
|
||||
public class AuditLogAppendOnlyGuardTests
|
||||
{
|
||||
// ---------------------------------------------------------------------------
|
||||
// Source root location — same walk-up pattern used by ArchitecturalConstraintTests
|
||||
// in the Commons.Tests project.
|
||||
// ---------------------------------------------------------------------------
|
||||
|
||||
private static string GetConfigurationDatabaseSourceDirectory()
|
||||
{
|
||||
// Walk up from the test binary output directory until we find the
|
||||
// ConfigurationDatabase csproj (a known anchor in the repo tree).
|
||||
var dir = new DirectoryInfo(AppContext.BaseDirectory);
|
||||
while (dir != null)
|
||||
{
|
||||
var candidate = Path.Combine(
|
||||
dir.FullName,
|
||||
"src",
|
||||
"ZB.MOM.WW.ScadaBridge.ConfigurationDatabase",
|
||||
"ZB.MOM.WW.ScadaBridge.ConfigurationDatabase.csproj");
|
||||
|
||||
if (File.Exists(candidate))
|
||||
{
|
||||
return Path.GetDirectoryName(candidate)!;
|
||||
}
|
||||
|
||||
dir = dir.Parent;
|
||||
}
|
||||
|
||||
throw new InvalidOperationException(
|
||||
"Could not locate ZB.MOM.WW.ScadaBridge.ConfigurationDatabase.csproj " +
|
||||
"by walking up from the test output directory. " +
|
||||
"Ensure the test is run from inside the repo clone.");
|
||||
}
|
||||
|
||||
// ---------------------------------------------------------------------------
|
||||
// Detection helper — kept as a static method so it can be unit-tested in
|
||||
// isolation below without requiring any file I/O.
|
||||
// ---------------------------------------------------------------------------
|
||||
|
||||
/// <summary>
|
||||
/// Returns <see langword="true"/> when the supplied text (typically a single
|
||||
/// source line) contains a SQL UPDATE or DELETE DML statement that directly
|
||||
/// targets the <c>AuditLog</c> table.
|
||||
///
|
||||
/// <b>Matching rule.</b> The regex requires the DML keyword to be
|
||||
/// immediately followed (possibly via FROM) by the optional schema prefix
|
||||
/// (<c>dbo.</c> or <c>[dbo].</c>) and then the table name <c>AuditLog</c>
|
||||
/// or <c>[AuditLog]</c> as a whole word:
|
||||
/// <code>
/// \bUPDATE\s+(?:\[?dbo\]?\.)?(?:\[?AuditLog\]?)\b
/// \bDELETE\s+(?:FROM\s+)?(?:\[?dbo\]?\.)?(?:\[?AuditLog\]?)\b
/// </code>
|
||||
/// This tight DML-syntax pattern naturally excludes false positives without
|
||||
/// any additional keyword checks:
|
||||
/// <list type="bullet">
|
||||
/// <item><description>
|
||||
/// <c>DENY UPDATE ON dbo.AuditLog …</c> — "UPDATE ON" is never immediately
|
||||
/// followed by AuditLog; the pattern requires UPDATE → optional schema → AuditLog.
|
||||
/// </description></item>
|
||||
/// <item><description>
|
||||
/// <c>ALTER TABLE dbo.AuditLog SWITCH …</c> — no UPDATE/DELETE keyword present.
|
||||
/// </description></item>
|
||||
/// <item><description>
|
||||
/// <c>// AuditLog is append-only; never issue an UPDATE against it.</c> —
|
||||
/// UPDATE is not followed by AuditLog here.
|
||||
/// </description></item>
|
||||
/// <item><description>
|
||||
/// <c>DELETE FROM dbo.Notifications …</c> — AuditLog not present.
|
||||
/// </description></item>
|
||||
/// </list>
|
||||
/// </summary>
|
||||
/// <param name="text">A single source line (or any string to probe).</param>
|
||||
/// <returns><see langword="true"/> if a mutation against AuditLog is detected.</returns>
|
||||
internal static bool ContainsAuditLogMutation(string text)
|
||||
{
|
||||
if (string.IsNullOrEmpty(text))
|
||||
{
|
||||
return false;
|
||||
}
|
||||
|
||||
// DML-syntax pattern: the UPDATE or DELETE keyword must be directly followed
|
||||
// (optionally via FROM) by the optional schema qualifier and then the table name.
|
||||
//
|
||||
// Schema sub-pattern : (?:\[?dbo\]?\.)?
|
||||
// matches: nothing, "dbo.", "[dbo]."
|
||||
//
|
||||
// Table sub-pattern : \[?AuditLog\]?
|
||||
// matches: "AuditLog", "[AuditLog]"
|
||||
//
|
||||
// \bUPDATE\s+(?:\[?dbo\]?\.)?(?:\[?AuditLog\]?)\b
|
||||
// matches: "UPDATE AuditLog", "UPDATE dbo.AuditLog",
|
||||
// "UPDATE [AuditLog]", "UPDATE [dbo].[AuditLog]"
|
||||
// does NOT match: "DENY UPDATE ON dbo.AuditLog" (UPDATE is followed by ON)
|
||||
//
|
||||
// \bDELETE\s+(?:FROM\s+)?(?:\[?dbo\]?\.)?(?:\[?AuditLog\]?)\b
|
||||
// matches: "DELETE FROM AuditLog", "DELETE FROM dbo.AuditLog",
|
||||
// "DELETE FROM [AuditLog]", "DELETE FROM [dbo].[AuditLog]"
|
||||
// does NOT match: "DENY DELETE ON dbo.AuditLog" (DELETE is followed by ON)
|
||||
return AuditLogMutationPattern.IsMatch(text);
|
||||
}
|
||||
|
||||
private static readonly Regex AuditLogMutationPattern = new(
|
||||
@"\bUPDATE\s+(?:\[?dbo\]?\.)?(?:\[?AuditLog\]?)\b" +
|
||||
@"|\bDELETE\s+(?:FROM\s+)?(?:\[?dbo\]?\.)?(?:\[?AuditLog\]?)\b",
|
||||
RegexOptions.IgnoreCase | RegexOptions.Compiled);
|
||||
|
||||
// ---------------------------------------------------------------------------
|
||||
// Guard test: scan every *.cs file in ConfigurationDatabase (excluding
|
||||
// Designer/Snapshot EF artefacts and the obj/ directory).
|
||||
// ---------------------------------------------------------------------------
|
||||
|
||||
[Fact]
|
||||
public void ConfigurationDatabase_ShouldNotContainAuditLogMutations()
|
||||
{
|
||||
var sourceDir = GetConfigurationDatabaseSourceDirectory();
|
||||
|
||||
// Enumerate all .cs files; exclude EF scaffolding and build output.
|
||||
var csFiles = Directory.GetFiles(sourceDir, "*.cs", SearchOption.AllDirectories)
|
||||
.Where(f => !f.Contains(Path.DirectorySeparatorChar + "obj" + Path.DirectorySeparatorChar))
|
||||
.Where(f => !f.EndsWith(".Designer.cs", StringComparison.OrdinalIgnoreCase))
|
||||
.Where(f => !f.EndsWith("ModelSnapshot.cs", StringComparison.OrdinalIgnoreCase))
|
||||
.ToList();
|
||||
|
||||
Assert.True(csFiles.Count > 0,
|
||||
$"Expected to find .cs files under {sourceDir} but found none — source directory location may be wrong.");
|
||||
|
||||
var violations = new List<string>();
|
||||
|
||||
foreach (var file in csFiles)
|
||||
{
|
||||
var content = File.ReadAllText(file);
|
||||
|
||||
// Scan line-by-line so violation messages cite the exact line number.
|
||||
var lines = content.Split('\n');
|
||||
for (var i = 0; i < lines.Length; i++)
|
||||
{
|
||||
if (ContainsAuditLogMutation(lines[i]))
|
||||
{
|
||||
var relativePath = Path.GetRelativePath(sourceDir, file);
|
||||
violations.Add($"{relativePath}:{i + 1}: {lines[i].Trim()}");
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
Assert.True(violations.Count == 0,
|
||||
"AuditLog append-only guard: found UPDATE/DELETE targeting dbo.AuditLog " +
|
||||
"in ConfigurationDatabase source. AuditLog is APPEND-ONLY (retention uses " +
|
||||
"partition-switch DDL, not row DELETE). Violation(s):\n" +
|
||||
string.Join("\n", violations));
|
||||
}
|
||||
|
||||
// ---------------------------------------------------------------------------
|
||||
// Self-verifying matcher unit tests — prove the helper does what it claims.
|
||||
// ---------------------------------------------------------------------------
|
||||
|
||||
[Fact]
|
||||
public void ContainsAuditLogMutation_ReturnsFalse_ForCleanSource()
|
||||
{
|
||||
// The guard scan over real source PASSES (no violations) — this fact is
|
||||
// already asserted by ConfigurationDatabase_ShouldNotContainAuditLogMutations.
|
||||
// Here we verify the helper directly on a representative set of CLEAN lines
|
||||
// that appear in the production source tree.
|
||||
|
||||
// INSERT is not a mutation (append-only operations are fine).
|
||||
Assert.False(ContainsAuditLogMutation(
|
||||
"INSERT INTO dbo.AuditLog (EventId, OccurredAtUtc) VALUES (@id, @ts);"));
|
||||
|
||||
// SELECT is not a mutation.
|
||||
Assert.False(ContainsAuditLogMutation(
|
||||
"SELECT COUNT(*) FROM dbo.AuditLog WHERE OccurredAtUtc >= @threshold;"));
|
||||
|
||||
// ALTER TABLE SWITCH is the retention purge — not a row-level mutation.
|
||||
Assert.False(ContainsAuditLogMutation(
|
||||
"ALTER TABLE dbo.AuditLog SWITCH PARTITION 3 TO dbo.AuditLog_Staging;"));
|
||||
|
||||
// DENY DDL from the role-grant migration — must not be flagged.
|
||||
Assert.False(ContainsAuditLogMutation(
|
||||
"DENY UPDATE ON dbo.AuditLog TO scadabridge_audit_writer;"));
|
||||
Assert.False(ContainsAuditLogMutation(
|
||||
"DENY DELETE ON dbo.AuditLog TO scadabridge_audit_writer;"));
|
||||
|
||||
// GRANT DDL — also must not be flagged.
|
||||
Assert.False(ContainsAuditLogMutation(
|
||||
"GRANT INSERT ON dbo.AuditLog TO scadabridge_audit_writer;"));
|
||||
Assert.False(ContainsAuditLogMutation(
|
||||
"GRANT SELECT ON dbo.AuditLog TO scadabridge_audit_writer;"));
|
||||
|
||||
// DELETE on a different table — AuditLog not on the same line.
|
||||
Assert.False(ContainsAuditLogMutation(
|
||||
"DELETE FROM dbo.Notifications WHERE Status = 'Delivered';"));
|
||||
|
||||
// DELETE on a different table with AuditLog mentioned much later on the same
// line (130 spaces of padding). The pattern only matches AuditLog directly
// after DELETE [FROM], so the trailing mention must not be flagged.
|
||||
var longSeparator = new string(' ', 130);
|
||||
Assert.False(ContainsAuditLogMutation(
|
||||
$"DELETE FROM dbo.Notifications WHERE Id = @id;{longSeparator}-- see also AuditLog"));
|
||||
|
||||
// Comment-only mention of AuditLog with UPDATE elsewhere in a comment.
|
||||
Assert.False(ContainsAuditLogMutation(
|
||||
"// AuditLog is append-only; never issue an UPDATE against it."));
|
||||
|
||||
// TRUNCATE on the staging table (not AuditLog directly); staging name only.
|
||||
Assert.False(ContainsAuditLogMutation(
|
||||
"TRUNCATE TABLE dbo.AuditLog_Staging_abc123;"));
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public void ContainsAuditLogMutation_ReturnsTrue_ForPlantedViolations()
|
||||
{
|
||||
// Planted positive cases — the guard MUST catch these.
|
||||
|
||||
// Classic UPDATE targeting AuditLog.
|
||||
Assert.True(ContainsAuditLogMutation(
|
||||
"UPDATE AuditLog SET Status = 'Corrected' WHERE EventId = @id;"));
|
||||
|
||||
// UPDATE with schema prefix.
|
||||
Assert.True(ContainsAuditLogMutation(
|
||||
"UPDATE dbo.AuditLog SET DetailsJson = @json WHERE EventId = @id;"));
|
||||
|
||||
// DELETE FROM AuditLog.
|
||||
Assert.True(ContainsAuditLogMutation(
|
||||
"DELETE FROM AuditLog WHERE OccurredAtUtc < @threshold;"));
|
||||
|
||||
// DELETE with schema prefix.
|
||||
Assert.True(ContainsAuditLogMutation(
|
||||
"DELETE FROM dbo.AuditLog WHERE Status = 'Parked';"));
|
||||
|
||||
// Mixed case (SQL is case-insensitive in practice).
|
||||
Assert.True(ContainsAuditLogMutation(
|
||||
"update dbo.AuditLog set Actor = 'system' where Actor is null;"));
|
||||
|
||||
// AuditLog mentioned earlier in the line (e.g. in a comment prefix), with a real
|
||||
// UPDATE dbo.AuditLog DML following — the DML occurrence must still be caught.
|
||||
Assert.True(ContainsAuditLogMutation(
|
||||
"-- AuditLog: UPDATE dbo.AuditLog SET x = 1"));
|
||||
|
||||
// ---- Bracketed identifier forms (SSMS-generated SQL) ----
|
||||
|
||||
// UPDATE [dbo].[AuditLog] — bracketed schema and bracketed table.
|
||||
Assert.True(ContainsAuditLogMutation(
|
||||
"UPDATE [dbo].[AuditLog] SET DetailsJson = @json WHERE EventId = @id;"));
|
||||
|
||||
// UPDATE [AuditLog] — bracketed table, no schema prefix.
|
||||
Assert.True(ContainsAuditLogMutation(
|
||||
"UPDATE [AuditLog] SET Status = 'Corrected' WHERE EventId = @id;"));
|
||||
|
||||
// DELETE FROM [dbo].[AuditLog] — bracketed schema and bracketed table.
|
||||
Assert.True(ContainsAuditLogMutation(
|
||||
"DELETE FROM [dbo].[AuditLog] WHERE OccurredAtUtc < @threshold;"));
|
||||
|
||||
// DELETE FROM [AuditLog] — bracketed table, no schema prefix.
|
||||
Assert.True(ContainsAuditLogMutation(
|
||||
"DELETE FROM [AuditLog] WHERE OccurredAtUtc < @threshold;"));
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public void ContainsAuditLogMutation_ReturnsFalse_ForDenyGrantAndPartitionSwitchSamples()
|
||||
{
|
||||
// Extra explicit coverage for the concrete exclusion patterns (DENY, partition
// SWITCH, and DELETEs on other tables) that appear in the real source files.
|
||||
|
||||
// From 20260602174346_CollapseAuditLogToCanonical.cs and 20260520142214_AddAuditLogTable.cs:
|
||||
Assert.False(ContainsAuditLogMutation(
|
||||
"DENY UPDATE ON dbo.AuditLog TO scadabridge_audit_writer;"));
|
||||
Assert.False(ContainsAuditLogMutation(
|
||||
"DENY DELETE ON dbo.AuditLog TO scadabridge_audit_writer;"));
|
||||
|
||||
// From AuditLogRepository.cs SwitchOutPartitionAsync:
|
||||
Assert.False(ContainsAuditLogMutation(
|
||||
"ALTER TABLE dbo.AuditLog SWITCH PARTITION ' + CAST(@partitionNumber AS nvarchar(10)) + ' TO dbo.[' + @stagingName + '];"));
|
||||
|
||||
// Notifications DELETE (legitimate; AuditLog not present on the line):
|
||||
Assert.False(ContainsAuditLogMutation(
|
||||
"DELETE FROM dbo.Notifications WHERE CompletedAtUtc < @cutoff;"));
|
||||
|
||||
// Notifications DELETE using bracketed identifiers — AuditLog not present:
|
||||
Assert.False(ContainsAuditLogMutation(
|
||||
"DELETE FROM [dbo].[Notifications] WHERE CompletedAtUtc < @cutoff;"));
|
||||
|
||||
// SiteCalls DELETE (legitimate; AuditLog not present on the line):
|
||||
Assert.False(ContainsAuditLogMutation(
|
||||
"DELETE FROM dbo.SiteCalls WHERE TerminalAtUtc < @cutoff;"));
|
||||
}
|
||||
}
|
||||
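The matching rule and the line-oriented limitation described in the guard's doc comment can be tried out in isolation. The following is a minimal sketch, not part of the commit: it reuses the pattern text from `AuditLogMutationPattern`, while the class and method names (`AuditLogGuardSketch`, `IsMutation`, `CountFlaggedLines`) are hypothetical and exist only for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration; only the pattern text comes from the guard.
public static class AuditLogGuardSketch
{
    private static readonly Regex Pattern = new(
        @"\bUPDATE\s+(?:\[?dbo\]?\.)?(?:\[?AuditLog\]?)\b" +
        @"|\bDELETE\s+(?:FROM\s+)?(?:\[?dbo\]?\.)?(?:\[?AuditLog\]?)\b",
        RegexOptions.IgnoreCase);

    // Probes one string, the same way the guard probes one source line.
    public static bool IsMutation(string text) =>
        !string.IsNullOrEmpty(text) && Pattern.IsMatch(text);

    // Mirrors the guard's scan: split on '\n' and probe each line separately.
    public static int CountFlaggedLines(string source)
    {
        var count = 0;
        foreach (var line in source.Split('\n'))
        {
            if (IsMutation(line))
            {
                count++;
            }
        }
        return count;
    }
}
```

Because `\s+` also matches a newline, probing a whole multi-line string would flag a `DELETE FROM` whose table name sits on the next line; it is the per-line split that makes that case invisible to the scan. This matches the accepted edge case listed under the known limitations.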
+28
@@ -61,6 +61,34 @@ public class TemplateEngineRepositoryTests : IDisposable
|
||||
Assert.Equal("Slot1", loaded.Compositions.First().InstanceName);
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public async Task TemplateScript_ExecutionTimeoutSeconds_RoundTripsThroughEf()
|
||||
{
|
||||
// M2.5 (#9): the nullable per-script execution timeout must persist and
|
||||
// reload through EF — both an explicit value and a null (use-global).
|
||||
var template = new Template("TimeoutTemplate");
|
||||
template.Scripts.Add(new TemplateScript("WithTimeout", "return 1;")
|
||||
{
|
||||
ExecutionTimeoutSeconds = 45
|
||||
});
|
||||
template.Scripts.Add(new TemplateScript("NoTimeout", "return 2;")); // null
|
||||
_context.Templates.Add(template);
|
||||
await _context.SaveChangesAsync();
|
||||
|
||||
// Detach so the reload comes from the store, not the change tracker.
|
||||
_context.ChangeTracker.Clear();
|
||||
|
||||
var loaded = await _context.Templates
|
||||
.Include(t => t.Scripts)
|
||||
.SingleAsync(t => t.Name == "TimeoutTemplate");
|
||||
|
||||
var withTimeout = loaded.Scripts.Single(s => s.Name == "WithTimeout");
|
||||
Assert.Equal(45, withTimeout.ExecutionTimeoutSeconds);
|
||||
|
||||
var noTimeout = loaded.Scripts.Single(s => s.Name == "NoTimeout");
|
||||
Assert.Null(noTimeout.ExecutionTimeoutSeconds);
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public async Task GetTemplateWithChildrenAsync_ReturnsNull_WhenTemplateDoesNotExist()
|
||||
{
|
||||
|
||||
+166
@@ -0,0 +1,166 @@
|
||||
using Opc.Ua;
|
||||
using ZB.MOM.WW.ScadaBridge.DataConnectionLayer;
|
||||
using ZB.MOM.WW.ScadaBridge.DataConnectionLayer.Adapters;
|
||||
|
||||
namespace ZB.MOM.WW.ScadaBridge.DataConnectionLayer.Tests.Adapters;
|
||||
|
||||
/// <summary>
|
||||
/// M2.4 (#8): the OPC UA EventFilter gains a server-side <see cref="ContentFilter"/>
|
||||
/// WhereClause as a bandwidth optimisation when a condition-type filter is present.
|
||||
/// The client-side gate in DataConnectionActor remains authoritative; these tests
|
||||
/// only pin the filter-shaping. No live server required — pure SDK object building.
|
||||
/// </summary>
|
||||
public class RealOpcUaClientAlarmFilterTests
|
||||
{
|
||||
[Fact]
|
||||
public void BuildAlarmEventFilter_NoFilter_HasNoWhereClause()
|
||||
{
|
||||
var filter = RealOpcUaClient.BuildAlarmEventFilter(AlarmConditionFilter.AllowAll);
|
||||
Assert.NotEmpty(filter.SelectClauses);
|
||||
Assert.Empty(filter.WhereClause.Elements);
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public void BuildAlarmEventFilter_WithKnownTypes_BuildsNonEmptyWhereClause()
|
||||
{
|
||||
var parsed = AlarmConditionFilter.Parse("LimitAlarmType,DiscreteAlarmType");
|
||||
var filter = RealOpcUaClient.BuildAlarmEventFilter(parsed);
|
||||
|
||||
Assert.NotEmpty(filter.WhereClause.Elements);
|
||||
// Two known types → two OfType elements, combined by an Or element.
|
||||
var ofTypeCount = filter.WhereClause.Elements.Count(e => e.FilterOperator == FilterOperator.OfType);
|
||||
Assert.Equal(2, ofTypeCount);
|
||||
Assert.Contains(filter.WhereClause.Elements, e => e.FilterOperator == FilterOperator.Or);
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public void BuildAlarmEventFilter_SingleKnownType_BuildsSingleOfType_NoOr()
|
||||
{
|
||||
var parsed = AlarmConditionFilter.Parse("AlarmConditionType");
|
||||
var filter = RealOpcUaClient.BuildAlarmEventFilter(parsed);
|
||||
|
||||
Assert.Single(filter.WhereClause.Elements);
|
||||
Assert.Equal(FilterOperator.OfType, filter.WhereClause.Elements[0].FilterOperator);
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public void BuildAlarmEventFilter_TypeMatchingIsCaseInsensitive()
|
||||
{
|
||||
var parsed = AlarmConditionFilter.Parse("limitalarmtype");
|
||||
var filter = RealOpcUaClient.BuildAlarmEventFilter(parsed);
|
||||
Assert.Single(filter.WhereClause.Elements, e => e.FilterOperator == FilterOperator.OfType);
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public void BuildAlarmEventFilter_AllUnknownTypes_OmitsWhereClause()
|
||||
{
|
||||
// Custom/vendor type names we cannot map to standard NodeIds are skipped
|
||||
// server-side; the client-side gate still enforces them. Omitting the
|
||||
// WhereClause is the safe choice — a partial WhereClause would drop the
|
||||
// unmapped types at the server and break correctness.
|
||||
var parsed = AlarmConditionFilter.Parse("MyVendorCustomAlarm,AnotherCustomThing");
|
||||
var filter = RealOpcUaClient.BuildAlarmEventFilter(parsed);
|
||||
Assert.Empty(filter.WhereClause.Elements);
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public void BuildAlarmEventFilter_MixedKnownAndUnknown_OmitsWhereClause()
|
||||
{
|
||||
// If ANY requested type can't be mapped, a server-side WhereClause would
|
||||
// silently drop that type's events — so we omit the optimisation entirely
|
||||
// and let the (authoritative) client gate do the filtering.
|
||||
var parsed = AlarmConditionFilter.Parse("LimitAlarmType,MyVendorCustomAlarm");
|
||||
var filter = RealOpcUaClient.BuildAlarmEventFilter(parsed);
|
||||
Assert.Empty(filter.WhereClause.Elements);
|
||||
}
|
||||
|
||||
// ── SelectClause index alignment (M2.13 / #27) ───────────────────────────
|
||||
// CRITICAL: HandleAlarmEvent reads fields[N] by position. Verify new clauses
|
||||
// are APPENDED at indices 13–17 so existing mappings (0–12) are undisturbed.
|
||||
|
||||
[Fact]
|
||||
public void BuildAlarmEventFilter_HasExactly18SelectClauses()
|
||||
{
|
||||
// Baseline: 6 base fields + 7 A&C sub-state fields + 5 new appended fields = 18.
|
||||
// If this count changes, review HandleAlarmEvent index mappings immediately.
|
||||
var filter = RealOpcUaClient.BuildAlarmEventFilter(AlarmConditionFilter.AllowAll);
|
||||
Assert.Equal(18, filter.SelectClauses.Count);
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public void BuildAlarmEventFilter_Index13_IsAlarmConditionType_ActiveState_TransitionTime()
|
||||
{
|
||||
// Index 13 must be AlarmConditionType/ActiveState/TransitionTime → OriginalRaiseTime.
|
||||
var filter = RealOpcUaClient.BuildAlarmEventFilter(AlarmConditionFilter.AllowAll);
|
||||
var clause = filter.SelectClauses[13];
|
||||
Assert.Equal(ObjectTypeIds.AlarmConditionType, clause.TypeDefinitionId);
|
||||
Assert.Equal(2, clause.BrowsePath.Count);
|
||||
Assert.Equal("ActiveState", clause.BrowsePath[0].Name);
|
||||
Assert.Equal("TransitionTime", clause.BrowsePath[1].Name);
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public void BuildAlarmEventFilter_Index14_IsLimitAlarmType_HighHighLimit()
|
||||
{
|
||||
var filter = RealOpcUaClient.BuildAlarmEventFilter(AlarmConditionFilter.AllowAll);
|
||||
var clause = filter.SelectClauses[14];
|
||||
Assert.Equal(ObjectTypeIds.LimitAlarmType, clause.TypeDefinitionId);
|
||||
Assert.Equal("HighHighLimit", clause.BrowsePath[0].Name);
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public void BuildAlarmEventFilter_Index15_IsLimitAlarmType_HighLimit()
|
||||
{
|
||||
var filter = RealOpcUaClient.BuildAlarmEventFilter(AlarmConditionFilter.AllowAll);
|
||||
var clause = filter.SelectClauses[15];
|
||||
Assert.Equal(ObjectTypeIds.LimitAlarmType, clause.TypeDefinitionId);
|
||||
Assert.Equal("HighLimit", clause.BrowsePath[0].Name);
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public void BuildAlarmEventFilter_Index16_IsLimitAlarmType_LowLimit()
|
||||
{
|
||||
var filter = RealOpcUaClient.BuildAlarmEventFilter(AlarmConditionFilter.AllowAll);
|
||||
var clause = filter.SelectClauses[16];
|
||||
Assert.Equal(ObjectTypeIds.LimitAlarmType, clause.TypeDefinitionId);
|
||||
Assert.Equal("LowLimit", clause.BrowsePath[0].Name);
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public void BuildAlarmEventFilter_Index17_IsLimitAlarmType_LowLowLimit()
|
||||
{
|
||||
var filter = RealOpcUaClient.BuildAlarmEventFilter(AlarmConditionFilter.AllowAll);
|
||||
var clause = filter.SelectClauses[17];
|
||||
Assert.Equal(ObjectTypeIds.LimitAlarmType, clause.TypeDefinitionId);
|
||||
Assert.Equal("LowLowLimit", clause.BrowsePath[0].Name);
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public void BuildAlarmEventFilter_ExistingIndices0To12_Unchanged()
|
||||
{
|
||||
// Guard: the first 13 SelectClauses (indices 0–12) must keep their positions so
// that existing HandleAlarmEvent logic is not silently broken by future edits.
// This test checks indices 0–7, 11, and 12; indices 8–10 are not asserted here.
|
||||
var filter = RealOpcUaClient.BuildAlarmEventFilter(AlarmConditionFilter.AllowAll);
|
||||
|
||||
// Indices 0–5: base event fields (EventType…Severity) from BaseEventType.
|
||||
for (var i = 0; i <= 5; i++)
|
||||
Assert.Equal(ObjectTypeIds.BaseEventType, filter.SelectClauses[i].TypeDefinitionId);
|
||||
|
||||
// Index 6: AlarmConditionType/ActiveState/Id
|
||||
Assert.Equal(ObjectTypeIds.AlarmConditionType, filter.SelectClauses[6].TypeDefinitionId);
|
||||
Assert.Equal("ActiveState", filter.SelectClauses[6].BrowsePath[0].Name);
|
||||
Assert.Equal("Id", filter.SelectClauses[6].BrowsePath[1].Name);
|
||||
|
||||
// Index 7: AcknowledgeableConditionType/AckedState/Id
|
||||
Assert.Equal(ObjectTypeIds.AcknowledgeableConditionType, filter.SelectClauses[7].TypeDefinitionId);
|
||||
Assert.Equal("AckedState", filter.SelectClauses[7].BrowsePath[0].Name);
|
||||
|
||||
// Index 11: ConditionType/ConditionName
|
||||
Assert.Equal(ObjectTypeIds.ConditionType, filter.SelectClauses[11].TypeDefinitionId);
|
||||
Assert.Equal("ConditionName", filter.SelectClauses[11].BrowsePath[0].Name);
|
||||
|
||||
// Index 12: ConditionType/Comment
|
||||
Assert.Equal(ObjectTypeIds.ConditionType, filter.SelectClauses[12].TypeDefinitionId);
|
||||
Assert.Equal("Comment", filter.SelectClauses[12].BrowsePath[0].Name);
|
||||
}
|
||||
}
|
||||
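The all-or-nothing rule pinned by the `AllUnknownTypes` and `MixedKnownAndUnknown` tests can be sketched without the OPC UA SDK. This is an illustrative sketch under stated assumptions, not the body of `BuildAlarmEventFilter`: the class `WhereClausePlanSketch`, the method `PlanServerSideTypes`, and the lookup table with its placeholder string ids are all hypothetical.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical planner for illustration; the real type-to-NodeId mapping is not shown in this diff.
public static class WhereClausePlanSketch
{
    // Placeholder ids stand in for standard NodeIds. Lookup is case-insensitive.
    private static readonly Dictionary<string, string> KnownTypes =
        new(StringComparer.OrdinalIgnoreCase)
        {
            ["AlarmConditionType"] = "known:AlarmConditionType",
            ["LimitAlarmType"] = "known:LimitAlarmType",
            ["DiscreteAlarmType"] = "known:DiscreteAlarmType",
        };

    // Returns the ids to filter on server-side, or an empty list when no
    // server-side filter should be sent (no names, or any name unmapped).
    public static IReadOnlyList<string> PlanServerSideTypes(IEnumerable<string> requestedNames)
    {
        var mapped = new List<string>();
        foreach (var name in requestedNames)
        {
            if (!KnownTypes.TryGetValue(name, out var id))
            {
                // One unmapped name disables the optimisation entirely:
                // a partial filter would drop that type's events at the server.
                return Array.Empty<string>();
            }
            mapped.Add(id);
        }
        return mapped;
    }
}
```

Returning an empty plan for the mixed case keeps the server-side filter a pure bandwidth optimisation: the client-side gate still sees every event and remains the only place where filtering decisions are made.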
@@ -0,0 +1,113 @@
|
||||
using ZB.MOM.WW.ScadaBridge.Commons.Types.Alarms;
|
||||
using ZB.MOM.WW.ScadaBridge.Commons.Types.Enums;
|
||||
using ZB.MOM.WW.ScadaBridge.DataConnectionLayer;
|
||||
|
||||
namespace ZB.MOM.WW.ScadaBridge.DataConnectionLayer.Tests;
|
||||
|
||||
/// <summary>
|
||||
/// M2.4 (#8): the alarm conditionFilter is a comma-separated, case-insensitive
|
||||
/// list of condition type names. Blank = allow all. These tests pin the
|
||||
/// parse-once / IsAllowed predicate that the DataConnectionActor uses as the
|
||||
/// authoritative client-side gate.
|
||||
/// </summary>
|
||||
public class AlarmConditionFilterTests
|
||||
{
|
||||
private static NativeAlarmTransition Tx(string typeName,
|
||||
AlarmTransitionKind kind = AlarmTransitionKind.Raise) =>
|
||||
new("ref", "obj", typeName, kind,
|
||||
new AlarmConditionState(true, false, null, AlarmShelveState.Unshelved, false, 500),
|
||||
"cat", "desc", "msg", "", "", null, DateTimeOffset.UtcNow, "1", "0");
|
||||
|
||||
[Theory]
|
||||
[InlineData(null)]
|
||||
[InlineData("")]
|
||||
[InlineData(" ")]
|
||||
[InlineData(",")]
|
||||
[InlineData(" , , ")]
|
||||
public void NullOrBlankFilter_IsEmpty_AllowsEverything(string? filter)
|
||||
{
|
||||
var f = AlarmConditionFilter.Parse(filter);
|
||||
Assert.True(f.IsEmpty);
|
||||
Assert.True(f.IsAllowed(Tx("AnalogLimit.Hi")));
|
||||
Assert.True(f.IsAllowed(Tx("anything-at-all")));
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public void Parse_SplitsCommaSeparatedList()
|
||||
{
|
||||
var f = AlarmConditionFilter.Parse("AnalogLimit.Hi,DiscreteAlarm,AnalogLimit.Lo");
|
||||
Assert.False(f.IsEmpty);
|
||||
Assert.True(f.IsAllowed(Tx("AnalogLimit.Hi")));
|
||||
Assert.True(f.IsAllowed(Tx("DiscreteAlarm")));
|
||||
Assert.True(f.IsAllowed(Tx("AnalogLimit.Lo")));
|
||||
Assert.False(f.IsAllowed(Tx("AnalogLimit.HiHi")));
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public void IsAllowed_IsCaseInsensitive()
|
||||
{
|
||||
var f = AlarmConditionFilter.Parse("AnalogLimit.Hi");
|
||||
Assert.True(f.IsAllowed(Tx("analoglimit.hi")));
|
||||
Assert.True(f.IsAllowed(Tx("ANALOGLIMIT.HI")));
|
||||
Assert.False(f.IsAllowed(Tx("DiscreteAlarm")));
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public void Parse_TrimsWhitespaceAroundEachName()
|
||||
{
|
||||
var f = AlarmConditionFilter.Parse(" AnalogLimit.Hi ,\tDiscreteAlarm ");
|
||||
Assert.True(f.IsAllowed(Tx("AnalogLimit.Hi")));
|
||||
Assert.True(f.IsAllowed(Tx("DiscreteAlarm")));
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public void Parse_DropsEmptyEntries_KeepsNonEmpty()
|
||||
{
|
||||
var f = AlarmConditionFilter.Parse("AnalogLimit.Hi,, ,DiscreteAlarm");
|
||||
Assert.False(f.IsEmpty);
|
||||
Assert.True(f.IsAllowed(Tx("AnalogLimit.Hi")));
|
||||
Assert.True(f.IsAllowed(Tx("DiscreteAlarm")));
|
||||
Assert.False(f.IsAllowed(Tx("")));
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public void IsAllowed_NeverDropsSnapshotCompleteFramingSentinel()
|
||||
{
|
||||
// SnapshotComplete is a pure framing sentinel (empty AlarmTypeName) that
|
||||
// drives the NativeAlarmActor's atomic snapshot swap. A type filter must
|
||||
// never swallow it or the snapshot replay never completes.
|
||||
var f = AlarmConditionFilter.Parse("AnalogLimit.Hi");
|
||||
Assert.True(f.IsAllowed(Tx("", AlarmTransitionKind.SnapshotComplete)));
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public void IsAllowed_FiltersReplayedSnapshotConditionsByType()
|
||||
{
|
||||
// Snapshot-kind transitions carry real conditions and ARE filtered.
|
||||
var f = AlarmConditionFilter.Parse("AnalogLimit.Hi");
|
||||
Assert.True(f.IsAllowed(Tx("AnalogLimit.Hi", AlarmTransitionKind.Snapshot)));
|
||||
Assert.False(f.IsAllowed(Tx("DiscreteAlarm", AlarmTransitionKind.Snapshot)));
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public void Names_ExposesNormalizedSet_ForServerSideOptimization()
|
||||
{
|
||||
var f = AlarmConditionFilter.Parse(" AnalogLimit.Hi , DiscreteAlarm ");
|
||||
Assert.Equal(new[] { "AnalogLimit.Hi", "DiscreteAlarm" }, f.Names.OrderBy(n => n).ToArray());
|
||||
Assert.Empty(AlarmConditionFilter.Parse(null).Names);
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public void IsAllowed_OpcUaResolvedFriendlyName_MatchesFriendlyNameFilter()
|
||||
{
|
||||
// M2.4 (#8) regression: for OPC UA, RealOpcUaClient.ResolveAlarmTypeName turns
// each event's AlarmTypeName into a standard friendly type name
// (e.g. "ExclusiveLevelAlarmType"). A friendly-name filter on that source already
// built a correct server WhereClause; the client gate must match on the same
// friendly names and deliver. Previously AlarmTypeName was a NodeId string, so
// the gate dropped every event.
|
||||
var f = AlarmConditionFilter.Parse("ExclusiveLevelAlarmType,DiscreteAlarmType");
|
||||
Assert.True(f.IsAllowed(Tx("ExclusiveLevelAlarmType")));
|
||||
Assert.True(f.IsAllowed(Tx("DiscreteAlarmType")));
|
||||
Assert.False(f.IsAllowed(Tx("OffNormalAlarmType")));
|
||||
}
|
||||
}
|
||||
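The parse-once / `IsAllowed` semantics that these tests pin down can be summarised in a small stand-alone sketch. It is an assumption-laden illustration, not the real `AlarmConditionFilter`: the class name `ConditionFilterSketch` is hypothetical, and `IsAllowed` takes a plain type name plus an `isSnapshotComplete` flag instead of a `NativeAlarmTransition`.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

// Hypothetical stand-in for illustration; behaviour is inferred from the tests above.
public sealed class ConditionFilterSketch
{
    private readonly HashSet<string> _names;

    private ConditionFilterSketch(HashSet<string> names) => _names = names;

    // True when the filter lists no names, which means "allow everything".
    public bool IsEmpty => _names.Count == 0;

    // Splits on commas, trims each entry, drops blank entries, ignores case.
    public static ConditionFilterSketch Parse(string? filter)
    {
        var names = new HashSet<string>(StringComparer.OrdinalIgnoreCase);
        if (!string.IsNullOrWhiteSpace(filter))
        {
            var parts = filter.Split(
                ',',
                StringSplitOptions.RemoveEmptyEntries | StringSplitOptions.TrimEntries);
            foreach (var part in parts)
            {
                names.Add(part);
            }
        }
        return new ConditionFilterSketch(names);
    }

    // The SnapshotComplete framing sentinel always passes; otherwise an empty
    // filter allows everything and a non-empty filter requires a listed name.
    public bool IsAllowed(string typeName, bool isSnapshotComplete = false) =>
        isSnapshotComplete || IsEmpty || _names.Contains(typeName);
}
```

Checking the sentinel before the name lookup is what lets a non-empty filter reject an empty type name in general while still forwarding `SnapshotComplete`, whose type name is also empty.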
+133
-1
@@ -23,10 +23,27 @@ public class DataConnectionActorAlarmTests : TestKit
|
||||
};
|
||||
|
||||
private static NativeAlarmTransition Raise(string sourceRef, string sourceObj) =>
|
||||
new(sourceRef, sourceObj, "AnalogLimit.Hi", AlarmTransitionKind.Raise,
|
||||
Raise(sourceRef, sourceObj, "AnalogLimit.Hi");
|
||||
|
||||
private static NativeAlarmTransition Raise(string sourceRef, string sourceObj, string typeName,
|
||||
AlarmTransitionKind kind = AlarmTransitionKind.Raise) =>
|
||||
new(sourceRef, sourceObj, typeName, kind,
|
||||
new AlarmConditionState(true, false, null, AlarmShelveState.Unshelved, false, 500),
|
||||
"Process", "hi", "hi", "", "", null, DateTimeOffset.UtcNow, "92", "90");
|
||||
|
||||
private static (IDataConnection Adapter, Func<AlarmTransitionCallback?> Cb) BuildAlarmAdapter()
|
||||
{
|
||||
AlarmTransitionCallback? cb = null;
|
||||
var adapter = Substitute.For<IDataConnection, IAlarmSubscribableConnection>();
|
||||
adapter.ConnectAsync(Arg.Any<IDictionary<string, string>>(), Arg.Any<CancellationToken>())
|
||||
.Returns(Task.CompletedTask);
|
||||
((IAlarmSubscribableConnection)adapter)
|
||||
.SubscribeAlarmsAsync(Arg.Any<string>(), Arg.Any<string?>(),
|
||||
Arg.Do<AlarmTransitionCallback>(c => cb = c), Arg.Any<CancellationToken>())
|
||||
.Returns(Task.FromResult("alarm-sub-1"));
|
||||
return (adapter, () => cb);
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public void SubscribeAlarms_RoutesTransitionToInstanceSubscriber()
|
||||
{
|
||||
@@ -63,4 +80,119 @@ public class DataConnectionActorAlarmTests : TestKit
|
||||
actor.Tell(new SubscribeAlarmsRequest("c", "inst", "conn", "Tank01", null, DateTimeOffset.UtcNow));
|
||||
ExpectMsg<SubscribeAlarmsResponse>(m => !m.Success && m.ErrorMessage != null);
|
||||
}
|
||||
|
||||
// ── M2.4 (#8): conditionFilter is now applied client-side in the actor ──
|
||||
|
||||
[Fact]
|
||||
public void SubscribeAlarms_WithTypeFilter_DeliversOnlyMatchingTypes()
|
||||
{
|
||||
var (adapter, getCb) = BuildAlarmAdapter();
|
||||
var actor = Sys.ActorOf(Props.Create(() => new DataConnectionActor(
|
||||
"conn", adapter, _options, _health, _factory, "OpcUa")));
|
||||
|
||||
actor.Tell(new SubscribeAlarmsRequest("c", "inst", "conn", "Tank01",
|
||||
"AnalogLimit.Hi,AnalogLimit.Lo", DateTimeOffset.UtcNow));
|
||||
ExpectMsg<SubscribeAlarmsResponse>(m => m.Success);
|
||||
var cb = getCb();
|
||||
Assert.NotNull(cb);
|
||||
|
||||
// Non-matching type is dropped (no message delivered).
|
||||
cb!(Raise("Tank01.HiHi", "Tank01", "AnalogLimit.HiHi"));
|
||||
ExpectNoMsg(TimeSpan.FromMilliseconds(250));
|
||||
|
||||
// Matching type is delivered.
|
||||
cb!(Raise("Tank01.Hi", "Tank01", "AnalogLimit.Hi"));
|
||||
ExpectMsg<NativeAlarmTransitionUpdate>(u => u.Transition.AlarmTypeName == "AnalogLimit.Hi");
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public void SubscribeAlarms_WithNullFilter_DeliversAllTypes()
|
||||
{
|
||||
var (adapter, getCb) = BuildAlarmAdapter();
|
||||
var actor = Sys.ActorOf(Props.Create(() => new DataConnectionActor(
|
||||
"conn", adapter, _options, _health, _factory, "OpcUa")));
|
||||
|
||||
actor.Tell(new SubscribeAlarmsRequest("c", "inst", "conn", "Tank01", null, DateTimeOffset.UtcNow));
|
||||
ExpectMsg<SubscribeAlarmsResponse>(m => m.Success);
|
||||
var cb = getCb();
|
||||
Assert.NotNull(cb);
|
||||
|
||||
cb!(Raise("Tank01.HiHi", "Tank01", "AnalogLimit.HiHi"));
|
||||
ExpectMsg<NativeAlarmTransitionUpdate>(u => u.Transition.AlarmTypeName == "AnalogLimit.HiHi");
|
||||
cb!(Raise("Tank01.Lo", "Tank01", "DiscreteAlarm"));
|
||||
ExpectMsg<NativeAlarmTransitionUpdate>(u => u.Transition.AlarmTypeName == "DiscreteAlarm");
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public void SubscribeAlarms_FilterMatch_IgnoresCaseAndWhitespace()
|
||||
{
|
||||
var (adapter, getCb) = BuildAlarmAdapter();
|
||||
var actor = Sys.ActorOf(Props.Create(() => new DataConnectionActor(
|
||||
"conn", adapter, _options, _health, _factory, "OpcUa")));
|
||||
|
||||
actor.Tell(new SubscribeAlarmsRequest("c", "inst", "conn", "Tank01",
|
||||
" analoglimit.hi ,\tDISCRETEALARM ", DateTimeOffset.UtcNow));
|
||||
ExpectMsg<SubscribeAlarmsResponse>(m => m.Success);
|
||||
var cb = getCb();
|
||||
Assert.NotNull(cb);
|
||||
|
||||
cb!(Raise("Tank01.Hi", "Tank01", "AnalogLimit.Hi")); // case differs from filter
|
||||
ExpectMsg<NativeAlarmTransitionUpdate>(u => u.Transition.AlarmTypeName == "AnalogLimit.Hi");
|
||||
cb!(Raise("Tank01.Disc", "Tank01", "DiscreteAlarm"));
|
||||
ExpectMsg<NativeAlarmTransitionUpdate>(u => u.Transition.AlarmTypeName == "DiscreteAlarm");
|
||||
cb!(Raise("Tank01.HiHi", "Tank01", "AnalogLimit.HiHi")); // not listed
|
||||
ExpectNoMsg(TimeSpan.FromMilliseconds(250));
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public void SubscribeAlarms_GatewayWideFeed_IsFilteredClientSide()
|
||||
{
|
||||
// MxGateway has no server-side filter: its adapter opens ONE gateway-wide
|
||||
// feed and the actor is the authoritative gate. A filtered source must
|
||||
// only see its own matching types even though the feed carries everything.
|
||||
var (adapter, getCb) = BuildAlarmAdapter();
|
||||
var actor = Sys.ActorOf(Props.Create(() => new DataConnectionActor(
|
||||
"conn", adapter, _options, _health, _factory, "MxGateway")));
|
||||
|
||||
actor.Tell(new SubscribeAlarmsRequest("c", "inst", "conn", "Reactor",
|
||||
"HighTemp", DateTimeOffset.UtcNow));
|
||||
ExpectMsg<SubscribeAlarmsResponse>(m => m.Success);
|
||||
var cb = getCb();
|
||||
Assert.NotNull(cb);
|
||||
|
||||
// Gateway-wide feed delivers a transition for a different source object —
|
||||
// dropped by source routing.
|
||||
cb!(Raise("Pump.Fault", "Pump", "HighTemp"));
|
||||
ExpectNoMsg(TimeSpan.FromMilliseconds(200));
|
||||
// Right source, wrong type — dropped by the client-side type gate.
|
||||
cb!(Raise("Reactor.LowTemp", "Reactor", "LowTemp"));
|
||||
ExpectNoMsg(TimeSpan.FromMilliseconds(200));
|
||||
// Right source, right type — delivered.
|
||||
cb!(Raise("Reactor.HighTemp", "Reactor", "HighTemp"));
|
||||
ExpectMsg<NativeAlarmTransitionUpdate>(u =>
|
||||
u.Transition.SourceObjectReference == "Reactor" && u.Transition.AlarmTypeName == "HighTemp");
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public void SubscribeAlarms_WithFilter_StillForwardsSnapshotCompleteSentinel()
|
||||
{
|
||||
// The SnapshotComplete framing sentinel (empty AlarmTypeName) must survive
|
||||
// the type gate so the NativeAlarmActor's snapshot swap can complete.
|
||||
var (adapter, getCb) = BuildAlarmAdapter();
|
||||
var actor = Sys.ActorOf(Props.Create(() => new DataConnectionActor(
|
||||
"conn", adapter, _options, _health, _factory, "OpcUa")));
|
||||
|
||||
actor.Tell(new SubscribeAlarmsRequest("c", "inst", "conn", "Tank01",
|
||||
"AnalogLimit.Hi", DateTimeOffset.UtcNow));
|
||||
ExpectMsg<SubscribeAlarmsResponse>(m => m.Success);
|
||||
var cb = getCb();
|
||||
Assert.NotNull(cb);
|
||||
|
||||
// Snapshot-complete sentinel: empty AlarmTypeName (the framing marker) but
|
||||
// routed because every subscriber receives it; never type-filtered.
|
||||
cb!(new NativeAlarmTransition("Tank01", "Tank01", "", AlarmTransitionKind.SnapshotComplete,
|
||||
new AlarmConditionState(false, true, null, AlarmShelveState.Unshelved, false, 0),
|
||||
"", "", "", "", "", null, DateTimeOffset.UtcNow, "", ""));
|
||||
ExpectMsg<NativeAlarmTransitionUpdate>(u => u.Transition.Kind == AlarmTransitionKind.SnapshotComplete);
|
||||
}
|
||||
}
|
||||
|
||||
@@ -1,3 +1,4 @@
|
||||
using ZB.MOM.WW.MxGateway.Client;
|
||||
using ZB.MOM.WW.MxGateway.Contracts.Proto;
|
||||
using ZB.MOM.WW.ScadaBridge.DataConnectionLayer.Adapters;
|
||||
using CommonsTransitionKind = ZB.MOM.WW.ScadaBridge.Commons.Types.Enums.AlarmTransitionKind;
|
||||
@@ -63,4 +64,91 @@ public class MxGatewayAlarmMapperTests
|
||||
Assert.False(t.Condition.Acknowledged);
|
||||
Assert.Equal(1000, t.Condition.Severity);
|
||||
}
|
||||
|
||||
// ── CurrentValue / LimitValue (M2.13 / #27) ──────────────────────────────
|
||||
|
||||
[Fact]
|
||||
public void MapTransition_CurrentAndLimitValue_PopulatedFromProto()
|
||||
{
|
||||
// The gateway proto OnAlarmTransitionEvent carries current_value and
|
||||
// limit_value as MxValue union fields. Verify both are mapped through
|
||||
// MxValueToString into the neutral NativeAlarmTransition strings.
|
||||
var ev = new OnAlarmTransitionEvent
|
||||
{
|
||||
AlarmFullReference = "Tank01.Level.HiHi",
|
||||
SourceObjectReference = "Tank01",
|
||||
AlarmTypeName = "AnalogLimitAlarm.HiHi",
|
||||
TransitionKind = ProtoTransitionKind.Raise,
|
||||
Severity = 800,
|
||||
CurrentValue = 95.3.ToMxValue(),
|
||||
LimitValue = 90.0.ToMxValue()
|
||||
};
|
||||
|
||||
var t = MxGatewayAlarmMapper.MapTransition(ev);
|
||||
|
||||
Assert.Equal("95.3", t.CurrentValue);
|
||||
Assert.Equal("90", t.LimitValue);
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public void MapTransition_AbsentCurrentAndLimitValue_YieldsEmpty()
|
||||
{
|
||||
// When the gateway sends events without current/limit value fields (optional),
|
||||
// the resulting transition must have empty strings — never null.
|
||||
var ev = new OnAlarmTransitionEvent
|
||||
{
|
||||
AlarmFullReference = "Tank01.Level.Hi",
|
||||
SourceObjectReference = "Tank01",
|
||||
AlarmTypeName = "AnalogLimitAlarm.Hi",
|
||||
TransitionKind = ProtoTransitionKind.Raise,
|
||||
Severity = 600
|
||||
// CurrentValue and LimitValue not set → proto default (null reference)
|
||||
};
|
||||
|
||||
var t = MxGatewayAlarmMapper.MapTransition(ev);
|
||||
|
||||
Assert.Equal("", t.CurrentValue);
|
||||
Assert.Equal("", t.LimitValue);
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public void MapSnapshot_CurrentAndLimitValue_PopulatedFromProto()
|
||||
{
|
||||
// ActiveAlarmSnapshot also carries current_value and limit_value.
|
||||
var snap = new ActiveAlarmSnapshot
|
||||
{
|
||||
AlarmFullReference = "Pump01.Vibration.HiHi",
|
||||
SourceObjectReference = "Pump01",
|
||||
AlarmTypeName = "AnalogLimitAlarm.HiHi",
|
||||
CurrentState = ProtoConditionState.Active,
|
||||
Severity = 900,
|
||||
CurrentValue = 12.7.ToMxValue(),
|
||||
LimitValue = 10.0.ToMxValue()
|
||||
};
|
||||
|
||||
var t = MxGatewayAlarmMapper.MapSnapshot(snap);
|
||||
|
||||
Assert.Equal("12.7", t.CurrentValue);
|
||||
Assert.Equal("10", t.LimitValue);
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public void MapSnapshot_StringMxValue_ProducesStringCurrentValue()
|
||||
{
|
||||
// MxValue can carry string values (e.g. for discrete/string-type tags).
|
||||
var snap = new ActiveAlarmSnapshot
|
||||
{
|
||||
AlarmFullReference = "Mode.Alarm",
|
||||
SourceObjectReference = "Mode",
|
||||
AlarmTypeName = "DiscreteAlarm",
|
||||
CurrentState = ProtoConditionState.Active,
|
||||
Severity = 500,
|
||||
CurrentValue = "FAULT".ToMxValue()
|
||||
};
|
||||
|
||||
var t = MxGatewayAlarmMapper.MapSnapshot(snap);
|
||||
|
||||
Assert.Equal("FAULT", t.CurrentValue);
|
||||
Assert.Equal("", t.LimitValue); // not set
|
||||
}
|
||||
}
|
||||
|
||||
@@ -55,4 +55,54 @@ public class OpcUaAlarmMapperTests
|
||||
{
|
||||
Assert.Equal(expected, OpcUaAlarmMapper.MapShelve(name));
|
||||
}
|
||||
|
||||
// ── PickLimitValue (M2.13 / #27) ─────────────────────────────────────────
|
||||
|
||||
[Fact]
|
||||
public void PickLimitValue_AllNull_ReturnsEmpty()
|
||||
{
|
||||
// All four limit fields absent (non-limit alarm type) → empty string.
|
||||
Assert.Equal("", OpcUaAlarmMapper.PickLimitValue(null, null, null, null));
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public void PickLimitValue_HighHighLimitPresent_ReturnsIt()
|
||||
{
|
||||
// HighHighLimit takes top priority; other fields are null (absent).
|
||||
var result = OpcUaAlarmMapper.PickLimitValue(100.5, null, null, null);
|
||||
Assert.Equal("100.5", result);
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public void PickLimitValue_OnlyHighLimit_ReturnsHighLimit()
|
||||
{
|
||||
// Only HighLimit present (HighHighLimit absent on this alarm type).
|
||||
var result = OpcUaAlarmMapper.PickLimitValue(null, 80.0, null, null);
|
||||
Assert.Equal("80", result);
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public void PickLimitValue_PriorityOrder_HighHighWinsOverHigh()
|
||||
{
|
||||
// When multiple limits are present, HighHighLimit takes precedence.
|
||||
var result = OpcUaAlarmMapper.PickLimitValue(95.0, 80.0, 20.0, 5.0);
|
||||
Assert.Equal("95", result);
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public void PickLimitValue_OnlyLowLow_ReturnsLowLow()
|
||||
{
|
||||
// LowLowLimit only — last in priority, but should still be returned.
|
||||
var result = OpcUaAlarmMapper.PickLimitValue(null, null, null, -10.5);
|
||||
Assert.Equal("-10.5", result);
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public void PickLimitValue_UsesInvariantCulture()
|
||||
{
|
||||
// Decimal separator must always be '.' regardless of thread culture.
|
||||
var result = OpcUaAlarmMapper.PickLimitValue(1.5, null, null, null);
|
||||
Assert.Contains('.', result); // invariant culture: '.' not ','
|
||||
Assert.Equal("1.5", result);
|
||||
}
|
||||
}
|
||||
|
||||
+63
@@ -0,0 +1,63 @@
|
||||
using Opc.Ua;
|
||||
using ZB.MOM.WW.ScadaBridge.DataConnectionLayer.Adapters;
|
||||
|
||||
namespace ZB.MOM.WW.ScadaBridge.DataConnectionLayer.Tests;
|
||||
|
||||
/// <summary>
|
||||
/// M2.4 (#8) regression: standard OPC UA A&C events carry an event-type
|
||||
/// <see cref="NodeId"/> (e.g. <c>i=9341</c> for ExclusiveLevelAlarmType), but the
|
||||
/// client-side conditionFilter gate — and the server-side WhereClause — both key off
|
||||
/// the friendly type names in <see cref="RealOpcUaClient.KnownConditionTypeIds"/>.
|
||||
/// <see cref="RealOpcUaClient.ResolveAlarmTypeName"/> bridges the two by resolving the
|
||||
/// event-type NodeId back to its friendly name (NodeId-string fallback for custom
|
||||
/// types), so a friendly-name filter actually matches the events the server delivers.
|
||||
/// </summary>
|
||||
public class RealOpcUaClientAlarmFilterTests
|
||||
{
|
||||
[Fact]
|
||||
public void ResolveAlarmTypeName_KnownStandardNodeId_ReturnsFriendlyName()
|
||||
{
|
||||
// The well-known NodeId for ExclusiveLevelAlarmType (i=9341) must resolve to
|
||||
// the friendly name the conditionFilter/WhereClause use.
|
||||
var resolved = RealOpcUaClient.ResolveAlarmTypeName(ObjectTypeIds.ExclusiveLevelAlarmType);
|
||||
Assert.Equal("ExclusiveLevelAlarmType", resolved);
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public void ResolveAlarmTypeName_DiscreteAlarmNodeId_ReturnsFriendlyName()
|
||||
{
|
||||
var resolved = RealOpcUaClient.ResolveAlarmTypeName(ObjectTypeIds.DiscreteAlarmType);
|
||||
Assert.Equal("DiscreteAlarmType", resolved);
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public void ResolveAlarmTypeName_UnknownCustomNodeId_ReturnsNodeIdString()
|
||||
{
|
||||
// A vendor/custom subtype not in KnownConditionTypeIds: we cannot map it to a
|
||||
// friendly name, so we fall back to its NodeId string. This is consistent —
|
||||
// the WhereClause is also omitted for unknown names, so the client gate matches
|
||||
// the NodeId string, which is the only thing such a filter could carry.
|
||||
var custom = new NodeId(987654u, 7);
|
||||
var resolved = RealOpcUaClient.ResolveAlarmTypeName(custom);
|
||||
Assert.Equal(custom.ToString(), resolved);
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public void ResolveAlarmTypeName_Null_ReturnsEmptyString()
|
||||
{
|
||||
Assert.Equal("", RealOpcUaClient.ResolveAlarmTypeName(null));
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public void InverseMap_RoundTrips_EveryKnownConditionType()
|
||||
{
|
||||
// The friendly→NodeId map (KnownConditionTypeIds) and the NodeId→friendly map
|
||||
// are derived from a single source of truth, so they must round-trip for every
|
||||
// entry — guards against the two maps drifting apart.
|
||||
foreach (var (friendlyName, nodeId) in RealOpcUaClient.KnownConditionTypeIds)
|
||||
{
|
||||
var resolved = RealOpcUaClient.ResolveAlarmTypeName(nodeId);
|
||||
Assert.Equal(friendlyName, resolved);
|
||||
}
|
||||
}
|
||||
}
|
||||
+2
@@ -22,6 +22,8 @@
|
||||
uses a plain [Fact] — it never needs the server.
|
||||
-->
|
||||
<PackageReference Include="Xunit.SkippableFact" />
|
||||
<!-- MxGateway.Client brings MxValueExtensions (ToClrValue) used by MxGatewayAlarmMapper tests. -->
|
||||
<PackageReference Include="ZB.MOM.WW.MxGateway.Client" />
|
||||
</ItemGroup>
|
||||
|
||||
<ItemGroup>
|
||||
|
||||
+122
@@ -0,0 +1,122 @@
|
||||
using NSubstitute;
|
||||
using ZB.MOM.WW.ScadaBridge.Commons.Entities.Instances;
|
||||
using ZB.MOM.WW.ScadaBridge.Commons.Entities.Sites;
|
||||
using ZB.MOM.WW.ScadaBridge.Commons.Entities.Templates;
|
||||
using ZB.MOM.WW.ScadaBridge.Commons.Interfaces.Repositories;
|
||||
using ZB.MOM.WW.ScadaBridge.Commons.Types.Enums;
|
||||
using ZB.MOM.WW.ScadaBridge.Commons.Types.Flattening;
|
||||
using ZB.MOM.WW.ScadaBridge.DeploymentManager;
|
||||
using ZB.MOM.WW.ScadaBridge.TemplateEngine.Flattening;
|
||||
using ZB.MOM.WW.ScadaBridge.TemplateEngine.Validation;
|
||||
|
||||
namespace ZB.MOM.WW.ScadaBridge.DeploymentManager.Tests;
|
||||
|
||||
/// <summary>
|
||||
/// M2.8 (#23): proves the deploy path (FlatteningPipeline.FlattenAndValidateAsync)
|
||||
/// opts into connection-binding enforcement, so a data-sourced attribute with no
|
||||
/// binding gates the deployment as an ERROR (not just a warning), and that a binding
|
||||
/// resolving to a connection that actually exists at the target site passes.
|
||||
/// </summary>
|
||||
public class FlatteningPipelineConnectionBindingTests
|
||||
{
|
||||
private const int InstanceId = 1;
|
||||
private const int TemplateId = 10;
|
||||
private const int SiteId = 100;
|
||||
private const int ConnectionId = 7;
|
||||
|
||||
private readonly ITemplateEngineRepository _templateRepo = Substitute.For<ITemplateEngineRepository>();
|
||||
private readonly ISiteRepository _siteRepo = Substitute.For<ISiteRepository>();
|
||||
private readonly FlatteningPipeline _sut;
|
||||
|
||||
public FlatteningPipelineConnectionBindingTests()
|
||||
{
|
||||
_sut = new FlatteningPipeline(
|
||||
_templateRepo,
|
||||
_siteRepo,
|
||||
new FlatteningService(),
|
||||
new ValidationService(),
|
||||
new RevisionHashService());
|
||||
}
|
||||
|
||||
/// <summary>
|
||||
/// Seeds a single-template chain with one data-sourced attribute ("Temp") and a
|
||||
/// site that owns a single "PlantBus" data connection. The instance optionally
|
||||
/// binds "Temp" to <paramref name="boundConnectionId"/>.
|
||||
/// </summary>
|
||||
private void Arrange(int? boundConnectionId)
|
||||
{
|
||||
var template = new Template("Tank") { Id = TemplateId };
|
||||
template.Attributes.Add(new TemplateAttribute("Temp")
|
||||
{
|
||||
DataType = DataType.Double,
|
||||
DataSourceReference = "ns=2;s=Temp"
|
||||
});
|
||||
|
||||
var instance = new Instance("Tank-01") { Id = InstanceId, TemplateId = TemplateId, SiteId = SiteId };
|
||||
if (boundConnectionId.HasValue)
|
||||
{
|
||||
instance.ConnectionBindings.Add(new InstanceConnectionBinding("Temp")
|
||||
{
|
||||
InstanceId = InstanceId,
|
||||
DataConnectionId = boundConnectionId.Value
|
||||
});
|
||||
}
|
||||
|
||||
_templateRepo.GetInstanceByIdAsync(InstanceId, Arg.Any<CancellationToken>()).Returns(instance);
|
||||
_templateRepo.GetTemplateWithChildrenAsync(TemplateId, Arg.Any<CancellationToken>()).Returns(template);
|
||||
_templateRepo.GetCompositionsByTemplateIdAsync(TemplateId, Arg.Any<CancellationToken>()).Returns([]);
|
||||
_templateRepo.GetAllSharedScriptsAsync(Arg.Any<CancellationToken>()).Returns([]);
|
||||
|
||||
var connection = new DataConnection("PlantBus", "OpcUa", SiteId) { Id = ConnectionId };
|
||||
_siteRepo.GetDataConnectionsBySiteIdAsync(SiteId, Arg.Any<CancellationToken>())
|
||||
.Returns([connection]);
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public async Task FlattenAndValidate_DataSourcedAttributeWithNoBinding_ReportsBindingError()
|
||||
{
|
||||
Arrange(boundConnectionId: null);
|
||||
|
||||
var result = await _sut.FlattenAndValidateAsync(InstanceId);
|
||||
|
||||
Assert.True(result.IsSuccess);
|
||||
Assert.False(result.Value.Validation.IsValid);
|
||||
Assert.Contains(result.Value.Validation.Errors,
|
||||
e => e.Category == ValidationCategory.ConnectionBinding);
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public async Task FlattenAndValidate_BindingToExistingSiteConnection_NoBindingError()
|
||||
{
|
||||
Arrange(boundConnectionId: ConnectionId);
|
||||
|
||||
var result = await _sut.FlattenAndValidateAsync(InstanceId);
|
||||
|
||||
Assert.True(result.IsSuccess);
|
||||
Assert.DoesNotContain(result.Value.Validation.Errors,
|
||||
e => e.Category == ValidationCategory.ConnectionBinding);
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public async Task FlattenAndValidate_BindingToStaleDeletedConnection_ReportsBindingError()
|
||||
{
|
||||
// M2.8 (#23): FlatteningService.ApplyConnectionBindings silently drops a
|
||||
// binding whose DataConnectionId doesn't resolve to any loaded site
|
||||
// DataConnection (stale / deleted connection). The flattener leaves
|
||||
// BoundDataConnectionId == null, so the validator treats the attribute as
|
||||
// unbound and gates the deployment with a ConnectionBinding Error.
|
||||
//
|
||||
// Arrange: the instance binding points at id 999, but the site only has
|
||||
// the connection with id=ConnectionId (7). The flattener can't resolve 999
|
||||
// and drops the binding silently; the validator then flags it.
|
||||
const int StaleConnectionId = 999;
|
||||
Arrange(boundConnectionId: StaleConnectionId);
|
||||
|
||||
var result = await _sut.FlattenAndValidateAsync(InstanceId);
|
||||
|
||||
Assert.True(result.IsSuccess);
|
||||
Assert.False(result.Value.Validation.IsValid);
|
||||
Assert.Contains(result.Value.Validation.Errors,
|
||||
e => e.Category == ValidationCategory.ConnectionBinding);
|
||||
}
|
||||
}
|
||||
+102
@@ -0,0 +1,102 @@
|
||||
using NSubstitute;
|
||||
using ZB.MOM.WW.ScadaBridge.Commons.Entities.Instances;
|
||||
using ZB.MOM.WW.ScadaBridge.Commons.Entities.Sites;
|
||||
using ZB.MOM.WW.ScadaBridge.Commons.Entities.Templates;
|
||||
using ZB.MOM.WW.ScadaBridge.Commons.Interfaces.Repositories;
|
||||
using ZB.MOM.WW.ScadaBridge.Commons.Types.Flattening;
|
||||
using ZB.MOM.WW.ScadaBridge.DeploymentManager;
|
||||
using ZB.MOM.WW.ScadaBridge.TemplateEngine.Flattening;
|
||||
using ZB.MOM.WW.ScadaBridge.TemplateEngine.Validation;
|
||||
|
||||
namespace ZB.MOM.WW.ScadaBridge.DeploymentManager.Tests;
|
||||
|
||||
/// <summary>
|
||||
/// M2.1 (#22): proves the FlatteningPipeline actually computes the alarm-capable
|
||||
/// connection set from the loaded site data connections and threads it through
|
||||
/// ValidationService → SemanticValidator. Before the fix the pipeline loaded the
|
||||
/// connections but never passed the capable set, so the native-alarm-source
|
||||
/// capability check (built but inert) never ran in production — a source bound to
|
||||
/// a non-alarm-capable connection deployed silently.
|
||||
/// </summary>
|
||||
public class FlatteningPipelineNativeAlarmCapabilityTests
|
||||
{
|
||||
private const int InstanceId = 1;
|
||||
private const int TemplateId = 10;
|
||||
private const int SiteId = 100;
|
||||
|
||||
private readonly ITemplateEngineRepository _templateRepo = Substitute.For<ITemplateEngineRepository>();
|
||||
private readonly ISiteRepository _siteRepo = Substitute.For<ISiteRepository>();
|
||||
private readonly FlatteningPipeline _sut;
|
||||
|
||||
public FlatteningPipelineNativeAlarmCapabilityTests()
|
||||
{
|
||||
_sut = new FlatteningPipeline(
|
||||
_templateRepo,
|
||||
_siteRepo,
|
||||
new FlatteningService(),
|
||||
new ValidationService(),
|
||||
new RevisionHashService());
|
||||
}
|
||||
|
||||
/// <summary>
|
||||
/// Seeds a single-template chain whose only template carries one native alarm
|
||||
/// source bound to <paramref name="boundConnectionName"/>, and a site that owns a
|
||||
/// single data connection named <paramref name="connectionName"/> that uses <paramref name="connectionProtocol"/>.
|
||||
/// </summary>
|
||||
private void Arrange(string connectionName, string connectionProtocol, string boundConnectionName)
|
||||
{
|
||||
var template = new Template("Tank") { Id = TemplateId };
|
||||
template.NativeAlarmSources.Add(new TemplateNativeAlarmSource("BoilerAlarms")
|
||||
{
|
||||
ConnectionName = boundConnectionName,
|
||||
SourceReference = "ns=2;s=Boiler",
|
||||
});
|
||||
|
||||
var instance = new Instance("Tank-01") { Id = InstanceId, TemplateId = TemplateId, SiteId = SiteId };
|
||||
|
||||
_templateRepo.GetInstanceByIdAsync(InstanceId, Arg.Any<CancellationToken>()).Returns(instance);
|
||||
_templateRepo.GetTemplateWithChildrenAsync(TemplateId, Arg.Any<CancellationToken>()).Returns(template);
|
||||
_templateRepo.GetCompositionsByTemplateIdAsync(TemplateId, Arg.Any<CancellationToken>())
|
||||
.Returns([]);
|
||||
_templateRepo.GetAllSharedScriptsAsync(Arg.Any<CancellationToken>())
|
||||
.Returns([]);
|
||||
|
||||
var connection = new DataConnection(connectionName, connectionProtocol, SiteId) { Id = 7 };
|
||||
_siteRepo.GetDataConnectionsBySiteIdAsync(SiteId, Arg.Any<CancellationToken>())
|
||||
.Returns([connection]);
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public async Task FlattenAndValidate_NativeAlarmSourceOnNonAlarmCapableConnection_ReportsCapabilityError()
|
||||
{
|
||||
// A "Modbus" connection is NOT alarm-capable (no IAlarmSubscribableConnection adapter).
|
||||
Arrange(connectionName: "PlantBus", connectionProtocol: "Modbus", boundConnectionName: "PlantBus");
|
||||
|
||||
var result = await _sut.FlattenAndValidateAsync(InstanceId);
|
||||
|
||||
Assert.True(result.IsSuccess);
|
||||
Assert.Contains(result.Value.Validation.Errors,
|
||||
e => e.Category == ValidationCategory.NativeAlarmSourceInvalid
|
||||
&& e.Message.Contains("alarm-capable"));
|
||||
}
|
||||
|
||||
[Theory]
|
||||
[InlineData("OpcUa")]
|
||||
[InlineData("MxGateway")]
|
||||
// Case variants: IsAlarmCapable uses OrdinalIgnoreCase, matching DataConnectionFactory's
|
||||
// own OrdinalIgnoreCase protocol-key lookup; lock the contract with non-canonical casing.
|
||||
[InlineData("OPCUA")]
|
||||
[InlineData("opcua")]
|
||||
[InlineData("mxgateway")]
|
||||
[InlineData("MXGATEWAY")]
|
||||
public async Task FlattenAndValidate_NativeAlarmSourceOnAlarmCapableConnection_NoCapabilityError(string protocol)
|
||||
{
|
||||
Arrange(connectionName: "Boiler", connectionProtocol: protocol, boundConnectionName: "Boiler");
|
||||
|
||||
var result = await _sut.FlattenAndValidateAsync(InstanceId);
|
||||
|
||||
Assert.True(result.IsSuccess);
|
||||
Assert.DoesNotContain(result.Value.Validation.Errors,
|
||||
e => e.Category == ValidationCategory.NativeAlarmSourceInvalid);
|
||||
}
|
||||
}
|
||||
@@ -100,7 +100,14 @@ public class DatabaseGatewayTests
|
||||
var sf = new ZB.MOM.WW.ScadaBridge.StoreAndForward.StoreAndForwardService(
|
||||
storage, sfOptions, NullLogger<ZB.MOM.WW.ScadaBridge.StoreAndForward.StoreAndForwardService>.Instance);
|
||||
|
||||
var gateway = new DatabaseGateway(_repository, NullLogger<DatabaseGateway>.Instance, storeAndForward: sf);
|
||||
// M2.3 (#7): CachedWriteAsync now attempts the write immediately and
|
||||
// only buffers on a TRANSIENT failure. The stub forces a transient
|
||||
// outcome so this test exercises the buffering path deterministically
|
||||
// without a real SQL Server.
|
||||
var gateway = new ExecuteStubGateway(
|
||||
_repository,
|
||||
sf,
|
||||
onExecute: () => throw new TransientDatabaseException("deadlock", errorNumber: 1205));
|
||||
|
||||
// Audit Log #23 (ExecutionId Task 4): a known execution id / source
|
||||
// script so the gateway -> EnqueueAsync hop can be asserted below.
|
||||
@@ -157,7 +164,11 @@ public class DatabaseGatewayTests
|
||||
var sf = new ZB.MOM.WW.ScadaBridge.StoreAndForward.StoreAndForwardService(
|
||||
storage, sfOptions, NullLogger<ZB.MOM.WW.ScadaBridge.StoreAndForward.StoreAndForwardService>.Instance);
|
||||
|
||||
var gateway = new DatabaseGateway(_repository, NullLogger<DatabaseGateway>.Instance, storeAndForward: sf);
|
||||
// M2.3 (#7): force a transient outcome so the write reaches S&F.
|
||||
var gateway = new ExecuteStubGateway(
|
||||
_repository,
|
||||
sf,
|
||||
onExecute: () => throw new TransientDatabaseException("deadlock", errorNumber: 1205));
|
||||
|
||||
await gateway.CachedWriteAsync("testDb", "INSERT INTO t VALUES (1)");
|
||||
|
||||
@@ -167,6 +178,377 @@ public class DatabaseGatewayTests
|
||||
Assert.NotEqual(0, maxRetries);
|
||||
}
|
||||
|
||||
// ── M2.3 (#7): transient-vs-permanent SQL classification on the immediate
|
||||
// cached-write attempt + the buffered retry path ──
|
||||
|
||||
/// <summary>
|
||||
/// Builds a real, initialised in-memory store-and-forward service plus a
|
||||
/// keep-alive connection (the SQLite shared-cache DB lives only while a
|
||||
/// connection is open). The caller disposes <paramref name="keepAlive"/>.
|
||||
/// </summary>
|
||||
private static (ZB.MOM.WW.ScadaBridge.StoreAndForward.StoreAndForwardService Sf, string ConnStr, Microsoft.Data.Sqlite.SqliteConnection KeepAlive)
|
||||
NewStoreAndForward()
|
||||
{
|
||||
var dbName = $"EsgCachedWriteClassify_{Guid.NewGuid():N}";
|
||||
var connStr = $"Data Source={dbName};Mode=Memory;Cache=Shared";
|
||||
var keepAlive = new Microsoft.Data.Sqlite.SqliteConnection(connStr);
|
||||
keepAlive.Open();
|
||||
var storage = new ZB.MOM.WW.ScadaBridge.StoreAndForward.StoreAndForwardStorage(
|
||||
connStr, NullLogger<ZB.MOM.WW.ScadaBridge.StoreAndForward.StoreAndForwardStorage>.Instance);
|
||||
storage.InitializeAsync().GetAwaiter().GetResult();
|
||||
var sfOptions = new ZB.MOM.WW.ScadaBridge.StoreAndForward.StoreAndForwardOptions
|
||||
{
|
||||
DefaultMaxRetries = 99,
|
||||
DefaultRetryInterval = TimeSpan.FromMinutes(10),
|
||||
RetryTimerInterval = TimeSpan.FromMinutes(10),
|
||||
};
|
||||
var sf = new ZB.MOM.WW.ScadaBridge.StoreAndForward.StoreAndForwardService(
|
||||
storage, sfOptions, NullLogger<ZB.MOM.WW.ScadaBridge.StoreAndForward.StoreAndForwardService>.Instance);
|
||||
return (sf, connStr, keepAlive);
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public async Task CachedWrite_PermanentSqlError_ReturnsFailedSynchronously_NotBuffered()
|
||||
{
|
||||
// A constraint/syntax/permission failure on the IMMEDIATE attempt must
|
||||
// be returned to the script as Failed and must NOT be buffered — mirrors
|
||||
// ExternalSystemClient.CachedCallAsync's PermanentExternalSystemException
|
||||
// path.
|
||||
var conn = new DatabaseConnectionDefinition("testDb", "Server=localhost;Database=test") { Id = 1 };
|
||||
StubConnection(conn);
|
||||
|
||||
var (sf, connStr, keepAlive) = NewStoreAndForward();
|
||||
using var _ = keepAlive;
|
||||
|
||||
var gateway = new ExecuteStubGateway(
|
||||
_repository,
|
||||
sf,
|
||||
onExecute: () => throw new PermanentDatabaseException(
|
||||
"Violation of PRIMARY KEY constraint", errorNumber: 2627));
|
||||
|
||||
var result = await gateway.CachedWriteAsync("testDb", "INSERT INTO t VALUES (1)");
|
||||
|
||||
Assert.False(result.Success);
|
||||
Assert.False(result.WasBuffered);
|
||||
Assert.NotNull(result.ErrorMessage);
|
||||
|
||||
// Nothing buffered — the permanent failure short-circuited S&F.
|
||||
Assert.Equal(0, ReadBufferDepth(connStr));
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public async Task CachedWrite_TransientSqlError_BuffersToStoreAndForward()
|
||||
{
|
||||
// A deadlock / timeout on the IMMEDIATE attempt is transient — the write
|
||||
// is handed to S&F (WasBuffered=true), not returned as Failed.
|
||||
var conn = new DatabaseConnectionDefinition("testDb", "Server=localhost;Database=test")
|
||||
{
|
||||
Id = 1,
|
||||
MaxRetries = 5,
|
||||
RetryDelay = TimeSpan.FromSeconds(12),
|
||||
};
|
||||
StubConnection(conn);
|
||||
|
||||
var (sf, connStr, keepAlive) = NewStoreAndForward();
|
||||
using var _ = keepAlive;
|
||||
|
||||
var gateway = new ExecuteStubGateway(
|
||||
_repository,
|
||||
sf,
|
||||
onExecute: () => throw new TransientDatabaseException(
|
||||
"Transaction was deadlocked", errorNumber: 1205));
|
||||
|
||||
var result = await gateway.CachedWriteAsync(
|
||||
"testDb", "UPDATE t SET v = 1", new Dictionary<string, object?> { ["x"] = 1 });
|
||||
|
||||
Assert.True(result.Success); // accepted for delivery
|
||||
Assert.True(result.WasBuffered); // handed to S&F, not synchronously failed
|
||||
Assert.Null(result.ErrorMessage);
|
||||
|
||||
Assert.Equal(1, ReadBufferDepth(connStr));
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public async Task CachedWrite_ImmediateSuccess_NotBuffered_ReturnsDelivered()
|
||||
{
|
||||
// A write that succeeds immediately is done — it must NOT be buffered,
|
||||
// and the result reports success (WasBuffered=false), mirroring the API
|
||||
// path's immediate-success behaviour.
|
||||
var conn = new DatabaseConnectionDefinition("testDb", "Server=localhost;Database=test") { Id = 1 };
|
||||
StubConnection(conn);
|
||||
|
||||
var (sf, connStr, keepAlive) = NewStoreAndForward();
|
||||
using var _ = keepAlive;
|
||||
|
||||
var gateway = new ExecuteStubGateway(_repository, sf, onExecute: () => { /* succeeds */ });
|
||||
|
||||
var result = await gateway.CachedWriteAsync("testDb", "INSERT INTO t VALUES (1)");
|
||||
|
||||
Assert.True(result.Success);
|
||||
Assert.False(result.WasBuffered);
|
||||
Assert.Null(result.ErrorMessage);
|
||||
|
||||
Assert.Equal(0, ReadBufferDepth(connStr));
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public async Task DeliverBuffered_TransientSqlError_RethrowsSoEngineRetries()
|
||||
{
|
||||
// On the retry path a transient failure must propagate so the S&F engine
|
||||
// schedules another retry — mirrors ExternalSystemClient.DeliverBuffered
|
||||
// letting TransientExternalSystemException escape.
|
||||
var conn = new DatabaseConnectionDefinition("testDb", "Server=localhost;Database=test") { Id = 1 };
|
||||
StubConnection(conn);
|
||||
|
||||
var gateway = new ExecuteStubGateway(
|
||||
_repository,
|
||||
storeAndForward: null,
|
||||
onExecute: () => throw new TransientDatabaseException("timeout", errorNumber: -2));
|
||||
|
||||
var message = new ZB.MOM.WW.ScadaBridge.StoreAndForward.StoreAndForwardMessage
|
||||
{
|
||||
Id = Guid.NewGuid().ToString("N"),
|
||||
Category = ZB.MOM.WW.ScadaBridge.Commons.Types.Enums.StoreAndForwardCategory.CachedDbWrite,
|
||||
Target = "testDb",
|
||||
PayloadJson =
|
||||
"""{"ConnectionName":"testDb","Sql":"INSERT INTO t VALUES (1)","Parameters":null}""",
|
||||
};
|
||||
|
||||
await Assert.ThrowsAsync<TransientDatabaseException>(
|
||||
() => gateway.DeliverBufferedAsync(message));
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public async Task DeliverBuffered_PermanentSqlError_ReturnsFalseSoMessageParks()
|
||||
{
|
||||
// On the retry path a permanent failure must park the message (return
|
||||
// false) rather than retry forever — mirrors ExternalSystemClient.
|
||||
// DeliverBuffered returning false on PermanentExternalSystemException.
|
||||
var conn = new DatabaseConnectionDefinition("testDb", "Server=localhost;Database=test") { Id = 1 };
|
||||
StubConnection(conn);
|
||||
|
||||
var gateway = new ExecuteStubGateway(
|
||||
_repository,
|
||||
storeAndForward: null,
|
||||
onExecute: () => throw new PermanentDatabaseException(
|
||||
"Invalid column name", errorNumber: 207));
|
||||
|
||||
var message = new ZB.MOM.WW.ScadaBridge.StoreAndForward.StoreAndForwardMessage
|
||||
{
|
||||
Id = Guid.NewGuid().ToString("N"),
|
||||
Category = ZB.MOM.WW.ScadaBridge.Commons.Types.Enums.StoreAndForwardCategory.CachedDbWrite,
|
||||
Target = "testDb",
|
||||
PayloadJson =
|
||||
"""{"ConnectionName":"testDb","Sql":"INSERT INTO t VALUES (1)","Parameters":null}""",
|
||||
};
|
||||
|
||||
var delivered = await gateway.DeliverBufferedAsync(message);
|
||||
|
||||
Assert.False(delivered); // permanent — the S&F engine parks the message
|
||||
}
|
||||
|
||||
    // ── M2.3 (#7) code-review fix: ExecuteWriteAsync must classify NON-SqlException
    // DB outages as transient (buffer+retry) and propagate cancellation —
    // mirroring the HTTP path's ordered catches in InvokeHttpAsync. The pre-fix
    // code only caught SqlException, so a live outage surfacing as
    // InvalidOperationException / SocketException / IOException / TimeoutException
    // escaped unclassified and crashed the Script Execution Actor instead of
    // buffering. These tests drive the RAW execution seam (RunSqlAsync) so the
    // PRODUCTION classification in ExecuteWriteAsync runs end-to-end. ──

    public static IEnumerable<object[]> TransientNonSqlOutages()
    {
        // A live DB outage that surfaces as a non-SqlException: connection-state,
        // socket, IO, and timeout failures are all retryable transport errors.
        yield return new object[] { new InvalidOperationException("The connection is not open.") };
        yield return new object[] { new System.Net.Sockets.SocketException(10061 /* connection refused */) };
        yield return new object[] { new System.IO.IOException("Unable to read data from the transport connection.") };
        yield return new object[] { new TimeoutException("The operation has timed out.") };
    }

    [Theory]
    [MemberData(nameof(TransientNonSqlOutages))]
    public async Task CachedWrite_NonSqlOutage_ClassifiedTransient_BuffersNotCrash(Exception outage)
    {
        // [1] A live outage that is NOT a SqlException must be classified TRANSIENT
        // (buffered for retry), NOT escape unclassified to crash the script actor,
        // and NOT be returned as a permanent Failed result.
        var conn = new DatabaseConnectionDefinition("testDb", "Server=localhost;Database=test")
        {
            Id = 1,
            MaxRetries = 5,
            RetryDelay = TimeSpan.FromSeconds(12),
        };
        StubConnection(conn);

        var (sf, connStr, keepAlive) = NewStoreAndForward();
        using var _ = keepAlive;

        // RawExecuteStubGateway routes the raw throw through the PRODUCTION
        // ExecuteWriteAsync classification (the seam under test), unlike
        // ExecuteStubGateway which throws an already-classified exception.
        var gateway = new RawExecuteStubGateway(_repository, sf, onRunSql: () => throw outage);

        var result = await gateway.CachedWriteAsync("testDb", "INSERT INTO t VALUES (1)");

        Assert.True(result.Success); // accepted for delivery, not a crash
        Assert.True(result.WasBuffered); // handed to S&F as transient
        Assert.Null(result.ErrorMessage); // not a permanent Failed result

        Assert.Equal(1, ReadBufferDepth(connStr));
    }

    [Fact]
    public async Task CachedWrite_CancellationRequested_PropagatesOperationCanceled_NotReclassified()
    {
        // [2] OperationCanceledException raised while the caller's token is
        // cancelled must propagate UNCHANGED — never reclassified as a transient
        // DB error and never buffered. Mirrors the HTTP path's first catch:
        // `catch (OperationCanceledException) when (cancellationToken.IsCancellationRequested) throw;`
        var conn = new DatabaseConnectionDefinition("testDb", "Server=localhost;Database=test") { Id = 1 };
        StubConnection(conn);

        var (sf, connStr, keepAlive) = NewStoreAndForward();
        using var _ = keepAlive;

        using var cts = new CancellationTokenSource();
        cts.Cancel();

        var gateway = new RawExecuteStubGateway(
            _repository, sf, onRunSql: () => throw new OperationCanceledException(cts.Token));

        await Assert.ThrowsAsync<OperationCanceledException>(
            () => gateway.CachedWriteAsync("testDb", "INSERT INTO t VALUES (1)", cancellationToken: cts.Token));

        // Cancellation is not a transient failure — nothing must have been buffered.
        Assert.Equal(0, ReadBufferDepth(connStr));
    }

    [Fact]
    public async Task CachedWrite_UnexpectedException_Propagates_NotClassifiedTransient()
    {
        // An exception type outside the transient transport set (e.g.
        // ArgumentException) is NOT a DB outage — it must propagate, exactly as
        // the HTTP path lets genuinely-unexpected exceptions escape past
        // `catch (Exception ex) when (ErrorClassifier.IsTransient(ex))`.
        var conn = new DatabaseConnectionDefinition("testDb", "Server=localhost;Database=test") { Id = 1 };
        StubConnection(conn);

        var (sf, connStr, keepAlive) = NewStoreAndForward();
        using var _ = keepAlive;

        var gateway = new RawExecuteStubGateway(
            _repository, sf, onRunSql: () => throw new ArgumentException("authoring bug"));

        await Assert.ThrowsAsync<ArgumentException>(
            () => gateway.CachedWriteAsync("testDb", "INSERT INTO t VALUES (1)"));

        Assert.Equal(0, ReadBufferDepth(connStr));
    }

    [Fact]
    public async Task DeliverBuffered_NonSqlOutage_RethrowsAsTransient_SoEngineRetries()
    {
        // [1] on the RETRY path: a non-SqlException outage during delivery must be
        // classified transient and propagate (as TransientDatabaseException) so
        // the S&F engine schedules another retry — it must NOT crash/park.
        var conn = new DatabaseConnectionDefinition("testDb", "Server=localhost;Database=test") { Id = 1 };
        StubConnection(conn);

        var gateway = new RawExecuteStubGateway(
            _repository,
            storeAndForward: null,
            onRunSql: () => throw new InvalidOperationException("The connection is not open."));

        var message = new ZB.MOM.WW.ScadaBridge.StoreAndForward.StoreAndForwardMessage
        {
            Id = Guid.NewGuid().ToString("N"),
            Category = ZB.MOM.WW.ScadaBridge.Commons.Types.Enums.StoreAndForwardCategory.CachedDbWrite,
            Target = "testDb",
            PayloadJson =
                """{"ConnectionName":"testDb","Sql":"INSERT INTO t VALUES (1)","Parameters":null}""",
        };

        await Assert.ThrowsAsync<TransientDatabaseException>(
            () => gateway.DeliverBufferedAsync(message));
    }

    /// <summary>
    /// Reads the current buffered-message count off the S&F SQLite DB by
    /// counting <c>sf_messages</c> rows (the engine's persistence table).
    /// </summary>
    private static int ReadBufferDepth(string connStr)
    {
        using var conn = new Microsoft.Data.Sqlite.SqliteConnection(connStr);
        conn.Open();
        using var cmd = conn.CreateCommand();
        cmd.CommandText = "SELECT COUNT(*) FROM sf_messages";
        return Convert.ToInt32(cmd.ExecuteScalar());
    }

    /// <summary>
    /// Test gateway that substitutes the SQL-execution seam so a test can drive
    /// success / transient / permanent outcomes without a real SQL Server (and
    /// without fabricating a <see cref="Microsoft.Data.SqlClient.SqlException"/>,
    /// which has no public constructor). Production classifies a real
    /// <c>SqlException</c> into <see cref="TransientDatabaseException"/> /
    /// <see cref="PermanentDatabaseException"/> at this same seam.
    /// </summary>
    private sealed class ExecuteStubGateway : DatabaseGateway
    {
        private readonly Action _onExecute;

        public ExecuteStubGateway(
            IExternalSystemRepository repository,
            ZB.MOM.WW.ScadaBridge.StoreAndForward.StoreAndForwardService? storeAndForward,
            Action onExecute)
            : base(repository, NullLogger<DatabaseGateway>.Instance, storeAndForward)
            => _onExecute = onExecute;

        internal override Task ExecuteWriteAsync(
            string connectionName,
            string connectionString,
            string sql,
            IReadOnlyDictionary<string, object?> parameters,
            CancellationToken cancellationToken)
        {
            _onExecute();
            return Task.CompletedTask;
        }
    }

    /// <summary>
    /// Test gateway that substitutes the INNER SQL-execution seam
    /// (<c>RunSqlAsync</c>) so a test can throw RAW exceptions (a real outage
    /// shape: <see cref="InvalidOperationException"/>, <see cref="System.Net.Sockets.SocketException"/>,
    /// etc.) and have them flow through the PRODUCTION
    /// <c>ExecuteWriteAsync</c> classification (the catch ordering under test) —
    /// unlike <see cref="ExecuteStubGateway"/>, which throws an
    /// already-classified <see cref="TransientDatabaseException"/> /
    /// <see cref="PermanentDatabaseException"/> and so bypasses the catches.
    /// </summary>
    private sealed class RawExecuteStubGateway : DatabaseGateway
    {
        private readonly Action _onRunSql;

        public RawExecuteStubGateway(
            IExternalSystemRepository repository,
            ZB.MOM.WW.ScadaBridge.StoreAndForward.StoreAndForwardService? storeAndForward,
            Action onRunSql)
            : base(repository, NullLogger<DatabaseGateway>.Instance, storeAndForward)
            => _onRunSql = onRunSql;

        internal override Task RunSqlAsync(
            string connectionString,
            string sql,
            IReadOnlyDictionary<string, object?> parameters,
            CancellationToken cancellationToken)
        {
            _onRunSql();
            return Task.CompletedTask;
        }
    }

    private static (int MaxRetries, long RetryIntervalMs, Guid? ExecutionId, string? SourceScript)
        ReadBufferedRetrySettings(string connStr)
    {

@@ -0,0 +1,105 @@
using System.Data.Common;

namespace ZB.MOM.WW.ScadaBridge.ExternalSystemGateway.Tests;

/// <summary>
/// M2.3 (#7): unit tests for the transient-vs-permanent SQL error-number
/// classifier that <c>DatabaseGateway</c> uses to decide whether a failed
/// cached write should be buffered (transient) or returned to the script
/// synchronously / parked (permanent).
/// </summary>
public class SqlErrorClassifierTests
{
    // The full transient set documented on SqlErrorClassifier — connection,
    // timeout, deadlock, and Azure throttle error numbers. A retry can plausibly
    // succeed for any of these, so they are buffered to store-and-forward.
    [Theory]
    [InlineData(-2)] // timeout expired
    [InlineData(-1)] // connection error
    [InlineData(2)] // network / instance not found
    [InlineData(53)] // network path not found
    [InlineData(64)] // connection terminated mid-session
    [InlineData(233)] // no process on the other end of the pipe
    [InlineData(1205)] // deadlock victim
    [InlineData(10053)] // transport-level abort
    [InlineData(10054)] // connection reset by peer
    [InlineData(10060)] // connection timed out
    [InlineData(40197)] // Azure SQL service error, retry
    [InlineData(40501)] // Azure SQL service busy
    [InlineData(40613)] // Azure SQL database unavailable
    [InlineData(49918)] // Azure SQL cannot process request (throttle)
    [InlineData(49919)] // Azure SQL too many create/update operations
    [InlineData(49920)] // Azure SQL too many operations (throttle)
    public void IsTransient_KnownTransientNumber_ReturnsTrue(int errorNumber)
    {
        Assert.True(SqlErrorClassifier.IsTransient(errorNumber));
    }

    // Constraint, syntax, and permission errors are permanent — retrying the
    // identical statement cannot succeed and may cause duplicate side effects.
    [Theory]
    [InlineData(547)] // constraint violation (FK/CHECK)
    [InlineData(2627)] // primary-key / unique constraint violation
    [InlineData(2601)] // duplicate key in a unique index
    [InlineData(102)] // incorrect syntax
    [InlineData(156)] // incorrect syntax near a keyword
    [InlineData(207)] // invalid column name
    [InlineData(208)] // invalid object name
    [InlineData(229)] // permission denied on object
    [InlineData(230)] // permission denied on column
    [InlineData(262)] // permission denied (CREATE etc.)
    public void IsTransient_KnownPermanentNumber_ReturnsFalse(int errorNumber)
    {
        Assert.False(SqlErrorClassifier.IsTransient(errorNumber));
    }

    [Theory]
    [InlineData(0)] // no error number captured
    [InlineData(99999)] // unknown / undocumented number
    [InlineData(12345)]
    [InlineData(int.MaxValue)]
    public void IsTransient_UnknownNumber_DefaultsToPermanent(int errorNumber)
    {
        // Fail-fast is the safer default: an unrecognised error number must NOT
        // be silently retried forever. Unknown => permanent => false.
        Assert.False(SqlErrorClassifier.IsTransient(errorNumber));
    }

    // ── M2.3 (#7) code-review fix: IsTransient(Exception) — a live DB outage does
    // not always surface as a SqlException. Transport/connection/timeout/driver
    // exception types are transient (buffer+retry), mirroring the HTTP path's
    // ErrorClassifier.IsTransient(Exception). ──

    public static IEnumerable<object[]> TransientExceptionTypes()
    {
        yield return new object[] { new InvalidOperationException("connection not open") };
        yield return new object[] { new System.IO.IOException("transport reset") };
        yield return new object[] { new System.Net.Sockets.SocketException(10060) };
        yield return new object[] { new TimeoutException("timed out") };
        yield return new object[] { new TaskCanceledException("driver-level cancellation") };
        // Any DbException that is NOT a SqlException is a driver/transport error.
        yield return new object[] { new NonSqlDbException("provider transport error") };
    }

    [Theory]
    [MemberData(nameof(TransientExceptionTypes))]
    public void IsTransient_Exception_TrueForTransportTypes(Exception ex)
    {
        Assert.True(SqlErrorClassifier.IsTransient(ex));
    }

    [Fact]
    public void IsTransient_Exception_FalseForUnexpectedType()
    {
        // Authoring bugs are NOT a DB outage — they must propagate, exactly as the
        // HTTP path lets genuinely-unexpected exceptions escape its IsTransient filter.
        Assert.False(SqlErrorClassifier.IsTransient(new ArgumentException("authoring bug")));
        Assert.False(SqlErrorClassifier.IsTransient(new NullReferenceException()));
    }

    /// <summary>A concrete <see cref="DbException"/> that is not a SqlException, for the classifier unit test.</summary>
    private sealed class NonSqlDbException : DbException
    {
        public NonSqlDbException(string message) : base(message) { }
    }
}
@@ -0,0 +1,48 @@
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.Hosting;
using Microsoft.Extensions.Logging;
using Microsoft.Extensions.Logging.Abstractions;

namespace ZB.MOM.WW.ScadaBridge.HealthMonitoring.Tests;

/// <summary>
/// M2.16 (#30) idempotency regression — code-review finding on commit d81f747.
/// <para>
/// <see cref="ServiceCollectionExtensions.AddSiteEventLogHealthMetricsBridge"/> uses a
/// factory-lambda overload of <c>AddHostedService</c>, which sets only
/// <c>ImplementationFactory</c> and leaves <c>ImplementationType</c> null. The original
/// <c>ImplementationType ==</c> guard was therefore a silent no-op: a second call would spin
/// up a second <see cref="SiteEventLogFailureCountReporter"/> (two timers both polling).
/// The fix uses a private marker singleton whose <c>ServiceType</c> is always set.
/// </para>
/// </summary>
public class AddSiteEventLogHealthMetricsBridgeTests
{
    [Fact]
    public void AddSiteEventLogHealthMetricsBridge_IsIdempotent_DoesNotDoubleRegister_HostedService()
    {
        // M2.16 (#30): calling the bridge method twice must register exactly one
        // SiteEventLogFailureCountReporter. Without the marker-type guard the
        // ImplementationType == check was a no-op for factory-lambda registrations,
        // so the second call would have added a second hosted service (two timers).
        var services = new ServiceCollection();
        services.AddSingleton<ILoggerFactory, NullLoggerFactory>();
        services.AddSingleton(typeof(ILogger<>), typeof(NullLogger<>));
        services.AddHealthMonitoring();

        Func<IServiceProvider, Func<long>> factory = _ => () => 0L;

        services.AddSiteEventLogHealthMetricsBridge(factory);
        services.AddSiteEventLogHealthMetricsBridge(factory);

        // Count IHostedService descriptors whose factory produces a
        // SiteEventLogFailureCountReporter. Because it is factory-registered,
        // ImplementationType is null — we count by resolving and checking type.
        using var provider = services.BuildServiceProvider();
        var reporters = provider.GetServices<IHostedService>()
            .OfType<SiteEventLogFailureCountReporter>()
            .ToList();

        Assert.Single(reporters);
    }
}
@@ -0,0 +1,77 @@
using Microsoft.Extensions.Logging.Abstractions;

namespace ZB.MOM.WW.ScadaBridge.HealthMonitoring.Tests;

/// <summary>
/// M2.16 (#30) — unit tests for <see cref="SiteEventLogFailureCountReporter"/>.
/// Verifies that the poller reads the count provided by the
/// <see cref="Func{TResult}"/> delegate and pushes it into
/// <see cref="ISiteHealthCollector.SetSiteEventLogWriteFailures"/>.
/// </summary>
public class SiteEventLogFailureCountReporterTests
{
    [Fact]
    public async Task StartAsync_ImmediatelyProbes_FailedWriteCount()
    {
        // Arrange
        var count = 99L;
        var collector = new SiteHealthCollector();
        using var reporter = new SiteEventLogFailureCountReporter(
            failedWriteCountProvider: () => count,
            collector: collector,
            logger: NullLogger<SiteEventLogFailureCountReporter>.Instance,
            refreshInterval: TimeSpan.FromHours(1)); // long interval — only immediate tick matters

        // Act
        await reporter.StartAsync(CancellationToken.None);

        // Give the background Task a moment to execute its synchronous immediate probe.
        var deadline = DateTime.UtcNow.AddSeconds(5);
        while (collector.CollectReport("probe").SiteEventLogWriteFailures == 0L
               && DateTime.UtcNow < deadline)
        {
            await Task.Delay(10);
        }

        // Assert — the immediate probe before the first Delay must have fired.
        var report = collector.CollectReport("site-1");
        Assert.Equal(99L, report.SiteEventLogWriteFailures);

        await reporter.StopAsync(CancellationToken.None);
    }

    [Fact]
    public async Task StartAsync_PushesLatestCount_OnEachTick()
    {
        // Arrange — start with count 5; advance to 12 after the first tick.
        var count = 5L;
        var collector = new SiteHealthCollector();
        using var reporter = new SiteEventLogFailureCountReporter(
            failedWriteCountProvider: () => count,
            collector: collector,
            logger: NullLogger<SiteEventLogFailureCountReporter>.Instance,
            refreshInterval: TimeSpan.FromMilliseconds(50));

        await reporter.StartAsync(CancellationToken.None);

        // Wait for immediate probe.
        var deadline = DateTime.UtcNow.AddSeconds(5);
        while (collector.CollectReport("probe").SiteEventLogWriteFailures != 5L
               && DateTime.UtcNow < deadline)
            await Task.Delay(10);

        Assert.Equal(5L, collector.CollectReport("site-1").SiteEventLogWriteFailures);

        // Advance the counter and wait for the next tick to push the new value.
        count = 12L;

        deadline = DateTime.UtcNow.AddSeconds(5);
        while (collector.CollectReport("probe").SiteEventLogWriteFailures != 12L
               && DateTime.UtcNow < deadline)
            await Task.Delay(10);

        Assert.Equal(12L, collector.CollectReport("site-1").SiteEventLogWriteFailures);

        await reporter.StopAsync(CancellationToken.None);
    }
}
@@ -0,0 +1,62 @@
namespace ZB.MOM.WW.ScadaBridge.HealthMonitoring.Tests;

/// <summary>
/// M2.16 (#30) regression coverage. <see cref="ISiteEventLogger.FailedWriteCount"/>
/// is a cumulative (point-in-time) counter. A periodic
/// <c>SiteEventLogFailureCountReporter</c> hosted service polls the count and
/// pushes it into the collector via
/// <see cref="ISiteHealthCollector.SetSiteEventLogWriteFailures"/> so the next
/// <see cref="ISiteHealthCollector.CollectReport"/> includes it in the report
/// payload as <c>SiteEventLogWriteFailures</c>. Unlike the per-interval
/// SiteAuditWriteFailures counter, this value is NOT reset on collect — it
/// carries forward whatever the most recent poller push delivered.
/// </summary>
public class SiteEventLogWriteFailuresMetricTests
{
    private readonly SiteHealthCollector _collector = new();

    [Fact]
    public void Set_Then_CollectReport_IncludesCount()
    {
        _collector.SetSiteEventLogWriteFailures(17L);

        var report = _collector.CollectReport("site-1");

        Assert.Equal(17L, report.SiteEventLogWriteFailures);
    }

    [Fact]
    public void Report_Payload_Includes_SiteEventLogWriteFailures_AsZeroByDefault()
    {
        var report = _collector.CollectReport("site-1");

        Assert.Equal(0L, report.SiteEventLogWriteFailures);
    }

    [Fact]
    public void CollectReport_DoesNotReset_SiteEventLogWriteFailures()
    {
        // This is a point-in-time cumulative count — successive CollectReport
        // calls before the next poller tick MUST carry forward the same value
        // rather than resetting to zero (which would falsely indicate no failures
        // between the two reports).
        _collector.SetSiteEventLogWriteFailures(42L);

        var first = _collector.CollectReport("site-1");
        var second = _collector.CollectReport("site-1");

        Assert.Equal(42L, first.SiteEventLogWriteFailures);
        Assert.Equal(42L, second.SiteEventLogWriteFailures);
    }

    [Fact]
    public void Set_Overwrites_Previous_Value()
    {
        _collector.SetSiteEventLogWriteFailures(5L);
        _collector.SetSiteEventLogWriteFailures(9L);

        var report = _collector.CollectReport("site-1");

        Assert.Equal(9L, report.SiteEventLogWriteFailures);
    }
}
@@ -11,6 +11,7 @@
  <ItemGroup>
    <PackageReference Include="coverlet.collector" />
    <PackageReference Include="Microsoft.Data.Sqlite" />
    <PackageReference Include="Microsoft.Extensions.DependencyInjection" />
    <PackageReference Include="Microsoft.Extensions.Logging.Abstractions" />
    <PackageReference Include="Microsoft.Extensions.Options" />
    <PackageReference Include="Microsoft.NET.Test.Sdk" />

@@ -35,6 +35,11 @@ public class CentralActorPathTests : IAsyncLifetime
        // env var is visible to StartupValidator.Validate() at Program.cs line 42.
        Environment.SetEnvironmentVariable("ScadaBridge__InboundApi__ApiKeyPepper",
            CentralDbTestEnvironment.TestPepper);
        // Supply MachineDataDb so the reverted Host-008 Require (REQ-HOST-3/4, M2.9 #17)
        // passes for Central-role StartupValidator. A non-empty placeholder satisfies
        // the preflight; the DI override below replaces the real DbContext anyway.
        Environment.SetEnvironmentVariable("ScadaBridge__Database__MachineDataDb",
            "Server=localhost;Database=MachineData;");

        _factory = new WebApplicationFactory<Program>()
            .WithWebHostBuilder(builder =>
@@ -94,6 +99,7 @@ public class CentralActorPathTests : IAsyncLifetime
        _factory?.Dispose();
        Environment.SetEnvironmentVariable("DOTNET_ENVIRONMENT", _previousEnv);
        Environment.SetEnvironmentVariable("ScadaBridge__InboundApi__ApiKeyPepper", null);
        Environment.SetEnvironmentVariable("ScadaBridge__Database__MachineDataDb", null);
        await Task.CompletedTask;
    }


@@ -101,6 +101,11 @@ public class CentralAuditWiringTests : IDisposable
        // runs before WithWebHostBuilder.ConfigureAppConfiguration applies DI config.
        Environment.SetEnvironmentVariable("ScadaBridge__InboundApi__ApiKeyPepper",
            CentralDbTestEnvironment.TestPepper);
        // Supply MachineDataDb so the reverted Host-008 Require (REQ-HOST-3/4, M2.9 #17)
        // passes for Central-role StartupValidator. A non-empty placeholder satisfies
        // the preflight; the DI override below replaces the real DbContext anyway.
        Environment.SetEnvironmentVariable("ScadaBridge__Database__MachineDataDb",
            "Server=localhost;Database=MachineData;");

        _factory = new WebApplicationFactory<Program>()
            .WithWebHostBuilder(builder =>
@@ -156,6 +161,7 @@ public class CentralAuditWiringTests : IDisposable
        _factory.Dispose();
        Environment.SetEnvironmentVariable("DOTNET_ENVIRONMENT", _previousEnv);
        Environment.SetEnvironmentVariable("ScadaBridge__InboundApi__ApiKeyPepper", null);
        Environment.SetEnvironmentVariable("ScadaBridge__Database__MachineDataDb", null);
    }

    [Fact]

@@ -10,8 +10,12 @@ namespace ZB.MOM.WW.ScadaBridge.Host.Tests;
///
/// Also supplies <c>ScadaBridge__InboundApi__ApiKeyPepper</c> so the Central-role
/// StartupValidator preflight (added in 1fcc4f5) does not fail for tests that set
/// <c>DOTNET_ENVIRONMENT=Central</c> without an explicit pepper env var. Both vars
/// are restored on Dispose so tests stay isolated.
/// <c>DOTNET_ENVIRONMENT=Central</c> without an explicit pepper env var.
///
/// Also supplies <c>ScadaBridge__Database__MachineDataDb</c> so the Central-role
/// StartupValidator preflight (reverts Host-008, REQ-HOST-3/4, M2.9 #17) does not
/// fail for tests that set <c>DOTNET_ENVIRONMENT=Central</c> without an explicit
/// MachineDataDb env var. All vars are restored on Dispose so tests stay isolated.
/// </summary>
internal sealed class CentralDbTestEnvironment : IDisposable
{
@@ -22,6 +26,11 @@ internal sealed class CentralDbTestEnvironment : IDisposable

    private const string ConfigKey = "ScadaBridge__Database__ConfigurationDb";

    private const string MachineDataDb =
        "Server=localhost,1433;Database=ScadaBridgeMachineData;User Id=scadabridge_app;Password=ScadaBridge_Dev1#;TrustServerCertificate=true";

    private const string MachineDataKey = "ScadaBridge__Database__MachineDataDb";

    // Test-only pepper — satisfies the ≥16-char StartupValidator requirement without
    // committing a real secret. The env-var name uses the double-underscore delimiter
    // so AddEnvironmentVariables() maps it to ScadaBridge:InboundApi:ApiKeyPepper.
@@ -29,6 +38,7 @@ internal sealed class CentralDbTestEnvironment : IDisposable
    private const string PepperKey = "ScadaBridge__InboundApi__ApiKeyPepper";

    private readonly string? _previousConfig;
    private readonly string? _previousMachineData;
    private readonly string? _previousPepper;

    public CentralDbTestEnvironment()
@@ -36,6 +46,9 @@ internal sealed class CentralDbTestEnvironment : IDisposable
        _previousConfig = Environment.GetEnvironmentVariable(ConfigKey);
        Environment.SetEnvironmentVariable(ConfigKey, ConfigurationDb);

        _previousMachineData = Environment.GetEnvironmentVariable(MachineDataKey);
        Environment.SetEnvironmentVariable(MachineDataKey, MachineDataDb);

        _previousPepper = Environment.GetEnvironmentVariable(PepperKey);
        Environment.SetEnvironmentVariable(PepperKey, TestPepper);
    }
@@ -43,6 +56,7 @@ internal sealed class CentralDbTestEnvironment : IDisposable
    public void Dispose()
    {
        Environment.SetEnvironmentVariable(ConfigKey, _previousConfig);
        Environment.SetEnvironmentVariable(MachineDataKey, _previousMachineData);
        Environment.SetEnvironmentVariable(PepperKey, _previousPepper);
    }
}

@@ -95,6 +95,11 @@ public class CentralCompositionRootTests : IDisposable
        // runs before WithWebHostBuilder.ConfigureAppConfiguration applies DI config.
        Environment.SetEnvironmentVariable("ScadaBridge__InboundApi__ApiKeyPepper",
            CentralDbTestEnvironment.TestPepper);
        // Supply MachineDataDb so the reverted Host-008 Require (REQ-HOST-3/4, M2.9 #17)
        // passes for Central-role StartupValidator. A non-empty placeholder satisfies
        // the preflight; the DI override below replaces the real DbContext anyway.
        Environment.SetEnvironmentVariable("ScadaBridge__Database__MachineDataDb",
            "Server=localhost;Database=MachineData;");

        _factory = new WebApplicationFactory<Program>()
            .WithWebHostBuilder(builder =>
@@ -159,6 +164,7 @@ public class CentralCompositionRootTests : IDisposable
        _factory.Dispose();
        Environment.SetEnvironmentVariable("DOTNET_ENVIRONMENT", _previousEnv);
        Environment.SetEnvironmentVariable("ScadaBridge__InboundApi__ApiKeyPepper", null);
        Environment.SetEnvironmentVariable("ScadaBridge__Database__MachineDataDb", null);
    }

    // --- Singletons ---
@@ -399,6 +405,9 @@ public class SiteCompositionRootTests : IDisposable
        new object[] { typeof(IEventLogQueryService) },
        new object[] { typeof(ISiteIdentityProvider) },
        new object[] { typeof(IHealthReportTransport) },
        // M2.15 (#29): the active-node purge gate must be registered on site nodes
        // so EventLogPurge only runs on the active node.
        new object[] { typeof(SiteEventLogActiveNodeCheck) },
    };

    // --- Scoped services ---

@@ -158,6 +158,15 @@ public class HealthCheckTests : IDisposable
        Assert.Contains(ZbHealthTags.Ready, registrations["database"].Tags);
        Assert.Contains(ZbHealthTags.Ready, registrations["akka-cluster"].Tags);

        // M2.14 (#28): readiness ALSO reflects "required cluster singletons running"
        // (REQ-HOST-4a). The Central-only required-singletons check is Ready-tagged so
        // it gates /health/ready alongside database + akka-cluster, but is leadership-
        // agnostic (it does NOT carry the Active tag), so a ready standby stays ready.
        Assert.True(registrations.ContainsKey("required-singletons"),
            "Expected a 'required-singletons' health check.");
        Assert.Contains(ZbHealthTags.Ready, registrations["required-singletons"].Tags);
        Assert.DoesNotContain(ZbHealthTags.Active, registrations["required-singletons"].Tags);

        // The leader-only active-node check must NOT be on the readiness tier.
        Assert.DoesNotContain(ZbHealthTags.Ready, registrations["active-node"].Tags);
    }

@@ -0,0 +1,143 @@
+using Akka.Actor;
+using Akka.TestKit.Xunit2;
+using Microsoft.Extensions.DependencyInjection;
+using Microsoft.Extensions.Diagnostics.HealthChecks;
+using Microsoft.Extensions.Logging.Abstractions;
+using ZB.MOM.WW.ScadaBridge.Host.Health;
+
+namespace ZB.MOM.WW.ScadaBridge.Host.Tests;
+
+/// <summary>
+/// M2.14 (#28): unit tests for <see cref="RequiredSingletonsHealthCheck"/>.
+///
+/// The check probes each required central singleton through its local
+/// <c>ClusterSingletonProxy</c> by Asking an <see cref="Identify"/> with a short
+/// bounded timeout and treating a non-null <see cref="ActorIdentity.Subject"/> as
+/// "reachable". These tests exercise that probe logic directly against a TestKit
+/// <see cref="ActorSystem"/>:
+/// <list type="bullet">
+/// <item>present + reachable proxy paths (live echo actors) → Healthy;</item>
+/// <item>a missing proxy path (ActorSelection resolves a null Subject) → Unhealthy
+/// naming the unreachable singleton.</item>
+/// </list>
+/// No WebApplicationFactory / DB / formed cluster is needed — the probe is just an
+/// in-process Identify round-trip, so the tests are deterministic and fast.
+/// </summary>
+public class RequiredSingletonsHealthCheckTests : TestKit
+{
+    /// <summary>A minimal live actor that does nothing — its mere existence makes
+    /// an <see cref="Identify"/> resolve a non-null Subject (i.e. "reachable").</summary>
+    /// <remarks>No <c>Receive<Identify></c> handler is needed: Akka's
+    /// <see cref="ActorBase"/> answers every <see cref="Identify"/> message with
+    /// an <see cref="ActorIdentity"/> automatically, so an empty actor at the proxy
+    /// path is sufficient to simulate a reachable singleton.</remarks>
+    private sealed class EchoActor : ReceiveActor
+    {
+    }
+
+    private IServiceProvider ProviderReturning(ActorSystem system)
+    {
+        var services = new ServiceCollection();
+        services.AddSingleton(system);
+        return services.BuildServiceProvider();
+    }
+
+    private static async Task<HealthCheckResult> RunAsync(RequiredSingletonsHealthCheck check)
+    {
+        var context = new HealthCheckContext
+        {
+            Registration = new HealthCheckRegistration(
+                "required-singletons", check, failureStatus: null, tags: null),
+        };
+        return await check.CheckHealthAsync(context, CancellationToken.None);
+    }
+
+    [Fact]
+    public async Task AllRequiredSingletonProxiesReachable_ReportsHealthy()
+    {
+        // Create a live actor at every required proxy path so each Identify resolves
+        // a non-null Subject.
+        foreach (var name in RequiredSingletonsHealthCheck.RequiredSingletonProxyNames)
+        {
+            Sys.ActorOf(Props.Create(() => new EchoActor()), name);
+        }
+
+        var check = new RequiredSingletonsHealthCheck(
+            ProviderReturning(Sys),
+            NullLogger<RequiredSingletonsHealthCheck>.Instance);
+
+        var result = await RunAsync(check);
+
+        Assert.Equal(HealthStatus.Healthy, result.Status);
+    }
+
+    [Fact]
+    public async Task OneRequiredSingletonUnreachable_ReportsUnhealthyNamingIt()
+    {
+        // Create all but one proxy. The missing one's ActorSelection resolves an
+        // ActorIdentity with a null Subject within the bounded timeout → unreachable.
+        var missing = RequiredSingletonsHealthCheck.RequiredSingletonProxyNames[0];
+        foreach (var name in RequiredSingletonsHealthCheck.RequiredSingletonProxyNames)
+        {
+            if (name == missing)
+                continue;
+            Sys.ActorOf(Props.Create(() => new EchoActor()), name);
+        }
+
+        var check = new RequiredSingletonsHealthCheck(
+            ProviderReturning(Sys),
+            NullLogger<RequiredSingletonsHealthCheck>.Instance);
+
+        var result = await RunAsync(check);
+
+        Assert.Equal(HealthStatus.Unhealthy, result.Status);
+        Assert.NotNull(result.Description);
+        Assert.Contains(missing, result.Description!);
+    }
+
+    [Fact]
+    public async Task ActorSystemNotYetAvailable_ReportsUnhealthy_DoesNotThrow()
+    {
+        // Startup race: ActorSystem not yet bridged into DI. The check must map this
+        // to Unhealthy (the node is not ready to serve) rather than throwing.
+        var emptyProvider = new ServiceCollection().BuildServiceProvider();
+
+        var check = new RequiredSingletonsHealthCheck(
+            emptyProvider,
+            NullLogger<RequiredSingletonsHealthCheck>.Instance);
+
+        var result = await RunAsync(check);
+
+        Assert.Equal(HealthStatus.Unhealthy, result.Status);
+    }
+
+    [Fact]
+    public async Task PreCancelledToken_ReportsUnhealthy_DoesNotThrow()
+    {
+        // Shutdown-race path: CheckHealthAsync is called with an already-cancelled
+        // token (e.g. host is tearing down). The check must never throw — any
+        // OperationCanceledException from Ask must be caught and mapped to Unhealthy.
+        foreach (var name in RequiredSingletonsHealthCheck.RequiredSingletonProxyNames)
+        {
+            Sys.ActorOf(Props.Create(() => new EchoActor()), name);
+        }
+
+        var check = new RequiredSingletonsHealthCheck(
+            ProviderReturning(Sys),
+            NullLogger<RequiredSingletonsHealthCheck>.Instance);
+
+        using var cts = new CancellationTokenSource();
+        cts.Cancel(); // already cancelled before the check runs
+
+        var context = new HealthCheckContext
+        {
+            Registration = new HealthCheckRegistration(
+                "required-singletons", check, failureStatus: null, tags: null),
+        };
+
+        // Must not throw; an already-cancelled token → all probes fail → Unhealthy.
+        var result = await check.CheckHealthAsync(context, cts.Token);
+
+        Assert.Equal(HealthStatus.Unhealthy, result.Status);
+    }
+}
@@ -20,6 +20,7 @@ public class StartupValidatorTests
             ["ScadaBridge:Node:NodeHostname"] = "central-node1",
             ["ScadaBridge:Node:RemotingPort"] = "8081",
             ["ScadaBridge:Database:ConfigurationDb"] = "Server=localhost;Database=Config;",
+            ["ScadaBridge:Database:MachineDataDb"] = "Server=localhost;Database=MachineData;",
             ["ScadaBridge:Security:Ldap:Server"] = "ldap.example.com",
             ["ScadaBridge:Security:JwtSigningKey"] = "test-signing-key-at-least-32-chars-long",
             ["ScadaBridge:Cluster:SeedNodes:0"] = "akka.tcp://scadabridge@central-node1:8081",
@@ -152,17 +153,19 @@ public class StartupValidatorTests
     }

     [Fact]
-    public void Central_MissingMachineDataDb_PassesValidation()
+    public void Central_MissingMachineDataDb_FailsValidation()
     {
-        // Host-008 regression: MachineDataDb is never consumed anywhere in the
-        // system (only ConfigurationDb is wired into AddConfigurationDatabase).
-        // It is no longer a required key, so its absence must not fail startup.
+        // Reverts Host-008. REQ-HOST-3/REQ-HOST-4 require MachineDataDb to be
+        // validated at startup for Central nodes, and the shipped docker appsettings
+        // (docker/central-node-a/appsettings.Central.json and central-node-b) carry
+        // the key. The prior Host-008 decision (which removed the Require) is reversed
+        // here (#17, M2.9): a missing MachineDataDb must fail fast with a clear error.
         var values = ValidCentralConfig();
         values.Remove("ScadaBridge:Database:MachineDataDb");
         var config = BuildConfig(values);

-        var ex = Record.Exception(() => StartupValidator.Validate(config));
-        Assert.Null(ex);
+        var ex = Assert.Throws<InvalidOperationException>(() => StartupValidator.Validate(config));
+        Assert.Contains("MachineDataDb connection string required for Central", ex.Message);
     }

     [Fact]
@@ -1,12 +1,30 @@
 using System.Text.Json;
+using ZB.MOM.WW.ScadaBridge.Commons.Types.InboundApi;

 namespace ZB.MOM.WW.ScadaBridge.InboundAPI.Tests;

 /// <summary>
-/// WP-2: Tests for parameter validation — type checking, required fields, extended type system.
+/// WP-2 / InboundAPI-M2.6: tests for parameter validation — type checking,
+/// required fields, the extended type system, and RECURSIVE (nested Object /
+/// List element) type validation with path-qualified errors.
+///
+/// <para>
+/// Definitions are expressed as JSON Schema (the canonical persisted format
+/// produced by the Central UI / migration). The validator also accepts the
+/// legacy flat-array form; that backward-compat path is covered by the final
+/// region.
+/// </para>
 /// </summary>
 public class ParameterValidatorTests
 {
+    private static JsonElement Body(string json)
+    {
+        using var doc = JsonDocument.Parse(json);
+        return doc.RootElement.Clone();
+    }
+
+    // ── No / empty definitions ────────────────────────────────────────────────
+
     [Fact]
     public void NoDefinitions_NoBody_ReturnsValid()
     {
@@ -16,21 +34,27 @@ public class ParameterValidatorTests
     }

     [Fact]
-    public void EmptyDefinitions_ReturnsValid()
+    public void EmptyObjectSchema_ReturnsValid()
     {
+        var result = ParameterValidator.Validate(null, """{"type":"object","properties":{}}""");
+        Assert.True(result.IsValid);
+    }
+
+    [Fact]
+    public void EmptyLegacyArray_ReturnsValid()
+    {
         var result = ParameterValidator.Validate(null, "[]");
         Assert.True(result.IsValid);
     }

+    // ── Required / body shape ──────────────────────────────────────────────────
+
     [Fact]
     public void RequiredParameterMissing_ReturnsInvalid()
     {
-        var definitions = JsonSerializer.Serialize(new[]
-        {
-            new { Name = "value", Type = "Integer", Required = true }
-        });
+        const string def = """{"type":"object","properties":{"value":{"type":"integer"}},"required":["value"]}""";

-        var result = ParameterValidator.Validate(null, definitions);
+        var result = ParameterValidator.Validate(null, def);
         Assert.False(result.IsValid);
         Assert.Contains("Missing required parameter", result.ErrorMessage);
     }
@@ -38,136 +62,379 @@ public class ParameterValidatorTests
     [Fact]
     public void BodyNotObject_ReturnsInvalid()
     {
-        var definitions = JsonSerializer.Serialize(new[]
-        {
-            new { Name = "value", Type = "String", Required = true }
-        });
+        const string def = """{"type":"object","properties":{"value":{"type":"string"}},"required":["value"]}""";

-        using var doc = JsonDocument.Parse("\"just a string\"");
-        var result = ParameterValidator.Validate(doc.RootElement.Clone(), definitions);
+        var result = ParameterValidator.Validate(Body("\"just a string\""), def);
         Assert.False(result.IsValid);
         Assert.Contains("must be a JSON object", result.ErrorMessage);
     }

-    [Theory]
-    [InlineData("Boolean", "true", true)]
-    [InlineData("Integer", "42", (long)42)]
-    [InlineData("Float", "3.14", 3.14)]
-    [InlineData("String", "\"hello\"", "hello")]
-    public void ValidTypeCoercion_Succeeds(string type, string jsonValue, object expected)
-    {
-        var definitions = JsonSerializer.Serialize(new[]
-        {
-            new { Name = "val", Type = type, Required = true }
-        });
-
-        using var doc = JsonDocument.Parse($"{{\"val\": {jsonValue}}}");
-        var result = ParameterValidator.Validate(doc.RootElement.Clone(), definitions);
-        Assert.True(result.IsValid);
-        Assert.Equal(expected, result.Parameters["val"]);
-    }
-
-    [Fact]
-    public void WrongType_ReturnsInvalid()
-    {
-        var definitions = JsonSerializer.Serialize(new[]
-        {
-            new { Name = "count", Type = "Integer", Required = true }
-        });
-
-        using var doc = JsonDocument.Parse("{\"count\": \"not a number\"}");
-        var result = ParameterValidator.Validate(doc.RootElement.Clone(), definitions);
-        Assert.False(result.IsValid);
-        Assert.Contains("must be an Integer", result.ErrorMessage);
-    }
-
-    [Fact]
-    public void ObjectType_Parsed()
-    {
-        var definitions = JsonSerializer.Serialize(new[]
-        {
-            new { Name = "data", Type = "Object", Required = true }
-        });
-
-        using var doc = JsonDocument.Parse("{\"data\": {\"key\": \"value\"}}");
-        var result = ParameterValidator.Validate(doc.RootElement.Clone(), definitions);
-        Assert.True(result.IsValid);
-        Assert.IsType<Dictionary<string, object?>>(result.Parameters["data"]);
-    }
-
-    [Fact]
-    public void ListType_Parsed()
-    {
-        var definitions = JsonSerializer.Serialize(new[]
-        {
-            new { Name = "items", Type = "List", Required = true }
-        });
-
-        using var doc = JsonDocument.Parse("{\"items\": [1, 2, 3]}");
-        var result = ParameterValidator.Validate(doc.RootElement.Clone(), definitions);
-        Assert.True(result.IsValid);
-        Assert.IsType<List<object?>>(result.Parameters["items"]);
-    }
-
     [Fact]
     public void OptionalParameter_MissingBody_ReturnsValid()
     {
-        var definitions = JsonSerializer.Serialize(new[]
-        {
-            new { Name = "optional", Type = "String", Required = false }
-        });
+        const string def = """{"type":"object","properties":{"optional":{"type":"string"}}}""";

-        var result = ParameterValidator.Validate(null, definitions);
+        var result = ParameterValidator.Validate(null, def);
         Assert.True(result.IsValid);
     }

+    // ── Scalar coercion ────────────────────────────────────────────────────────
+
+    [Theory]
+    [InlineData("boolean", "true", true)]
+    [InlineData("integer", "42", (long)42)]
+    [InlineData("number", "3.14", 3.14)]
+    [InlineData("string", "\"hello\"", "hello")]
+    public void ValidTypeCoercion_Succeeds(string type, string jsonValue, object expected)
+    {
+        var def = "{\"type\":\"object\",\"properties\":{\"val\":{\"type\":\"" + type + "\"}},\"required\":[\"val\"]}";
+
+        var result = ParameterValidator.Validate(Body($"{{\"val\": {jsonValue}}}"), def);
+        Assert.True(result.IsValid);
+        Assert.Equal(expected, result.Parameters["val"]);
+    }
+
+    [Fact]
+    public void WrongScalarType_ReturnsInvalid()
+    {
+        const string def = """{"type":"object","properties":{"count":{"type":"integer"}},"required":["count"]}""";
+
+        var result = ParameterValidator.Validate(Body("{\"count\": \"not a number\"}"), def);
+        Assert.False(result.IsValid);
+        Assert.Contains("'count'", result.ErrorMessage);
+        Assert.Contains("Integer", result.ErrorMessage);
+    }
+
     [Fact]
     public void UnknownType_ReturnsInvalid()
     {
-        var definitions = JsonSerializer.Serialize(new[]
-        {
-            new { Name = "val", Type = "CustomType", Required = true }
-        });
+        const string def = """{"type":"object","properties":{"val":{"type":"customtype"}},"required":["val"]}""";

-        using var doc = JsonDocument.Parse("{\"val\": \"test\"}");
-        var result = ParameterValidator.Validate(doc.RootElement.Clone(), definitions);
+        var result = ParameterValidator.Validate(Body("{\"val\": \"test\"}"), def);
         Assert.False(result.IsValid);
-        Assert.Contains("Unknown parameter type", result.ErrorMessage);
+        Assert.Contains("unknown declared type", result.ErrorMessage);
     }

-    // --- InboundAPI-010: unexpected top-level body fields must be reported so
-    // callers get feedback on typo'd parameter names instead of silent ignore. ---
+    // ── Object / List shape + materialization ──────────────────────────────────

     [Fact]
-    public void UnexpectedBodyField_ReturnsInvalid()
+    public void ObjectType_NoDeclaredFields_ShapeOnly_Materialized()
     {
-        var definitions = JsonSerializer.Serialize(new[]
-        {
-            new { Name = "value", Type = "Integer", Required = true }
-        });
+        const string def = """{"type":"object","properties":{"data":{"type":"object"}},"required":["data"]}""";

+        var result = ParameterValidator.Validate(Body("{\"data\": {\"key\": \"value\"}}"), def);
+        Assert.True(result.IsValid);
+        Assert.IsType<Dictionary<string, object?>>(result.Parameters["data"]);
+    }
+
+    [Fact]
+    public void ListType_NoDeclaredElement_ShapeOnly_Materialized()
+    {
+        const string def = """{"type":"object","properties":{"items":{"type":"array"}},"required":["items"]}""";
+
+        var result = ParameterValidator.Validate(Body("{\"items\": [1, 2, 3]}"), def);
+        Assert.True(result.IsValid);
+        Assert.IsType<List<object?>>(result.Parameters["items"]);
+    }
+
+    // ── Undeclared / unexpected fields (rejected, recursively) ─────────────────
+
+    [Fact]
+    public void UnexpectedTopLevelField_ReturnsInvalid()
+    {
+        const string def = """{"type":"object","properties":{"value":{"type":"integer"}},"required":["value"]}""";
+
         // "valeu" is a typo for "value"; the caller must be told, not ignored.
-        using var doc = JsonDocument.Parse("{\"value\": 1, \"valeu\": 2}");
-        var result = ParameterValidator.Validate(doc.RootElement.Clone(), definitions);
+        var result = ParameterValidator.Validate(Body("{\"value\": 1, \"valeu\": 2}"), def);

         Assert.False(result.IsValid);
         Assert.Contains("valeu", result.ErrorMessage);
+        Assert.Contains("not a declared field", result.ErrorMessage);
     }

     [Fact]
-    public void OnlyDefinedFields_StillValid()
+    public void OnlyDeclaredFields_StillValid()
     {
         // Regression guard: a body containing exactly the defined parameters
         // must continue to validate.
-        var definitions = JsonSerializer.Serialize(new[]
-        {
-            new { Name = "value", Type = "Integer", Required = true }
-        });
+        const string def = """{"type":"object","properties":{"value":{"type":"integer"}},"required":["value"]}""";

-        using var doc = JsonDocument.Parse("{\"value\": 1}");
-        var result = ParameterValidator.Validate(doc.RootElement.Clone(), definitions);
+        var result = ParameterValidator.Validate(Body("{\"value\": 1}"), def);

         Assert.True(result.IsValid);
         Assert.Equal((long)1, result.Parameters["value"]);
     }
+
+    [Fact]
+    public void UndeclaredNestedField_ReturnsInvalid_PathQualified()
+    {
+        const string def = """
+            {"type":"object","properties":{
+            "order":{"type":"object","properties":{"id":{"type":"integer"}},"required":["id"]}
+            },"required":["order"]}
+            """;
+
+        var result = ParameterValidator.Validate(
+            Body("""{"order":{"id":1,"bogus":2}}"""), def);
+
+        Assert.False(result.IsValid);
+        Assert.Contains("order.bogus", result.ErrorMessage);
+        Assert.Contains("not a declared field", result.ErrorMessage);
+    }
+
+    // ── Nested validation: the M2.6 core ───────────────────────────────────────
+
+    private const string NestedDef = """
+        {
+          "type":"object",
+          "properties":{
+            "order":{
+              "type":"object",
+              "properties":{
+                "id":{"type":"integer"},
+                "customer":{
+                  "type":"object",
+                  "properties":{"name":{"type":"string"},"vip":{"type":"boolean"}},
+                  "required":["name"]
+                },
+                "items":{
+                  "type":"array",
+                  "items":{
+                    "type":"object",
+                    "properties":{"sku":{"type":"string"},"quantity":{"type":"integer"}},
+                    "required":["sku","quantity"]
+                  }
+                }
+              },
+              "required":["id","customer","items"]
+            }
+          },
+          "required":["order"]
+        }
+        """;
+
+    [Fact]
+    public void ValidNestedPayload_Passes()
+    {
+        const string body = """
+            {"order":{
+              "id":7,
+              "customer":{"name":"Acme","vip":true},
+              "items":[
+                {"sku":"A1","quantity":3},
+                {"sku":"B2","quantity":1}
+              ]
+            }}
+            """;
+
+        var result = ParameterValidator.Validate(Body(body), NestedDef);
+        Assert.True(result.IsValid);
+    }
+
+    [Fact]
+    public void WrongScalarTwoLevelsDeep_ReturnsInvalid_WithExactPath()
+    {
+        // order.customer.vip declared boolean, given a string.
+        const string body = """
+            {"order":{
+              "id":7,
+              "customer":{"name":"Acme","vip":"yes"},
+              "items":[]
+            }}
+            """;
+
+        var result = ParameterValidator.Validate(Body(body), NestedDef);
+        Assert.False(result.IsValid);
+        Assert.Contains("'order.customer.vip'", result.ErrorMessage);
+        Assert.Contains("Boolean", result.ErrorMessage);
+    }
+
+    [Fact]
+    public void WrongScalarInsideListElement_ReturnsInvalid_WithElementIndexInPath()
+    {
+        // order.items[1].quantity declared integer, given a string.
+        const string body = """
+            {"order":{
+              "id":7,
+              "customer":{"name":"Acme"},
+              "items":[
+                {"sku":"A1","quantity":3},
+                {"sku":"B2","quantity":"lots"}
+              ]
+            }}
+            """;
+
+        var result = ParameterValidator.Validate(Body(body), NestedDef);
+        Assert.False(result.IsValid);
+        Assert.Contains("'order.items[1].quantity'", result.ErrorMessage);
+        Assert.Contains("Integer", result.ErrorMessage);
+    }
+
+    [Fact]
+    public void ListElementWrongShape_ReturnsInvalid_WithElementIndexInPath()
+    {
+        // order.items[0] declared object, given a scalar.
+        const string body = """
+            {"order":{
+              "id":7,
+              "customer":{"name":"Acme"},
+              "items":[ 42 ]
+            }}
+            """;
+
+        var result = ParameterValidator.Validate(Body(body), NestedDef);
+        Assert.False(result.IsValid);
+        Assert.Contains("'order.items[0]'", result.ErrorMessage);
+        Assert.Contains("Object", result.ErrorMessage);
+    }
+
+    [Fact]
+    public void MissingRequiredNestedField_ReturnsInvalid_PathQualified()
+    {
+        // order.customer.name is required but absent.
+        const string body = """
+            {"order":{
+              "id":7,
+              "customer":{"vip":false},
+              "items":[]
+            }}
+            """;
+
+        var result = ParameterValidator.Validate(Body(body), NestedDef);
+        Assert.False(result.IsValid);
+        Assert.Contains("missing required field", result.ErrorMessage);
+        Assert.Contains("'order.customer.name'", result.ErrorMessage);
+    }
+
+    // ── Empty / null edge cases ────────────────────────────────────────────────
+
+    [Fact]
+    public void EmptyList_AgainstTypedElement_Passes()
+    {
+        const string body = """
+            {"order":{"id":7,"customer":{"name":"Acme"},"items":[]}}
+            """;
+
+        var result = ParameterValidator.Validate(Body(body), NestedDef);
+        Assert.True(result.IsValid);
+    }
+
+    [Fact]
+    public void NullForOptionalNestedScalar_Passes()
+    {
+        // order.customer.vip is optional; explicit null is accepted.
+        const string body = """
+            {"order":{
+              "id":7,
+              "customer":{"name":"Acme","vip":null},
+              "items":[]
+            }}
+            """;
+
+        var result = ParameterValidator.Validate(Body(body), NestedDef);
+        Assert.True(result.IsValid);
+    }
+
+    [Fact]
+    public void NullForRequiredNestedScalar_Passes()
+    {
+        // A PRESENT-but-null required field satisfies the type — only ABSENCE
+        // of a required field is an error (consistent with return-side policy).
+        const string body = """
+            {"order":{
+              "id":null,
+              "customer":{"name":"Acme"},
+              "items":[]
+            }}
+            """;
+
+        var result = ParameterValidator.Validate(Body(body), NestedDef);
+        Assert.True(result.IsValid);
+    }
+
+    // ── Legacy flat-array backward-compat ──────────────────────────────────────
+
+    [Fact]
+    public void LegacyFlatArrayDefinition_StillAccepted()
+    {
+        const string def = """[{"name":"count","type":"Integer","required":true}]""";
+
+        var ok = ParameterValidator.Validate(Body("{\"count\":5}"), def);
+        Assert.True(ok.IsValid);
+        Assert.Equal((long)5, ok.Parameters["count"]);
+
+        var bad = ParameterValidator.Validate(Body("{\"count\":\"nope\"}"), def);
+        Assert.False(bad.IsValid);
+        Assert.Contains("'count'", bad.ErrorMessage);
+    }
+
+    // FIX 1: legacy "required":"false" string → field is optional ─────────────
+
+    [Theory]
+    [InlineData("""[{"name":"opt","type":"String","required":"false"}]""")]
+    [InlineData("""[{"name":"opt","type":"String","required":"False"}]""")]
+    [InlineData("""[{"name":"opt","type":"String","required":"FALSE"}]""")]
+    public void LegacyFlatArray_RequiredStringFalse_FieldIsOptional(string def)
+    {
+        // An absent field whose "required" is the string "false" (any case)
+        // must be treated as optional — consistent with the SQL migration's
+        // LOWER(...) <> 'false' comparison that produced these rows.
+        var result = ParameterValidator.Validate(null, def);
+        Assert.True(result.IsValid, $"Expected optional field to be valid when absent; error: {result.ErrorMessage}");
+    }
+
+    [Fact]
+    public void LegacyFlatArray_RequiredStringFalse_FieldPresentAndTypedCorrectly_Passes()
+    {
+        const string def = """[{"name":"opt","type":"String","required":"false"}]""";
+
+        var result = ParameterValidator.Validate(Body("{\"opt\":\"hello\"}"), def);
+        Assert.True(result.IsValid);
+    }
+
+    // FIX 2: recursion depth guard on Parse ───────────────────────────────────
+
+    /// <summary>
+    /// Builds a JSON Schema string with <paramref name="depth"/> levels of nested
+    /// object-in-properties nesting. Each level wraps the previous in an object
+    /// with a single property "a". The result exceeds the Parse ceiling when
+    /// depth > 32.
+    /// </summary>
+    private static string BuildDeeplyNestedSchema(int depth)
+    {
+        // Inner-most: a scalar
+        var schema = "{\"type\":\"string\"}";
+        for (var i = 0; i < depth; i++)
+        {
+            schema = "{\"type\":\"object\",\"properties\":{\"a\":" + schema + "}}";
+        }
+        return schema;
+    }
+
+    [Fact]
+    public void SchemaAtDepthCeiling_ParsesSuccessfully()
+    {
+        // Exactly 32 levels of nesting should parse without throwing.
+        var def = BuildDeeplyNestedSchema(32);
+        var schema = InboundApiSchema.Parse(def);
+        Assert.NotNull(schema);
+    }
+
+    [Fact]
+    public void SchemaExceedingDepthCeiling_ThrowsJsonException_NotStackOverflow()
+    {
+        // 33 levels exceeds the ceiling → JsonException (clean 400 via the
+        // caller's try/catch), NOT a StackOverflowException.
+        var def = BuildDeeplyNestedSchema(33);
+        Assert.Throws<System.Text.Json.JsonException>(() => InboundApiSchema.Parse(def));
+    }
+
+    [Fact]
+    public void SchemaExceedingDepthCeiling_ParameterValidator_ReturnsInvalid()
+    {
+        // End-to-end: ParameterValidator wraps Parse in try/catch(JsonException)
+        // → the caller gets Invalid rather than an unhandled exception.
+        var def = BuildDeeplyNestedSchema(33);
+        var result = ParameterValidator.Validate(Body("{\"a\":\"x\"}"), def);
+        Assert.False(result.IsValid);
+        Assert.Contains("Invalid parameter definitions", result.ErrorMessage);
+    }
 }
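
The tests above pin down path-qualified errors such as `'order.items[1].quantity'` and `order.bogus`. The following is a minimal sketch of how a recursive walk can build such paths. It is not the repository's `InboundApiSchema` or `ParameterValidator` code: `SchemaPathSketch`, `Check`, and `Join` are hypothetical names, the error wording only approximates the fragments the tests assert on, and the sketch assumes every schema node carries a `"type"` and that C# 11 raw string literals are available.

```csharp
#nullable enable
using System;
using System.Text.Json;

// Demo input (hypothetical, modelled on the nested "order" fixtures in the tests).
var schema = JsonDocument.Parse("""
    {"type":"object","properties":{
      "order":{"type":"object","properties":{
        "id":{"type":"integer"},
        "items":{"type":"array","items":{
          "type":"object","properties":{"quantity":{"type":"integer"}},"required":["quantity"]}}
      },"required":["id"]}
    },"required":["order"]}
    """).RootElement;
var body = JsonDocument.Parse(
    """{"order":{"id":7,"items":[{"quantity":3},{"quantity":"lots"}]}}""").RootElement;

Console.WriteLine(SchemaPathSketch.Check(schema, body));
// prints: 'order.items[1].quantity' must be an Integer

// Hypothetical helper: returns null when the value matches, else the first error found.
static class SchemaPathSketch
{
    public static string? Check(JsonElement schema, JsonElement value, string path = "")
    {
        // Present-but-null is accepted; only absence of a required field is an error.
        if (value.ValueKind == JsonValueKind.Null) return null;

        var type = schema.GetProperty("type").GetString(); // assumption: "type" is always present
        switch (type)
        {
            case "string":
                return value.ValueKind == JsonValueKind.String ? null : $"'{path}' must be a String";
            case "boolean":
                return value.ValueKind is JsonValueKind.True or JsonValueKind.False
                    ? null : $"'{path}' must be a Boolean";
            case "integer":
                return value.ValueKind == JsonValueKind.Number && value.TryGetInt64(out _)
                    ? null : $"'{path}' must be an Integer";
            case "number":
                return value.ValueKind == JsonValueKind.Number ? null : $"'{path}' must be a Float";
            case "array":
                if (value.ValueKind != JsonValueKind.Array) return $"'{path}' must be a List";
                if (!schema.TryGetProperty("items", out var itemSchema)) return null; // shape only
                var index = 0;
                foreach (var element in value.EnumerateArray())
                {
                    // The element index becomes part of the path: items[0], items[1], ...
                    var elementError = Check(itemSchema, element, $"{path}[{index}]");
                    if (elementError != null) return elementError;
                    index++;
                }
                return null;
            case "object":
                if (value.ValueKind != JsonValueKind.Object) return $"'{path}' must be an Object";
                if (!schema.TryGetProperty("properties", out var props)) return null; // shape only
                if (schema.TryGetProperty("required", out var required))
                {
                    foreach (var r in required.EnumerateArray())
                    {
                        var name = r.GetString()!;
                        if (!value.TryGetProperty(name, out _))
                            return $"missing required field '{Join(path, name)}'";
                    }
                }
                foreach (var field in value.EnumerateObject())
                {
                    var childPath = Join(path, field.Name);
                    if (!props.TryGetProperty(field.Name, out var fieldSchema))
                        return $"'{childPath}' is not a declared field";
                    var fieldError = Check(fieldSchema, field.Value, childPath);
                    if (fieldError != null) return fieldError;
                }
                return null;
            default:
                return $"'{path}' has unknown declared type '{type}'";
        }
    }

    // Dotted path for object members; the root contributes no prefix.
    private static string Join(string parent, string name) =>
        parent.Length == 0 ? name : parent + "." + name;
}
```

Threading the path through the recursion as a plain string keeps each error self-describing without a separate error-collection pass. The sketch has no depth ceiling; the tests above suggest the real parser enforces one at 32 levels.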
@@ -1,13 +1,21 @@
+using ZB.MOM.WW.ScadaBridge.Commons.Types.InboundApi;
+
 namespace ZB.MOM.WW.ScadaBridge.InboundAPI.Tests;

 /// <summary>
-/// InboundAPI-014: tests for return-value validation against a method's
-/// <c>ReturnDefinition</c>. Previously the script's return value was serialized
-/// verbatim with no checking against the declared return structure.
+/// InboundAPI-014 / InboundAPI-M2.6: tests for return-value validation against a
+/// method's <c>ReturnDefinition</c>. Mirrors <see cref="ParameterValidatorTests"/>
+/// (shared recursive engine) — RECURSIVE nested Object / List-element type
+/// validation with path-qualified errors.
+///
+/// <para>
+/// Definitions are expressed as JSON Schema (the canonical persisted format);
+/// the legacy flat-array form is still accepted (final region).
+/// </para>
 /// </summary>
 public class ReturnValueValidatorTests
 {
-    // --- No definition → no validation (backward compatible) ---
+    // ── No definition → no validation (backward compatible) ───────────────────

     [Theory]
     [InlineData(null)]
@@ -26,12 +34,17 @@ public class ReturnValueValidatorTests
         Assert.True(result.IsValid);
     }

-    // --- Happy path: result matches the declared field shape ---
+    // ── Happy path: result matches the declared object shape ──────────────────

     [Fact]
     public void ResultMatchingDefinition_IsValid()
     {
-        const string def = """[{"name":"siteName","type":"String"},{"name":"totalUnits","type":"Integer"}]""";
+        const string def = """
+            {"type":"object","properties":{
+            "siteName":{"type":"string"},
+            "totalUnits":{"type":"integer"}
+            },"required":["siteName","totalUnits"]}
+            """;
         const string json = """{"siteName":"Site Alpha","totalUnits":14250}""";

         var result = ReturnValueValidator.Validate(json, def);
@@ -40,22 +53,31 @@ public class ReturnValueValidatorTests
     }

     [Fact]
-    public void ResultWithListField_ShapeChecked_IsValid()
+    public void ResultWithListOfScalars_TypeChecked_IsValid()
     {
-        const string def = """[{"name":"lines","type":"List"}]""";
-        const string json = """{"lines":[{"lineName":"Line-1","units":8200}]}""";
+        const string def = """
+            {"type":"object","properties":{
+            "codes":{"type":"array","items":{"type":"integer"}}
+            }}
+            """;
+        const string json = """{"codes":[1,2,3]}""";

         var result = ReturnValueValidator.Validate(json, def);

         Assert.True(result.IsValid);
     }

-    // --- Mismatches must be reported ---
+    // ── Scalar / shape mismatches must be reported ────────────────────────────

     [Fact]
     public void ResultMissingDeclaredField_IsInvalid()
     {
-        const string def = """[{"name":"siteName","type":"String"},{"name":"totalUnits","type":"Integer"}]""";
+        const string def = """
+            {"type":"object","properties":{
+            "siteName":{"type":"string"},
+            "totalUnits":{"type":"integer"}
+            },"required":["siteName","totalUnits"]}
+            """;
         const string json = """{"siteName":"Site Alpha"}""";

         var result = ReturnValueValidator.Validate(json, def);
@@ -67,7 +89,7 @@ public class ReturnValueValidatorTests
     [Fact]
     public void ResultFieldWrongType_IsInvalid()
     {
-        const string def = """[{"name":"totalUnits","type":"Integer"}]""";
+        const string def = """{"type":"object","properties":{"totalUnits":{"type":"integer"}},"required":["totalUnits"]}""";
         const string json = """{"totalUnits":"not-a-number"}""";

         var result = ReturnValueValidator.Validate(json, def);
@@ -79,7 +101,7 @@ public class ReturnValueValidatorTests
     [Fact]
     public void NullResultWhenStructureRequired_IsInvalid()
     {
-        const string def = """[{"name":"siteName","type":"String"}]""";
+        const string def = """{"type":"object","properties":{"siteName":{"type":"string"}},"required":["siteName"]}""";

         var result = ReturnValueValidator.Validate(null, def);

@@ -89,7 +111,7 @@ public class ReturnValueValidatorTests
     [Fact]
     public void NonObjectResultWhenStructureRequired_IsInvalid()
     {
-        const string def = """[{"name":"siteName","type":"String"}]""";
+        const string def = """{"type":"object","properties":{"siteName":{"type":"string"}},"required":["siteName"]}""";

         var result = ReturnValueValidator.Validate("42", def);

@@ -99,7 +121,7 @@ public class ReturnValueValidatorTests
     [Fact]
     public void ListFieldGivenNonArray_IsInvalid()
     {
-        const string def = """[{"name":"lines","type":"List"}]""";
+        const string def = """{"type":"object","properties":{"lines":{"type":"array","items":{"type":"object"}}}}""";
         const string json = """{"lines":"not-a-list"}""";

         var result = ReturnValueValidator.Validate(json, def);
@@ -115,4 +137,261 @@ public class ReturnValueValidatorTests

         Assert.False(result.IsValid);
     }
+
+    // ── Nested validation: the M2.6 core (production-report shape) ─────────────
+
+    private const string ReportDef = """
+        {
+          "type":"object",
+          "properties":{
+            "siteName":{"type":"string"},
+            "totalUnits":{"type":"integer"},
+            "lines":{
+              "type":"array",
+              "items":{
+                "type":"object",
+                "properties":{
+                  "lineName":{"type":"string"},
+                  "units":{"type":"integer"},
+                  "efficiency":{"type":"number"}
+                },
+                "required":["lineName","units"]
+              }
+            }
+          },
+          "required":["siteName","totalUnits","lines"]
+        }
+        """;
+
+    [Fact]
+    public void ValidNestedReturn_Passes()
+    {
+        const string json = """
+            {
+              "siteName":"Site Alpha",
+              "totalUnits":14250,
+              "lines":[
+                {"lineName":"Line-1","units":8200,"efficiency":92.5},
+                {"lineName":"Line-2","units":6050,"efficiency":88.1}
+              ]
+            }
+            """;
+
+        var result = ReturnValueValidator.Validate(json, ReportDef);
+        Assert.True(result.IsValid);
+    }
+
+    [Fact]
+    public void WrongScalarInsideListElement_IsInvalid_WithElementIndexInPath()
+    {
+        // lines[1].units declared integer, given a string.
+        const string json = """
+            {
+              "siteName":"Site Alpha",
+              "totalUnits":14250,
+              "lines":[
+                {"lineName":"Line-1","units":8200},
+                {"lineName":"Line-2","units":"lots"}
+              ]
+            }
+            """;
+
+        var result = ReturnValueValidator.Validate(json, ReportDef);
+        Assert.False(result.IsValid);
+        Assert.Contains("'lines[1].units'", result.ErrorMessage);
+        Assert.Contains("Integer", result.ErrorMessage);
+    }
+
+    [Fact]
+    public void WrongListElementType_IsInvalid_WithElementIndexInPath()
+    {
+        // lines[0] declared object, given a scalar.
+        const string json = """
+            {
+              "siteName":"Site Alpha",
+              "totalUnits":14250,
+              "lines":[ 7 ]
+            }
+            """;
+
+        var result = ReturnValueValidator.Validate(json, ReportDef);
+        Assert.False(result.IsValid);
+        Assert.Contains("'lines[0]'", result.ErrorMessage);
+        Assert.Contains("Object", result.ErrorMessage);
+    }
+
+    [Fact]
+    public void MissingRequiredNestedField_IsInvalid_PathQualified()
+    {
+        // lines[0].units is required but absent.
+        const string json = """
+            {
+              "siteName":"Site Alpha",
+              "totalUnits":14250,
+              "lines":[ {"lineName":"Line-1"} ]
+            }
+            """;
+
+        var result = ReturnValueValidator.Validate(json, ReportDef);
+        Assert.False(result.IsValid);
+        Assert.Contains("missing required field", result.ErrorMessage);
+        Assert.Contains("'lines[0].units'", result.ErrorMessage);
+    }
+
+    [Fact]
+    public void UndeclaredNestedField_IsInvalid_PathQualified()
+    {
+        // lines[0].bogus is not declared on the line-item schema.
+        const string json = """
+            {
+              "siteName":"Site Alpha",
+              "totalUnits":14250,
+              "lines":[ {"lineName":"Line-1","units":1,"bogus":true} ]
+            }
+            """;
+
+        var result = ReturnValueValidator.Validate(json, ReportDef);
+        Assert.False(result.IsValid);
+        Assert.Contains("'lines[0].bogus'", result.ErrorMessage);
+        Assert.Contains("not a declared field", result.ErrorMessage);
+    }
+
+    // ── Empty / null edge cases ────────────────────────────────────────────────
+
+    [Fact]
+    public void EmptyListAgainstTypedElement_Passes()
+    {
+        const string json = """{"siteName":"S","totalUnits":0,"lines":[]}""";
+
+        var result = ReturnValueValidator.Validate(json, ReportDef);
+        Assert.True(result.IsValid);
+    }
+
+    [Fact]
+    public void EmptyObjectSchema_AnythingIsValid()
+    {
+        const string def = """{"type":"object","properties":{}}""";
+
+        var result = ReturnValueValidator.Validate("""{"whatever":1}""", def);
+        Assert.True(result.IsValid);
+    }
+
+    [Fact]
+    public void NullOptionalNestedScalar_Passes()
+    {
+        // lines[0].efficiency is optional; explicit null is accepted.
+        const string json = """
+            {
+              "siteName":"S",
+              "totalUnits":1,
+              "lines":[ {"lineName":"L1","units":1,"efficiency":null} ]
+            }
+            """;
+
+        var result = ReturnValueValidator.Validate(json, ReportDef);
+        Assert.True(result.IsValid);
+    }
+
+    // ── Legacy flat-array backward-compat ──────────────────────────────────────
+
+    [Fact]
+    public void LegacyFlatArrayDefinition_StillAccepted()
+    {
+        const string def = """[{"name":"siteName","type":"String"},{"name":"totalUnits","type":"Integer"}]""";
+
+        var ok = ReturnValueValidator.Validate("""{"siteName":"A","totalUnits":1}""", def);
+        Assert.True(ok.IsValid);
+
+        var bad = ReturnValueValidator.Validate("""{"siteName":"A","totalUnits":"x"}""", def);
+        Assert.False(bad.IsValid);
+        Assert.Contains("totalUnits", bad.ErrorMessage);
+    }
+
+    // FIX 3: scalar return schema validates scalar return values ──────────────
+    // (Guards the intentional ParameterValidator/ReturnValueValidator asymmetry:
+    // ReturnValueValidator must NOT short-circuit on non-object schema types.)
+
+    [Fact]
+    public void ScalarStringReturnSchema_ValidatesScalarStringReturn()
+    {
+        // A {"type":"string"} return schema must accept a bare JSON string.
+        var result = ReturnValueValidator.Validate("\"hello\"", """{"type":"string"}""");
+        Assert.True(result.IsValid);
+    }
+
+    [Fact]
+    public void ScalarIntegerReturnSchema_ValidatesScalarIntegerReturn()
+    {
+        var result = ReturnValueValidator.Validate("42", """{"type":"integer"}""");
+        Assert.True(result.IsValid);
+    }
+
+    [Fact]
+    public void ScalarStringReturnSchema_RejectsIntegerReturn()
+    {
+        var result = ReturnValueValidator.Validate("42", """{"type":"string"}""");
+        Assert.False(result.IsValid);
+        Assert.Contains("String", result.ErrorMessage);
+    }
+
+    [Fact]
+    public void ScalarBooleanReturnSchema_ValidatesBooleanReturn()
+    {
+        var result = ReturnValueValidator.Validate("true", """{"type":"boolean"}""");
+        Assert.True(result.IsValid);
+    }
+
+    // FIX 2: recursion depth guard on Validate ─────────────────────────────────
+
+    [Fact]
+    public void ValidateExceedingDepthCeiling_AddsDepthError_DoesNotThrow()
+    {
+        // Build a schema programmatically (bypassing Parse) with 34 levels of
+        // nesting to exceed the ceiling of 32. Validate must add an error and
+        // return, NOT stack overflow.
+        //
+        // Parse prevents creating a >32-level schema from stored JSON, but
+        // InboundApiSchema is a public type constructable in code, so Validate
+        // must guard independently.
+        var deepSchema = BuildProgrammaticSchema(34);
+
+        var json = BuildDeeplyNestedValue(34);
+        using var doc = System.Text.Json.JsonDocument.Parse(json);
+
+        var errors = new System.Collections.Generic.List<string>();
+        // Must not throw — adds a depth error to the list instead.
+        deepSchema.Validate(doc.RootElement, string.Empty, errors);
+
+        Assert.NotEmpty(errors);
+        Assert.Contains("nesting too deep", errors[0], StringComparison.OrdinalIgnoreCase);
+    }
+
+    /// <summary>
+    /// Constructs an <see cref="InboundApiSchema"/> with <paramref name="depth"/>
+    /// levels of object-nesting programmatically (bypassing <c>Parse</c>) to
+    /// exercise the Validate depth ceiling independently of the Parse ceiling.
+    /// </summary>
+    private static InboundApiSchema BuildProgrammaticSchema(int depth)
+    {
+        InboundApiSchema inner = new() { Type = "string" };
+        for (var i = 0; i < depth; i++)
+        {
+            inner = new InboundApiSchema
+            {
+                Type = "object",
+                Fields = [new InboundApiSchemaField("a", Required: false, inner)],
+            };
+        }
+        return inner;
+    }
+
+    private static string BuildDeeplyNestedValue(int depth)
+    {
+        var value = "\"leaf\"";
+        for (var i = 0; i < depth; i++)
+        {
+            value = "{\"a\":" + value + "}";
+        }
+        return value;
+    }
 }
Some files were not shown because too many files have changed in this diff.