# ZB.MOM.WW.HistorianGateway Implementation Plan
> **For Claude:** REQUIRED SUB-SKILL: Use superpowers-extended-cc:executing-plans to implement this plan task-by-task.
**Goal:** Build a single .NET 10 x64 sidecar that exposes (1) a read-only Galaxy object-hierarchy metadata gRPC server and (2) a full read/write gRPC API to the AVEVA Historian, with a Blazor dashboard, reusing the family's shared `ZB.MOM.WW.*` packages.
**Architecture:** One ASP.NET Core process hosting gRPC services + Blazor (no COM, no x86 worker). The historian write/read surface comes from the **vendored `histsdk` client** (`AVEVA.Historian.Client`). The Galaxy browse comes from a **new shared lib `ZB.MOM.WW.GalaxyRepository`** in scadaproj (extracted from mxaccessgw, wire-compatible `galaxy_repository.v1`). Connection model: stateless gateway over a **pooled, pre-authenticated service-identity connection**; clients authenticate to the gateway via peppered-HMAC API keys with per-service scopes.
**Tech Stack:** .NET 10, ASP.NET Core, Grpc.AspNetCore 2.76, Grpc.Net.Client 2.58 (vendored), Google.Protobuf, Microsoft.Data.SqlClient, Microsoft.Data.Sqlite, Blazor InteractiveServer, `ZB.MOM.WW.Theme` 0.3.1, `ZB.MOM.WW.Auth` 0.1.2, `ZB.MOM.WW.Telemetry`/`.Serilog` 0.1.0, `ZB.MOM.WW.Health` 0.1.0, `ZB.MOM.WW.Audit` 0.1.0, `ZB.MOM.WW.Configuration` 0.1.0, xUnit, bUnit.
**Reference sources (read these for exact patterns — do NOT re-discover):**
- Design doc: `docs/plans/2026-06-23-historian-gateway-design.md`
- mxaccessgw (the model): `~/Desktop/MxAccessGateway/src/ZB.MOM.WW.MxGateway.Server/`: `GatewayApplication.cs` (host wiring), `Security/Authorization/*` (gRPC API-key interceptor + scope resolver), `Galaxy/GalaxyRepository.cs` (the SQL to extract), `Galaxy/GalaxyRepositoryOptions.cs`, `Galaxy/GalaxyHierarchyCache.cs`, `Galaxy/GalaxyRepositoryServiceCollectionExtensions.cs`, `Contracts/Protos/galaxy_repository.proto`, `Dashboard/Components/*` (Blazor + Theme).
- histsdk clone (to vendor): `/tmp/histsdk-explore/src/AVEVA.Historian.Client/` + `/tmp/histsdk-explore/tests/AVEVA.Historian.Client.Tests/`.
- Shared package signatures: captured in the design session; key paths under `~/Desktop/scadaproj/ZB.MOM.WW.{Telemetry,Health,Configuration,Audit,Auth,Theme}/`.
**Conventions for every task:** TDD where a seam exists (write the failing test first). Exact file paths in the `Files:` block ARE the implementer's contract. Commit after each task. Tests must stay green on macOS with no live historian/SQL (live tests are env-gated and skip when env vars are absent).
---
## Phase 0 — Shared `ZB.MOM.WW.GalaxyRepository` lib (in scadaproj)
> Built in `~/Desktop/scadaproj/ZB.MOM.WW.GalaxyRepository/` as plain files (NOT a nested git repo — see memory `shared-libs-are-plain-files-not-nested-repos`). Wire-compatible: keep proto `package galaxy_repository.v1` and all field numbers identical to mxaccessgw's so OtOpcUa is unaffected; only the C# `csharp_namespace` becomes neutral. mxaccessgw adoption of this lib is a separate follow-on, NOT in this plan.
### Task 1: Scaffold the GalaxyRepository lib project
**Classification:** small
**Estimated implement time:** ~3 min
**Parallelizable with:** Task 7 (vendoring), Task 9 (Contracts scaffold)
**Files:**
- Create: `~/Desktop/scadaproj/ZB.MOM.WW.GalaxyRepository/ZB.MOM.WW.GalaxyRepository.slnx`
- Create: `~/Desktop/scadaproj/ZB.MOM.WW.GalaxyRepository/src/ZB.MOM.WW.GalaxyRepository/ZB.MOM.WW.GalaxyRepository.csproj`
- Create: `~/Desktop/scadaproj/ZB.MOM.WW.GalaxyRepository/tests/ZB.MOM.WW.GalaxyRepository.Tests/ZB.MOM.WW.GalaxyRepository.Tests.csproj`
**Steps:**
1. Create the `.csproj` (net10.0, `Nullable`/`ImplicitUsings` enabled, packable, `PackageId=ZB.MOM.WW.GalaxyRepository`, `Version=0.1.0`). PackageReferences: `Microsoft.Data.SqlClient` 6.0.2, `Grpc.AspNetCore` 2.76.0, `Google.Protobuf`, `Microsoft.Extensions.Hosting.Abstractions`, `Microsoft.Extensions.Options.ConfigurationExtensions`. Add `<Protobuf Include="Protos\*.proto" GrpcServices="Server" />`.
2. Create the test `.csproj` (net10.0, `IsPackable=false`, xUnit 2.9.3 + `Microsoft.NET.Test.Sdk` 17.14.1 + `Microsoft.Data.SqlClient`), ProjectReference to the lib.
3. Create the `.slnx` listing both projects.
4. Run: `dotnet build ~/Desktop/scadaproj/ZB.MOM.WW.GalaxyRepository/ZB.MOM.WW.GalaxyRepository.slnx` — Expected: builds (no sources yet, 0 warnings).
5. Commit: `git -C ~/Desktop/scadaproj add ZB.MOM.WW.GalaxyRepository && git -C ~/Desktop/scadaproj commit -m "feat(galaxyrepo): scaffold ZB.MOM.WW.GalaxyRepository shared lib"`
### Task 2: Port the canonical galaxy_repository.proto (neutral namespace)
**Classification:** standard
**Estimated implement time:** ~4 min
**Parallelizable with:** none (Task 3+ depend on generated types)
**Files:**
- Create: `~/Desktop/scadaproj/ZB.MOM.WW.GalaxyRepository/src/ZB.MOM.WW.GalaxyRepository/Protos/galaxy_repository.proto`
**Steps:**
1. Copy mxaccessgw's `Contracts/Protos/galaxy_repository.proto` verbatim, changing ONLY `option csharp_namespace` to `"ZB.MOM.WW.GalaxyRepository.Grpc"`. Keep `package galaxy_repository.v1`, all RPCs (`TestConnection`, `GetLastDeployTime`, `DiscoverHierarchy`, `WatchDeployEvents`, `BrowseChildren`), and every message/field number identical (wire compatibility).
2. Run: `dotnet build .../ZB.MOM.WW.GalaxyRepository.slnx` — Expected: PASS; generated `GalaxyRepository.GalaxyRepositoryBase`, `GalaxyObject`, `GalaxyAttribute`, etc. appear under namespace `ZB.MOM.WW.GalaxyRepository.Grpc`.
3. Commit: `feat(galaxyrepo): canonical galaxy_repository.v1 proto (neutral namespace)`
### Task 3: Port the SQL browse provider (`GalaxyRepository` + rows + options)
**Classification:** standard
**Estimated implement time:** ~5 min
**Parallelizable with:** none
**Files:**
- Create: `.../src/ZB.MOM.WW.GalaxyRepository/GalaxyRepositoryOptions.cs`
- Create: `.../src/ZB.MOM.WW.GalaxyRepository/GalaxyHierarchyRow.cs`
- Create: `.../src/ZB.MOM.WW.GalaxyRepository/GalaxyAttributeRow.cs`
- Create: `.../src/ZB.MOM.WW.GalaxyRepository/IGalaxyRepository.cs`
- Create: `.../src/ZB.MOM.WW.GalaxyRepository/GalaxyRepository.cs`
**Steps:**
1. Port `GalaxyRepositoryOptions` from mxaccessgw `Galaxy/GalaxyRepositoryOptions.cs` — rename section const to `ZB.MOM.WW.GalaxyRepository` (the consuming app picks its own section path at registration), drop MxGateway-specific defaults. Keep `ConnectionString`, `CommandTimeoutSeconds`, `DashboardRefreshIntervalSeconds`, `PersistSnapshot`, `SnapshotCachePath`.
2. Port `GalaxyHierarchyRow` / `GalaxyAttributeRow` DTOs and the `IGalaxyRepository` interface (`TestConnectionAsync`, `GetLastDeployTimeAsync`, `GetHierarchyAsync`, `GetAttributesAsync`).
3. Port `GalaxyRepository.cs` **verbatim** including the two SQL blocks (`HierarchySql`, `AttributesSql`) and the `SqlConnection`/`SqlDataReader` mapping loops — these are validated reverse-engineered queries; do NOT modify the SQL.
4. Run: `dotnet build` — Expected: PASS.
5. Commit: `feat(galaxyrepo): SQL browse provider (hierarchy + attributes)`
### Task 4: Port the in-memory hierarchy cache + snapshot + deploy notifier + refresh service
**Classification:** standard
**Estimated implement time:** ~5 min
**Parallelizable with:** none
**Files:**
- Create: `.../GalaxyHierarchyCacheEntry.cs`, `.../IGalaxyHierarchyCache.cs`, `.../GalaxyHierarchyCache.cs`
- Create: `.../IGalaxyDeployNotifier.cs`, `.../GalaxyDeployNotifier.cs`
- Create: `.../IGalaxyHierarchySnapshotStore.cs`, `.../GalaxyHierarchySnapshotStore.cs`
- Create: `.../GalaxyHierarchyRefreshService.cs` (`BackgroundService`)
- Create: `.../GalaxyHierarchyProjector.cs` (paging/filter projection used by the gRPC service)
**Steps:**
1. Port these from mxaccessgw's `Galaxy/` folder, adjusting namespaces to `ZB.MOM.WW.GalaxyRepository`. Keep the cache's first-load gate, refresh semaphore, snapshot restore, and deploy-poll refresh trigger.
2. Port `GalaxyHierarchyProjector` (the `Project(...)` + `ComputeFilterSignature(...)` used by `DiscoverHierarchy`/`BrowseChildren` paging).
3. Run: `dotnet build` — Expected: PASS.
4. Commit: `feat(galaxyrepo): hierarchy cache + snapshot + refresh service`
### Task 5: Port the reusable gRPC service + DI extension
**Classification:** standard
**Estimated implement time:** ~5 min
**Parallelizable with:** none
**Files:**
- Create: `.../Grpc/GalaxyRepositoryGrpcService.cs`
- Create: `.../DependencyInjection/GalaxyRepositoryServiceCollectionExtensions.cs`
**Steps:**
1. Port `GalaxyRepositoryGrpcService` from mxaccessgw's `Grpc/GalaxyRepositoryGrpcService.cs`, but REMOVE the mxaccessgw-specific `IGatewayRequestIdentityAccessor`/`ApiKeyConstraints` browse-subtree filtering (the gateway will apply its own auth at the interceptor layer). Keep `DiscoverHierarchy`, `BrowseChildren`, `TestConnection`, `GetLastDeployTime`, `WatchDeployEvents`. Base class: `ZB.MOM.WW.GalaxyRepository.Grpc.GalaxyRepository.GalaxyRepositoryBase`.
2. Write `AddZbGalaxyRepository(this IServiceCollection, IConfiguration, string sectionPath)` modeled on mxaccessgw's `AddGalaxyRepository` — bind options from `sectionPath`, register `GalaxyRepository`/`IGalaxyRepository`, notifier, snapshot store, cache, and the refresh `HostedService`. Add a companion `MapZbGalaxyRepository(this IEndpointRouteBuilder)` that `MapGrpcService<GalaxyRepositoryGrpcService>()`.
3. Run: `dotnet build` — Expected: PASS.
4. Commit: `feat(galaxyrepo): reusable gRPC service + AddZbGalaxyRepository DI`
### Task 6: Unit tests for the projector + DI smoke; pack
**Classification:** standard
**Estimated implement time:** ~5 min
**Parallelizable with:** none
**Files:**
- Create: `.../tests/ZB.MOM.WW.GalaxyRepository.Tests/GalaxyHierarchyProjectorTests.cs`
- Create: `.../tests/ZB.MOM.WW.GalaxyRepository.Tests/GalaxyHierarchyCacheTests.cs`
**Steps:**
1. **Write failing tests first:** projector paging (page_token round-trip, max_depth, `historized_only`/`alarm_bearing_only` filters, attribute include toggle) against a hand-built `GalaxyHierarchyCacheEntry` fixture; cache first-load gate + snapshot restore using a fake `IGalaxyRepository`. (SQL provider itself is exercised by env-gated integration later — no live DB in unit tests.)
2. Run: `dotnet test .../ZB.MOM.WW.GalaxyRepository.slnx` — Expected: FAIL (types/asserts).
3. Implement any small helper gaps surfaced; re-run — Expected: PASS.
4. Run: `dotnet pack .../src/ZB.MOM.WW.GalaxyRepository/ZB.MOM.WW.GalaxyRepository.csproj -c Release` — Expected: `ZB.MOM.WW.GalaxyRepository.0.1.0.nupkg` produced.
5. Commit: `test(galaxyrepo): projector + cache tests; pack 0.1.0`
---
## Phase 1 — Sidecar repo scaffold + vendor histsdk
### Task 7: Vendor the histsdk client + its golden tests
**Classification:** high-risk
**Estimated implement time:** ~5 min
**Parallelizable with:** Task 1
**Files:**
- Create: `~/Desktop/HistorianGateway/src/vendor/AVEVA.Historian.Client/**` (copied)
- Create: `~/Desktop/HistorianGateway/tests/AVEVA.Historian.Client.Tests/**` (copied)
- Create: `~/Desktop/HistorianGateway/src/vendor/AVEVA.Historian.Client/VENDORING.md`
**Steps:**
1. `mkdir -p ~/Desktop/HistorianGateway/src/vendor ~/Desktop/HistorianGateway/tests`. Copy `/tmp/histsdk-explore/src/AVEVA.Historian.Client/` and `/tmp/histsdk-explore/tests/AVEVA.Historian.Client.Tests/` into those locations.
2. In the vendored test `.csproj`, REMOVE the `ProjectReference` to `tools/AVEVA.Historian.ReverseEngineering` (not vendored) and delete any test classes that depend on that tooling namespace (the RE-sanitizer tests). KEEP the protocol/golden tests: `HistorianTagWriteProtocolTests`, `HistorianEventRowProtocolTests`, `GrpcEventSendProtocolTests`, `WcfDataQueryProtocolTests`, `StoreForwardOutboxTests`, `RedundancyTests`, version-gate tests. Fix the surviving test `.csproj` ProjectReference path to the new vendored client location.
3. Keep namespace `AVEVA.Historian.Client` as-is (eases re-sync). Write `VENDORING.md` recording: source repo `gitea.dohertylan.com/dohertj2/histsdk`, the commit/date of the snapshot, and "do not hand-edit; re-vendor from upstream."
4. Run: `dotnet build ~/Desktop/HistorianGateway/src/vendor/AVEVA.Historian.Client/AVEVA.Historian.Client.csproj` then `dotnet test ~/Desktop/HistorianGateway/tests/AVEVA.Historian.Client.Tests/` — Expected: build PASS; golden/offline tests PASS (live env-gated tests skip).
5. Commit (in the new repo, after Task 8 inits it — if running before Task 8, defer the commit): `chore(vendor): vendor histsdk AVEVA.Historian.Client + golden tests`
### Task 8: Initialize the sidecar repo + solution + Directory.Build.props
**Classification:** small
**Estimated implement time:** ~3 min
**Parallelizable with:** none (Task 7 output is added here)
**Files:**
- Create: `~/Desktop/HistorianGateway/.gitignore`
- Create: `~/Desktop/HistorianGateway/Directory.Build.props`
- Create: `~/Desktop/HistorianGateway/ZB.MOM.WW.HistorianGateway.slnx`
**Steps:**
1. `git -C ~/Desktop/HistorianGateway init` (this IS its own app repo — unlike shared libs). Add a .NET `.gitignore`.
2. `Directory.Build.props`: `net10.0`, `Nullable`/`ImplicitUsings` enable, `<Platforms>x64</Platforms>`, `<PlatformTarget>x64</PlatformTarget>`, common `LangVersion`.
3. Create `.slnx` referencing: `src/vendor/AVEVA.Historian.Client`, `tests/AVEVA.Historian.Client.Tests` (and the projects added in later phases — add them as created).
4. Run: `dotnet build ~/Desktop/HistorianGateway/ZB.MOM.WW.HistorianGateway.slnx` — Expected: PASS.
5. Commit: `chore: init repo + solution + Directory.Build.props` (then re-commit Task 7's vendored tree if it was deferred).
---
## Phase 2 — Host + configuration + shared-package wiring
### Task 9: Create the Contracts project + historian_gateway.proto skeleton
**Classification:** standard
**Estimated implement time:** ~5 min
**Parallelizable with:** Task 1
**Files:**
- Create: `~/Desktop/HistorianGateway/src/ZB.MOM.WW.HistorianGateway.Contracts/ZB.MOM.WW.HistorianGateway.Contracts.csproj`
- Create: `.../Contracts/Protos/historian_gateway.proto`
**Steps:**
1. `.csproj` net10.0, `Grpc.AspNetCore` 2.76.0, `<Protobuf Include="Protos\*.proto" GrpcServices="Both" />`.
2. Author `historian_gateway.proto` (`package historian_gateway.v1; option csharp_namespace = "ZB.MOM.WW.HistorianGateway.Contracts.Grpc";`) with the **service stubs and message shells** for the 4 historian services: `HistorianRead` (ReadRaw/ReadAggregate/ReadBlocks/ReadEvents server-streaming, ReadAtTime unary), `HistorianWrite` (AddHistoricalValues, SendEvent, WriteLiveValues), `HistorianTags` (BrowseTagNames streaming, GetTagMetadata, EnsureTags, DeleteTags, RenameTags, AddTagExtendedProperties), `HistorianStatus` (Probe, GetConnectionStatus, GetStoreForwardStatus, GetSystemParameter). Map the message fields to the vendored `HistorianSample`/`HistorianAggregateSample`/`HistorianEvent`/`HistorianTagMetadata`/`HistorianHistoricalValue` shapes (timestamps as `google.protobuf.Timestamp`, `RetrievalMode` as an enum mirroring the SDK's 15 modes).
3. Run: `dotnet build` — Expected: PASS; gateway gRPC base classes generated. Add project to `.slnx`.
4. Commit: `feat(contracts): historian_gateway.v1 proto + generated types`
### Task 10: Create the Server project + minimal boot
**Classification:** standard
**Estimated implement time:** ~5 min
**Parallelizable with:** none
**Files:**
- Create: `.../src/ZB.MOM.WW.HistorianGateway.Server/ZB.MOM.WW.HistorianGateway.Server.csproj`
- Create: `.../Server/Program.cs`
- Create: `.../Server/appsettings.json`, `.../Server/appsettings.Development.json`
**Steps:**
1. `.csproj` (Sdk `Microsoft.NET.Sdk.Web`): PackageReferences exactly mirroring mxaccessgw's Server csproj versions — `Grpc.AspNetCore` 2.76.0, `ZB.MOM.WW.Auth.{Abstractions,Ldap,ApiKeys,AspNetCore}` 0.1.2, `ZB.MOM.WW.Audit` 0.1.0, `ZB.MOM.WW.Theme` 0.3.1, `ZB.MOM.WW.Configuration` 0.1.0, `ZB.MOM.WW.Health` 0.1.0, `ZB.MOM.WW.Telemetry`+`.Serilog` 0.1.0, `Serilog.AspNetCore`/`.Sinks.Console`/`.Sinks.File`, `Microsoft.Data.Sqlite` 10.0.7, `Microsoft.Data.SqlClient` 6.0.2, `Polly.Core` 8.6.6. ProjectReferences: Contracts + vendored `AVEVA.Historian.Client` + `ZB.MOM.WW.GalaxyRepository` (project ref to the scadaproj lib, or pkg ref to its 0.1.0 nupkg).
2. `Program.cs`: minimal `WebApplication` that calls `AddZbSerilog`/`AddZbTelemetry` (ServiceName `historian-gateway`), `builder.Services.AddGrpc()`, maps `/healthz` + `/metrics` via `MapZbHealth`/`MapZbMetrics`, boots. (Subsystems wired in later tasks.)
3. Run: `dotnet build` then `dotnet run --project .../Server` and `curl -s localhost:<port>/healthz` — Expected: 200; `curl /metrics` returns Prometheus text. Add project to `.slnx`.
4. Commit: `feat(server): host scaffold + telemetry/serilog/health boot`
### Task 11: Configuration options + validators + ConfigPreflight
**Classification:** standard
**Estimated implement time:** ~5 min
**Parallelizable with:** none
**Files:**
- Create: `.../Server/Configuration/HistorianOptions.cs` + `HistorianOptionsValidator.cs`
- Create: `.../Server/Configuration/GalaxyOptions.cs` (thin wrapper / reuse `GalaxyRepositoryOptions`)
- Create: `.../Server/Configuration/RuntimeDbOptions.cs` + validator (SQL live-write)
- Create: `.../Server/Configuration/RedundancyOptions.cs` + validator
- Create: `.../Server/Configuration/StoreForwardOptions.cs` + validator
- Modify: `.../Server/Program.cs` (register `AddValidatedOptions<,>` + run `ConfigPreflight`)
- Test: `.../tests/ZB.MOM.WW.HistorianGateway.Tests/Configuration/ValidatorTests.cs`
**Steps:**
1. **Write failing validator tests first** using `OptionsValidatorBase`/`ValidationBuilder` semantics (e.g., missing `Historian:Host` → failure; bad port → failure; `Transport` one-of; redundancy `MinCount(members,1)` when enabled). Run — Expected: FAIL.
2. Implement options records + validators (subclass `OptionsValidatorBase<T>`, use `ValidationBuilder.Required/Port/HostPort/OneOf/PositiveTimeSpan/MinCount`). Map `HistorianOptions` → vendored `HistorianClientOptions` (Host, Port default 32565, `Transport=RemoteGrpc`, `GrpcUseTls`, credentials, `AllowUntrustedServerCertificate`).
3. In `Program.cs`, `AddValidatedOptions<,>` each, and run a `ConfigPreflight` (RequireValue host, RequirePort) before host build.
4. Run: `dotnet test` — Expected: PASS.
5. Commit: `feat(server): validated options + ConfigPreflight`
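The validator rules in step 1 (missing host, bad port, `Transport` one-of) can be sketched without the shared package. This is a minimal stand-in, not the `OptionsValidatorBase<T>`/`ValidationBuilder` API, whose signatures are not shown in this plan; `HistorianOptionsSketch`, `HistorianOptionsRules`, and the allowed-transport list are hypothetical names and values chosen for illustration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical options shape. Host/Port/Transport follow the plan's config keys;
// the record itself is an assumption, not the real HistorianOptions.
public sealed record HistorianOptionsSketch(string? Host, int Port = 32565, string Transport = "RemoteGrpc");

public static class HistorianOptionsRules
{
    // Assumed allow-list: only RemoteGrpc is named in the plan.
    private static readonly string[] AllowedTransports = { "RemoteGrpc" };

    // Collects every failure instead of stopping at the first one, so a
    // preflight run can report all configuration problems at once.
    public static IReadOnlyList<string> Validate(HistorianOptionsSketch options)
    {
        var failures = new List<string>();

        if (string.IsNullOrWhiteSpace(options.Host))
            failures.Add("Historian:Host is required.");

        if (options.Port is < 1 or > 65535)
            failures.Add("Historian:Port must be between 1 and 65535.");

        if (Array.IndexOf(AllowedTransports, options.Transport) < 0)
            failures.Add("Historian:Transport must be one of: " + string.Join(", ", AllowedTransports) + ".");

        return failures;
    }
}
```

A failing-first test would call `HistorianOptionsRules.Validate(new HistorianOptionsSketch(null))` and assert on the returned messages; the real validators would express the same rules through the shared `ValidationBuilder` helpers.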
---
## Phase 3 — Connection layer (vendored client → gateway)
### Task 12: `IHistorianClient` seam over the vendored client
**Classification:** standard
**Estimated implement time:** ~5 min
**Parallelizable with:** none
**Files:**
- Create: `.../Server/Historian/IHistorianClient.cs` (interface mirroring the read/write methods the services need)
- Create: `.../Server/Historian/VendoredHistorianClient.cs` (adapts `AVEVA.Historian.Client.HistorianClient`)
- Test: `.../tests/.../Historian/HistorianClientSeamTests.cs`
**Steps:**
1. **Write failing test** that a `FakeHistorianClient : IHistorianClient` can be substituted and returns canned samples (this seam is what makes the gRPC services unit-testable without a live historian). Run — Expected: FAIL.
2. Define `IHistorianClient` with the methods the services call (ReadRaw/ReadAggregate/ReadAtTime/ReadBlocks/ReadEvents/BrowseTagNames/GetTagMetadata/Probe/GetConnectionStatus/GetStoreForwardStatus/GetSystemParameter/AddHistoricalValues/SendEvent/EnsureTag/DeleteTag/RenameTags/AddTagExtendedProperties). Implement `VendoredHistorianClient` delegating to the real `HistorianClient`.
3. Run: `dotnet test` — Expected: PASS.
4. Commit: `feat(historian): IHistorianClient seam + vendored adapter`
### Task 13: Connection pool (pre-authenticated, reused, health-checked)
**Classification:** high-risk
**Estimated implement time:** ~5 min
**Parallelizable with:** none
**Files:**
- Create: `.../Server/Historian/HistorianConnectionPool.cs` (+ `IHistorianConnectionPool`)
- Modify: `.../Server/Program.cs` (DI singleton)
- Test: `.../tests/.../Historian/HistorianConnectionPoolTests.cs`
**Steps:**
1. **Write failing test** asserting the pool opens/authenticates a connection once and reuses it across N borrow calls (count handshakes via a fake transport/lease factory), and that a faulted connection is evicted + re-created. Run — Expected: FAIL.
2. Implement a lease-based pool keyed by target; lazy-open with the auth handshake once; reuse; `SemaphoreSlim`-guarded reconnect on fault; expose `Lease()` returning a pooled `IHistorianClient`. (The vendored client is `IAsyncDisposable`; the pool owns lifecycle.)
3. Run: `dotnet test` — Expected: PASS.
4. Commit: `feat(historian): pooled pre-authenticated connection pool`
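The open-once, reuse, evict-on-fault behaviour described in steps 1 and 2 might look roughly like the sketch below. It assumes a single target and a caller-supplied factory that performs the open plus auth handshake; `SingleConnectionPool<TClient>`, `LeaseAsync`, and `Evict` are hypothetical names, and disposal of the evicted client is deliberately left out.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical single-target pool: opens (and authenticates) once, hands the
// same instance to every caller, and re-creates it after a reported fault.
public sealed class SingleConnectionPool<TClient> where TClient : class
{
    private readonly Func<CancellationToken, Task<TClient>> _openAndAuthenticate;
    private readonly SemaphoreSlim _gate = new(1, 1);
    private TClient? _current;

    public SingleConnectionPool(Func<CancellationToken, Task<TClient>> openAndAuthenticate)
        => _openAndAuthenticate = openAndAuthenticate;

    public async Task<TClient> LeaseAsync(CancellationToken ct = default)
    {
        // Fast path: a live connection already exists.
        var existing = Volatile.Read(ref _current);
        if (existing is not null) return existing;

        await _gate.WaitAsync(ct);
        try
        {
            // Re-check under the gate so concurrent callers share one handshake.
            _current ??= await _openAndAuthenticate(ct);
            return _current;
        }
        finally
        {
            _gate.Release();
        }
    }

    // Called by a consumer that observed a fault. Only clears the slot if the
    // faulted instance is still the pooled one, so a fresh connection opened
    // by another caller is not thrown away.
    public void Evict(TClient faulted)
        => Interlocked.CompareExchange(ref _current, null, faulted);
}
```

The handshake-counting test from step 1 maps directly onto this shape: pass a factory that increments a counter, lease N times, and assert the counter is 1; then call `Evict` and assert the next lease raises it to 2.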
### Task 14: Store-forward + redundancy + SQL live-write wiring
**Classification:** high-risk
**Estimated implement time:** ~5 min
**Parallelizable with:** none
**Files:**
- Create: `.../Server/Historian/HistorianWriteCoordinator.cs` (routes writes → pool, store-forward, or redundancy per config)
- Create: `.../Server/Historian/SqlLiveValueWriter.cs` (`WriteLiveValues` via `aaAnalogTagInsert` + `INSERT INTO History`)
- Modify: `.../Server/Program.cs`
- Test: `.../tests/.../Historian/HistorianWriteCoordinatorTests.cs`, `.../SqlLiveValueWriterTests.cs`
**Steps:**
1. **Write failing tests:** (a) when store-forward enabled + historian unreachable, the coordinator enqueues (uses vendored `HistorianStoreForwardWriter` over a fake sink) and reports `Queued`; (b) when redundancy configured, it fans out via `HistorianRedundantClient` and returns per-member results under All/Any; (c) `SqlLiveValueWriter` builds the correct parameterized command sequence (assert against a fake `IDbCommand` recorder — no live SQL). Run — Expected: FAIL.
2. Implement the coordinator (compose vendored `HistorianStoreForwardWriter` + `HistorianRedundantClient` from config) and `SqlLiveValueWriter` (omit the server-managed `Quality` column; honor the storage-activation note from the SQL reference memory).
3. Run: `dotnet test` — Expected: PASS.
4. Commit: `feat(historian): write coordinator (store-forward + redundancy) + SQL live-write`
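The coordinator's routing decision can be isolated as a pure function, which keeps test cases (a) and (b) cheap to write. The sketch below is an assumption about precedence (redundancy first, then direct, then store-forward); the plan does not state the order, and `WriteRoute` and `WriteRouting` are hypothetical names.

```csharp
// Hypothetical routing outcomes for a single write request.
public enum WriteRoute { Direct, RedundantFanOut, StoreForwardQueue, Reject }

public static class WriteRouting
{
    // Assumed precedence: a configured redundancy group always fans out;
    // otherwise write directly when the historian is reachable; otherwise
    // queue if store-forward is enabled, and reject if it is not.
    public static WriteRoute Choose(bool redundancyConfigured, bool storeForwardEnabled, bool historianReachable)
    {
        if (redundancyConfigured) return WriteRoute.RedundantFanOut;
        if (historianReachable) return WriteRoute.Direct;
        return storeForwardEnabled ? WriteRoute.StoreForwardQueue : WriteRoute.Reject;
    }
}
```

With the decision separated like this, the coordinator itself only has to dispatch on the returned route to the pool, the vendored `HistorianStoreForwardWriter`, or `HistorianRedundantClient`.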
---
## Phase 4 — gRPC services + auth interceptor
### Task 15: `HistorianRead` service (representative TDD task; sets the pattern)
**Classification:** standard
**Estimated implement time:** ~5 min
**Parallelizable with:** Task 17 after the mapper exists
**Files:**
- Create: `.../Server/Grpc/HistorianReadService.cs`
- Create: `.../Server/Grpc/HistorianProtoMapper.cs` (SDK model ↔ proto)
- Modify: `.../Server/Program.cs` (`MapGrpcService`)
- Test: `.../tests/.../Grpc/HistorianReadServiceTests.cs`
**Steps:**
1. **Write failing test:** with a `FakeHistorianClient` yielding 3 `HistorianSample`s, calling `ReadRaw` streams 3 mapped proto rows; `ReadAggregate` passes the right `RetrievalMode`+interval; an unknown tag → `RpcException(NotFound)`; bad time range → `InvalidArgument`. Use an in-memory `IServerStreamWriter<T>` capture. Run — Expected: FAIL.
2. Implement `HistorianReadService : HistorianRead.HistorianReadBase` consuming `IHistorianConnectionPool.Lease()`; implement `HistorianProtoMapper` (Timestamp conversions, RetrievalMode enum map). Map exceptions per design §7.
3. Run: `dotnet test` — Expected: PASS.
4. Commit: `feat(grpc): HistorianRead service + proto mapper`
### Task 16: `HistorianWrite` service
**Classification:** standard
**Estimated implement time:** ~5 min
**Parallelizable with:** Task 17, Task 18 (no file overlap)
**Files:** Create `.../Server/Grpc/HistorianWriteService.cs`; Modify `Program.cs`; Test `.../Grpc/HistorianWriteServiceTests.cs`
**Steps:** TDD per the Task 15 pattern. `AddHistoricalValues`/`SendEvent` route through `HistorianWriteCoordinator`; `WriteLiveValues` through `SqlLiveValueWriter`. Map `ProtocolEvidenceMissingException` → `Unimplemented`, unreachable+store-forward → `OK` with `Queued` status, redundancy per-member results into the reply. Commit: `feat(grpc): HistorianWrite service`.
### Task 17: `HistorianTags` service
**Classification:** standard
**Estimated implement time:** ~5 min
**Parallelizable with:** Task 16, Task 18
**Files:** Create `.../Server/Grpc/HistorianTagsService.cs`; Modify `Program.cs`; Test `.../Grpc/HistorianTagsServiceTests.cs`
**Steps:** TDD. `BrowseTagNames` (streaming), `GetTagMetadata`, `EnsureTags`/`DeleteTags`/`RenameTags`/`AddTagExtendedProperties` via the seam/pool. Map unsupported tag types (`ProtocolEvidenceMissingException`) → `FailedPrecondition`. Commit: `feat(grpc): HistorianTags service`.
### Task 18: `HistorianStatus` service
**Classification:** small
**Estimated implement time:** ~4 min
**Parallelizable with:** Task 16, Task 17
**Files:** Create `.../Server/Grpc/HistorianStatusService.cs`; Modify `Program.cs`; Test `.../Grpc/HistorianStatusServiceTests.cs`
**Steps:** TDD. `Probe`/`GetConnectionStatus`/`GetStoreForwardStatus`/`GetSystemParameter`. Commit: `feat(grpc): HistorianStatus service`.
### Task 19: Galaxy gRPC wiring (consume the shared lib)
**Classification:** small
**Estimated implement time:** ~3 min
**Parallelizable with:** Task 16–18
**Files:** Modify `.../Server/Program.cs` (`AddZbGalaxyRepository(config, "Galaxy")` + `MapZbGalaxyRepository()`); Modify `appsettings.json`
**Steps:** Register the shared lib's service + refresh hosted service; add `Galaxy:ConnectionString` config. Run: `dotnet run` + grpcurl `DiscoverHierarchy` against a fake/empty config returns `Unavailable` until cache loads (no live DB needed to prove wiring). Commit: `feat(server): wire shared GalaxyRepository gRPC service`.
### Task 20: API-key auth interceptor + scope resolver
**Classification:** high-risk
**Estimated implement time:** ~5 min
**Parallelizable with:** none
**Files:**
- Create: `.../Server/Security/GatewayGrpcScopeResolver.cs` (maps request type → scope)
- Create: `.../Server/Security/GatewayGrpcAuthorizationInterceptor.cs`
- Create: `.../Server/Security/GatewayScopes.cs` (`historian:read|write`, `historian:tags:write`, `galaxy:read`)
- Modify: `.../Server/Program.cs` (`AddZbApiKeyAuth` + `AddGrpc(o => o.Interceptors.Add<...>())`)
- Test: `.../tests/.../Security/GrpcAuthorizationTests.cs`
**Steps:**
1. **Write failing tests:** missing/invalid key → `Unauthenticated`; valid key without the required scope → `PermissionDenied`; valid key with scope → continuation runs. Fake `IApiKeyVerifier`. Run — Expected: FAIL.
2. Implement modeled on mxaccessgw's `GatewayGrpcAuthorizationInterceptor` + `GatewayGrpcScopeResolver` (switch on request type → scope), using shared `IApiKeyVerifier.VerifyAsync`. Respect a `Disabled` auth mode for dev.
3. Run: `dotnet test` — Expected: PASS.
4. Commit: `feat(security): gRPC API-key interceptor + scope enforcement`
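The three outcomes in step 1 reduce to a small decision table that can be tested without gRPC types. In the sketch below the scope strings come from `GatewayScopes` as listed above, but everything else is an assumption: resolving by service and method name (the plan switches on request type), treating `HistorianStatus` and the tag read RPCs as `historian:read`, and the names `AuthDecision` and `ScopeSketch`.

```csharp
using System.Collections.Generic;

public enum AuthDecision { Allowed, Unauthenticated, PermissionDenied }

public static class ScopeSketch
{
    // Hypothetical resolver keyed on service + method. Unknown calls resolve
    // to null and are denied, so a newly added RPC is closed by default.
    public static string? ResolveScope(string service, string method) => (service, method) switch
    {
        ("HistorianRead", _) => "historian:read",
        ("HistorianStatus", _) => "historian:read",
        ("HistorianWrite", _) => "historian:write",
        ("HistorianTags", "BrowseTagNames" or "GetTagMetadata") => "historian:read",
        ("HistorianTags", _) => "historian:tags:write",
        ("GalaxyRepository", _) => "galaxy:read",
        _ => null,
    };

    // grantedScopes is null when the API key is missing or failed verification.
    public static AuthDecision Decide(IReadOnlySet<string>? grantedScopes, string? requiredScope)
    {
        if (grantedScopes is null) return AuthDecision.Unauthenticated;
        if (requiredScope is null || !grantedScopes.Contains(requiredScope)) return AuthDecision.PermissionDenied;
        return AuthDecision.Allowed;
    }
}
```

The interceptor would translate `Unauthenticated` and `PermissionDenied` into the matching gRPC status codes and invoke the continuation only on `Allowed`.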
---
## Phase 5 — Audit
### Task 21: Canonical SQLite audit writer + actor accessor + wiring
**Classification:** standard
**Estimated implement time:** ~5 min
**Parallelizable with:** Task 22 (dashboard auth) after interfaces exist
**Files:**
- Create: `.../Server/Audit/SqliteAuditWriter.cs` (`IAuditWriter`), `.../Server/Audit/HttpAuditActorAccessor.cs` (`IAuditActorAccessor`)
- Modify: write services (Tasks 16,17) + interceptor (Task 20) to emit `AuditEvent`s
- Modify: `.../Server/Program.cs` (`AddZbAudit` + register writer/actor)
- Test: `.../tests/.../Audit/SqliteAuditWriterTests.cs`
**Steps:**
1. **Write failing test:** writing an `AuditEvent` persists a row with the canonical 9 fields (`EventId`/`OccurredAtUtc`/`Actor`/`Action`/`Outcome`/`Category`/`Target`/`SourceNode`/`DetailsJson`), domain fields in `DetailsJson`; writer swallows internal errors. Use an in-memory SQLite. Run — Expected: FAIL.
2. Implement the SQLite writer (table create-if-missing) modeled on MxGateway's audit store; `HttpAuditActorAccessor` reads the Auth principal. Emit audit at tag/value/event writes, API-key admin, login/logout, with `Actor` from the accessor.
3. Run: `dotnet test` — Expected: PASS.
4. Commit: `feat(audit): canonical SQLite audit writer + actor wiring`
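As an illustration of the canonical nine-field shape and the error-swallowing rule from step 1, the following sketch uses only the BCL. The field names come from the plan; the field types, the `"historian.write"` action string, and the names `AuditEventSketch` and `AuditSketch` are assumptions, and the real `AuditEvent` type lives in `ZB.MOM.WW.Audit`.

```csharp
using System;
using System.Text.Json;

// Nine canonical fields as named in the plan; types are assumed.
public sealed record AuditEventSketch(
    Guid EventId, DateTimeOffset OccurredAtUtc, string Actor, string Action,
    string Outcome, string Category, string Target, string SourceNode, string DetailsJson);

public static class AuditSketch
{
    // Domain-specific data (here, the number of values written) goes into
    // DetailsJson rather than into extra columns.
    public static AuditEventSketch TagWrite(string actor, string tag, int valueCount, string sourceNode) =>
        new(Guid.NewGuid(), DateTimeOffset.UtcNow, actor, "historian.write", "Success",
            "Historian", tag, sourceNode, JsonSerializer.Serialize(new { valueCount }));

    // Audit failures must never break the write path: report false instead
    // of letting the sink's exception propagate to the caller.
    public static bool TryWrite(Action<AuditEventSketch> sink, AuditEventSketch evt)
    {
        try { sink(evt); return true; }
        catch (Exception) { return false; }
    }
}
```

In the real writer the `sink` delegate would be the SQLite insert; the test for "writer swallows internal errors" can pass a sink that throws and assert that no exception escapes.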
---
## Phase 6 — Blazor dashboard
### Task 22: Dashboard shell, LDAP cookie auth, login/logout
**Classification:** high-risk
**Estimated implement time:** ~5 min
**Parallelizable with:** none
**Files:**
- Create: `.../Server/Dashboard/Components/{App,Routes,_Imports}.razor`, `Layout/{MainLayout,LoginLayout}.razor`, `Pages/Login.razor`
- Create: `.../Server/Dashboard/DashboardServiceCollectionExtensions.cs`, `.../Dashboard/DashboardEndpointRouteBuilderExtensions.cs`, `.../Dashboard/DashboardAuthenticator.cs`, `.../Dashboard/DashboardGroupRoleMapper.cs`
- Modify: `Program.cs` (`AddGatewayDashboard` + `MapRazorComponents<App>` + auth/antiforgery middleware)
- Test: `.../tests/ZB.MOM.WW.HistorianGateway.Tests/bUnit/LayoutRenderTests.cs`
**Steps:**
1. **Write failing bUnit test** that `MainLayout` renders `<ThemeShell>` with the nav rail and `LoginCard` renders on the login page. Run — Expected: FAIL.
2. Port the dashboard shell from mxaccessgw (`App.razor` with `ThemeHead`/`ThemeScripts`, `MainLayout` with `ThemeShell`+`NavRailSection`/`NavRailItem`, `Login.razor` using `LoginCard` posting to `/auth/login`). Wire `AddZbLdapAuth(config,"Ldap")`, cookie auth via `ZbCookieDefaults.Apply`, `IGroupRoleMapper<CanonicalRole>`, `DisableLogin` switch, `IAuditActorAccessor`.
3. Run: `dotnet test` (bUnit) then `dotnet run` and load `/login` in a browser/curl — Expected: tests PASS; login page renders themed.
4. Commit: `feat(dashboard): Theme shell + LDAP cookie auth + login`
### Task 23: Status + Health pages
**Classification:** small
**Estimated implement time:** ~4 min
**Parallelizable with:** Task 24, Task 25
**Files:** Create `.../Dashboard/Components/Pages/{StatusPage,HealthPage}.razor` (+ a `DashboardStatusService`); Test bUnit render.
**Steps:** TDD bUnit render. Status shows pool state, store-forward queue depth, redundancy members, version (from a status service reading the pool/coordinator). Commit: `feat(dashboard): status + health pages`.
### Task 24: Galaxy browser page
**Classification:** small
**Estimated implement time:** ~5 min
**Parallelizable with:** Task 23, Task 25
**Files:** Create `.../Dashboard/Components/Pages/GalaxyBrowserPage.razor` + tree node view (port mxaccessgw `BrowsePage`/`BrowseTreeNodeView`, read-only, no add-tag); Test bUnit.
**Steps:** TDD bUnit render against the shared lib's cache. Commit: `feat(dashboard): read-only Galaxy browser`.
### Task 25: Historian console page (query + role-gated write test)
**Classification:** standard
**Estimated implement time:** ~5 min
**Parallelizable with:** Task 23, Task 24
**Files:** Create `.../Dashboard/Components/Pages/HistorianConsolePage.razor` (+ `DashboardHistorianService` calling the seam/pool); Test bUnit.
**Steps:** TDD bUnit. Query form (tag, time range, raw/aggregate + mode picker) renders results; write-test panel (historical value insert / event send) visible only to Engineer+ roles via `AuthorizeView`. Commit: `feat(dashboard): historian query + role-gated write console`.
### Task 26: API-key admin page
**Classification:** standard
**Estimated implement time:** ~5 min
**Parallelizable with:** Task 23–25
**Files:** Create `.../Dashboard/Components/Pages/ApiKeysPage.razor` (+ `DashboardApiKeyManagementService` over the shared ApiKeys store); Test bUnit.
**Steps:** TDD bUnit. List/create (show secret once)/revoke keys with scope selection. Commit: `feat(dashboard): API-key admin`.
---
## Phase 7 — Telemetry meters + Health probes
### Task 27: App meters
**Classification:** small
**Estimated implement time:** ~4 min
**Parallelizable with:** Task 28
**Files:** Create `.../Server/Observability/GatewayMetrics.cs`; Modify services/coordinator/pool to record; Modify `Program.cs` (`o.Meters=[GatewayMetrics.MeterName]`); Test `.../Observability/GatewayMetricsTests.cs`.
**Steps:** TDD with `MeterListener`. Counters/histograms: read/write counts + latency, store-forward queue depth (observable gauge), pool connection state, redundancy ack outcomes. Commit: `feat(obs): gateway meters`.
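A minimal sketch of the meter class, using only `System.Diagnostics.Metrics` from the BCL. The meter name and instrument names are assumptions (the plan only fixes that a `MeterName` constant exists), and `GatewayMetricsSketch` covers just one counter, one histogram, and the store-forward gauge rather than the full list above.

```csharp
using System;
using System.Diagnostics.Metrics;

public sealed class GatewayMetricsSketch : IDisposable
{
    // Assumed meter name; the real value is whatever GatewayMetrics.MeterName defines.
    public const string MeterName = "ZB.MOM.WW.HistorianGateway";

    private readonly Meter _meter = new(MeterName);
    private readonly Counter<long> _reads;
    private readonly Histogram<double> _readLatencyMs;

    // The queue-depth callback is sampled only when a listener or exporter
    // collects, so it should be cheap and non-blocking.
    public GatewayMetricsSketch(Func<int> storeForwardQueueDepth)
    {
        _reads = _meter.CreateCounter<long>("historian.reads");
        _readLatencyMs = _meter.CreateHistogram<double>("historian.read.duration", unit: "ms");
        _meter.CreateObservableGauge("historian.storeforward.queue_depth", storeForwardQueueDepth);
    }

    public void RecordRead(double elapsedMs)
    {
        _reads.Add(1);
        _readLatencyMs.Record(elapsedMs);
    }

    public void Dispose() => _meter.Dispose();
}
```

A `MeterListener` test enables instruments whose `Meter.Name` equals `MeterName`, registers a measurement callback per value type, calls `RecordRead`, and then calls `RecordObservableInstruments()` to pull the gauge value.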
### Task 28: Health probes
**Classification:** small
**Estimated implement time:** ~4 min
**Parallelizable with:** Task 27
**Files:** Create `.../Server/Health/{HistorianConnectionHealthCheck,StoreForwardDrainHealthCheck}.cs`; Modify `Program.cs` (`AddHealthChecks` with `GrpcDependencyHealthCheck` for historian, SQL checks for Galaxy + Runtime DB, custom checks, tagged `ZbHealthTags.Ready`); Test health-check unit tests.
**Steps:** TDD. Probes flip Unhealthy when a dependency is down (fake deps). Commit: `feat(health): historian/galaxy/runtime-db/store-forward probes`.
---
## Phase 8 — Integration, docs, repo
### Task 29: Env-gated live integration tests
**Classification:** standard
**Estimated implement time:** ~5 min
**Parallelizable with:** none
**Files:** Create `.../tests/.../Integration/{HistorianRoundTripTests,GalaxyBrowseTests}.cs`
**Steps:** Gated on `HISTORIAN_GRPC_HOST`/`HISTORIAN_GRPC_WRITE_SANDBOX_TAG` and a Galaxy SQL connection env var; `Skip` when absent. Cover read→write→read-back via the self-cleaning sandbox-tag lifecycle and a Galaxy `DiscoverHierarchy`. Run `dotnet test` (skips locally). Commit: `test: env-gated live integration`.
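The gating check itself can be a small helper that returns a skip reason. The two environment variable names come from this task; `LiveTestEnvironment`, `SkipReason`, and the injectable reader (added so the helper can be tested without touching the process environment) are hypothetical.

```csharp
using System;

public static class LiveTestEnvironment
{
    public const string HostVariable = "HISTORIAN_GRPC_HOST";
    public const string SandboxTagVariable = "HISTORIAN_GRPC_WRITE_SANDBOX_TAG";

    // Returns null when the live tests can run, otherwise a human-readable
    // reason naming the first missing variable.
    public static string? SkipReason(Func<string, string?>? readVariable = null)
    {
        if (readVariable is null) readVariable = Environment.GetEnvironmentVariable;

        foreach (var name in new[] { HostVariable, SandboxTagVariable })
        {
            if (string.IsNullOrWhiteSpace(readVariable(name)))
                return $"{name} is not set; live historian tests skipped.";
        }

        return null;
    }
}
```

One way to connect this to xUnit 2, offered as an assumption rather than the plan's chosen mechanism, is a `FactAttribute` subclass whose constructor assigns `Skip = LiveTestEnvironment.SkipReason()`, so the tests report as skipped instead of passing vacuously.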
### Task 30: Full-suite green gate + smoke
**Classification:** small
**Estimated implement time:** ~3 min
**Parallelizable with:** none
**Steps:** Run `dotnet build ZB.MOM.WW.HistorianGateway.slnx` + `dotnet test` (whole solution) on macOS with no live env — Expected: ALL green, live tests skipped. `dotnet run` + curl `/healthz` (200), `/metrics` (text), grpcurl `HistorianStatus/Probe`. Fix any gaps. Commit: `chore: green gate + smoke`.
### Task 31: CLAUDE.md + README + gitea remote + scadaproj index
**Classification:** small
**Estimated implement time:** ~5 min
**Parallelizable with:** none
**Files:** Create `~/Desktop/HistorianGateway/{CLAUDE.md,README.md}`; copy the two design/plan docs into its `docs/plans/`; Modify `~/Desktop/scadaproj/CLAUDE.md` (index the new sidecar + note the GalaxyRepository follow-on for mxaccessgw).
**Steps:**
1. Write `CLAUDE.md` (overview, build/run/test commands, the no-COM single-process note, the vendored-histsdk + shared-GalaxyRepository dependencies, config sections, env vars) and `README.md`.
2. Create the gitea repo `historiangw` and push: `git -C ~/Desktop/HistorianGateway remote add origin https://gitea.dohertylan.com/dohertj2/historiangw.git && git push -u origin main` (confirm remote name/visibility with the user first).
3. Update scadaproj's umbrella `CLAUDE.md` runtime/implementation table with the new project row; commit scadaproj separately.
4. Commit: `docs: CLAUDE.md + README; index in scadaproj`.
---
## Dependency summary (for parallel dispatch)
- **Foundational, no blockers:** Task 1 (galaxy lib scaffold), Task 7 (vendor histsdk), Task 8 (repo init) — Task 8 consumes Task 7's tree.
- **Galaxy lib chain:** 2→3→4→5→6 (sequential; share files).
- **Sidecar chain:** 8→9→10→11→12→13→14, then gRPC services 15→(16,17,18 parallel),19, then 20, then 21.
- **Dashboard:** 22→(23,24,25,26 parallel) after Task 20 (auth) + Task 13/14 (data) + Task 5/19 (galaxy).
- **Obs:** 27,28 parallel after Task 14.
- **Close-out:** 29→30→31 after everything.
## Notes / non-goals (from design §9)
- No `AddS2` live streaming-sample writes (GATED) — live values only via SQL `WriteLiveValues`.
- No two-process/x86 worker (no COM).
- mxaccessgw adopting `ZB.MOM.WW.GalaxyRepository` is a tracked follow-on, NOT in this plan.