code-review: 2026-05-28 baseline re-review of all 23 modules at 1eb6e97

Re-applies the full 10-category checklist to every src/ project — including
first-time reviews of the four newer components (AuditLog, NotificationOutbox,
SiteCallAudit, Transport) — so the code-reviews/ index reflects today's
codebase rather than the 2026-05-16 baseline. 172 new Open findings (0
Critical, 18 High, 62 Medium, 92 Low); 481 findings total across 23 modules.

regen-readme.py now derives each module's Last reviewed + Commit from its
findings.md header instead of hard-coding 2026-05-16 / 9c60592, so future
single-module re-reviews show their own date in the Module Status table.
Joseph Doherty
2026-05-28 02:55:47 -04:00
parent 1eb6e972b0
commit f93b7b99bb
25 changed files with 8793 additions and 115 deletions
+335 -3
@@ -5,10 +5,10 @@
| Module | `src/ScadaLink.ManagementService` |
| Design doc | `docs/requirements/Component-ManagementService.md` |
| Status | Reviewed |
| Last reviewed | 2026-05-17 |
| Last reviewed | 2026-05-28 |
| Reviewer | claude-agent |
| Commit reviewed | `39d737e` |
| Open findings | 0 (1 Deferred — see ManagementService-012) |
| Commit reviewed | `1eb6e97` |
| Open findings | 6 (1 Deferred — see ManagementService-012) |
## Summary
@@ -46,6 +46,32 @@ that can leave an instance partially modified after an error (015, Medium), raw
messages from unexpected faults being returned verbatim to HTTP callers (016, Low), and
`QueryDeploymentsCommand` having no test coverage at all (017, Low).
#### Re-review 2026-05-28 (commit `1eb6e97`)
All seventeen prior findings remain correctly closed; ManagementService-012 is still the
only Deferred entry (marker-interface on `ManagementEnvelope.Command` still belongs in the
Commons module). The module has grown substantially since the last review (`+1997 lines`):
the Transport (#24) bundle commands (`ExportBundle`/`PreviewBundle`/`ImportBundle`) have
been added to `ManagementActor`, and a new `AuditEndpoints.cs` (`/api/audit/query` and
`/api/audit/export`) ships alongside the existing `/management` endpoint. This re-review
re-ran the full 10-category checklist and surfaced **six new findings**. The dominant
theme is the same authorization gap that findings 001/002/003/014 closed for the
ManagementActor, now resurfacing in the new surfaces:
**QueryAuditLogCommand has no role gate at all** (018, High) — any authenticated user can
read the configuration audit log via `/management`, even though the parallel
`/api/audit/query` requires `OperationalAuditRoles`. The new `/api/audit/{query,export}`
endpoints build an `AuthenticatedUser` with `PermittedSiteIds` but never enforce site scope
(019, Medium) — although audit roles are not site-scoped by design, the user-supplied
`sourceSiteId` filter is honoured verbatim. `HandleUpdateSmtpConfig` returns the full
SmtpConfiguration entity (including the `Credentials` field, which can carry SMTP passwords
/ OAuth2 client secrets) in the response and audit row (020, Medium). The Transport (#24)
bundle commands have zero test coverage in `ManagementActorTests` (021, Medium) — neither
role gating nor success/error paths. The `Component-ManagementService.md` design doc is
stale on three fronts: it does not mention Transport bundle commands, the `/api/audit/*`
endpoints, or the now-wired `CommandTimeout` option (022, Low). Finally,
`HandleQueryDeployments` issues one `GetInstanceByIdAsync` per unique instance ID when
filtering for a site-scoped user — an N+1 read pattern on the unfiltered branch (023, Low).
## Checklist coverage
| # | Category | Examined | Notes |
@@ -61,6 +87,21 @@ messages from unexpected faults being returned verbatim to HTTP callers (016, Lo
| 9 | Testing coverage | + | Authorization is well covered; site-scope enforcement, the HTTP endpoint, `DebugStreamHub`, and remote-query handlers have no tests. See 013. |
| 10 | Documentation & comments | + | XML docs are accurate where present; `ManagementServiceOptions` and `ResolveRolesCommand` paths are undocumented dead code (010, 011). |
_Re-review (2026-05-28, `1eb6e97`):_
| # | Category | Examined | Notes |
|---|----------|----------|-------|
| 1 | Correctness & logic bugs | + | `HandleImportBundle` correctly dedupes resolutions per (entity,name); `ParseDocument` still allocates a `JsonDocument.Parse("{}")` on the failure path but the caller's `using` disposes it. No new defects. |
| 2 | Akka.NET conventions | + | PipeTo dispatch from 004 is intact; supervision strategy from 005 is intact; `Sender` correctly captured to local before PipeTo. No new findings. |
| 3 | Concurrency & thread safety | + | Bundle handlers `await` cleanly; `BundleSession` is not cleaned up if `PreviewAsync`/`ApplyAsync` throws, but that is an `IBundleImporter` contract concern outside this module. No new findings. |
| 4 | Error handling & resilience | + | `ManagementCommandException` from 016 is applied consistently across the new bundle handlers (curated `CryptographicException`/`ArgumentException` paths). No new findings. |
| 5 | Security | + | `QueryAuditLogCommand` has no role gate (018, High). New `/api/audit/*` endpoints build `PermittedSiteIds` but never enforce them (019, Medium). `HandleUpdateSmtpConfig` returns + audits `Credentials` verbatim (020, Medium). |
| 6 | Performance & resource management | + | `HandleQueryDeployments` unfiltered-with-scope branch is N+1 on instance lookups (023, Low). Request body up to 200 MB read into a single `string` in `HandleRequest` (acceptable per Transport bundle requirement). |
| 7 | Design-document adherence | + | `Component-ManagementService.md` is stale on Transport bundle commands, `/api/audit/*` endpoints, and the now-wired `CommandTimeout` (022, Low). |
| 8 | Code organization & conventions | + | `AuditEndpoints` duplicates the Basic Auth → LDAP → roles flow from `ManagementEndpoints` (~50 lines). Acknowledged in `AuditEndpoints` XML but worth tracking. No new finding raised. |
| 9 | Testing coverage | + | Transport bundle commands have zero `ManagementActorTests` coverage — neither role gating nor handler logic (021, Medium). |
| 10 | Documentation & comments | + | New `AuditEndpoints` XML doc is high quality. `Component-ManagementService.md` not updated for Transport/Audit endpoints (022 covers). |
## Findings
### ManagementService-001 — Remote-query and debug-snapshot handlers bypass site-scope enforcement
@@ -748,3 +789,294 @@ Resolved 2026-05-17 (commit pending). Added seven `QueryDeployments_*` tests to
Deployment user and an Admin user, in- and out-of-scope
(`_FilteredByOutOfScopeInstance_ReturnsUnauthorized`, `_FilteredByInScopeInstance_ReturnsRecords`,
`_UnfilteredForSiteScopedUser_DropsOutOfScopeRecords`, `_UnfilteredForAdminUser_ReturnsAllRecords`).
### ManagementService-018 — QueryAuditLogCommand has no role gate
| | |
|--|--|
| Severity | High |
| Category | Security |
| Status | Open |
| Location | `src/ScadaLink.ManagementService/ManagementActor.cs:153`–`:207`, `:336`, `:1302` |
**Description**
`QueryAuditLogCommand` is dispatched at line 336 to `HandleQueryAuditLog`, which calls
`ICentralUiRepository.GetAuditLogEntriesAsync(...)` with no role check, no site-scope
check, and no actor filter. `GetRequiredRole` (lines 153–207) does not list
`QueryAuditLogCommand`, so it falls through to the `_ => null` case — i.e. "read-only
queries — any authenticated user". The parallel `/api/audit/query` endpoint in
`AuditEndpoints.HandleQuery` correctly enforces `AuthorizationPolicies.OperationalAuditRoles`
(`{ "Admin", "Audit", "AuditReadOnly" }`), so a CLI authenticated as a user with only the
`Deployment` role — or no roles at all — is rejected at `/api/audit/query` but can read
the *same* audit log table through `/management` by sending `QueryAuditLogCommand`. The
two surfaces enforce different permissions on the same data; the older
ManagementActor-routed path is the looser one. The audit log records every
script-trust-boundary action and is operationally sensitive; it should not be readable
by a default authenticated user.
This is the same authorization-bypass class as findings 001/002/014 and was missed in
that sweep because `QueryAuditLogCommand` (legacy `Action`/`EntityType` filter) is a
separate command from the new keyset-paged `IAuditLogRepository.QueryAsync` path the
`/api/audit/query` endpoint uses.
**Recommendation**
Add `QueryAuditLogCommand` to `GetRequiredRole`. The natural fit is a new
`"OperationalAudit"`-style role group — but `GetRequiredRole` returns a single string and
the project's existing role gates do too (`Admin`/`Design`/`Deployment`). Two equally
defensible options:
1. Add `QueryAuditLogCommand` to the `Admin`-required group. This is the strict choice and
is consistent with `AuditExportRoles`, which includes `Admin`. The CLI's CLI-017/018 audit work uses
`/api/audit/query`, so `QueryAuditLogCommand` may be effectively orphaned anyway.
2. Extend `GetRequiredRole` to return a role *set* and add an `AuditRoles` group equal to
`AuthorizationPolicies.OperationalAuditRoles`, so the two surfaces converge.
Recommended: option 1 plus a deprecation comment on `QueryAuditLogCommand` pointing at
`/api/audit/query` — the legacy command's filter shape is a subset of the new endpoint's,
so the ManagementActor route is redundant. Add a regression test asserting that a
no-role / `Deployment`-only caller gets `ManagementUnauthorized` for `QueryAuditLogCommand`.
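
Option 2 could look roughly like the sketch below. It is illustrative only: the command records and the `RoleGateSketch` class are hypothetical stand-ins rather than the real `ManagementActor` types, and the role list is copied from the `OperationalAuditRoles` values quoted above. A `null` result keeps the existing "any authenticated user" fall-through.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins for the real command types; only the names mirror the review.
public sealed record QueryAuditLogCommand;
public sealed record QueryDeploymentsCommand;

public static class RoleGateSketch
{
    // Assumed to equal AuthorizationPolicies.OperationalAuditRoles as quoted in the finding.
    private static readonly string[] OperationalAuditRoles = { "Admin", "Audit", "AuditReadOnly" };

    // Returns the set of roles allowed to run a command.
    // null means "any authenticated user", like the existing `_ => null` case.
    public static IReadOnlyCollection<string>? GetRequiredRoles(object command) => command switch
    {
        QueryAuditLogCommand => OperationalAuditRoles,
        _ => null,
    };

    // A caller is authorized when the command is ungated or the caller holds
    // at least one of the required roles.
    public static bool IsAuthorized(object command, IEnumerable<string> callerRoles)
    {
        var required = GetRequiredRoles(command);
        return required is null
            || callerRoles.Any(role => required.Contains(role, StringComparer.Ordinal));
    }
}
```

Under these assumptions a `Deployment`-only caller is rejected for `QueryAuditLogCommand`, while an `AuditReadOnly` caller is accepted, which is the behaviour the proposed regression test would pin down.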
### ManagementService-019 — AuditEndpoints builds PermittedSiteIds but never enforces them
| | |
|--|--|
| Severity | Medium |
| Category | Security |
| Status | Open |
| Location | `src/ScadaLink.ManagementService/AuditEndpoints.cs:358`–`:368`, `:397`–`:437` |
**Description**
`AuditEndpoints.AuthenticateAsync` resolves the caller's roles AND `PermittedSiteIds` and
wraps them in an `AuthenticatedUser` (lines 358–366), but the returned `AuthenticatedUser`
is then only used for the `HasAnyRole(...)` role check on lines 114 and 163 — its
`PermittedSiteIds` are never read. `ParseFilter` (line 397) accepts the caller-supplied
`sourceSiteId=...` query string verbatim and passes it straight into the
`IAuditLogRepository.QueryAsync` filter. A user whose `Audit` (or `AuditReadOnly`) role
mapping carries scope rules — e.g. `AuditReadOnly` scoped to "plant-a" — can still ask
for `sourceSiteId=plant-b` and get back rows for plant-b.
Today this gap is partially benign because the design treats `Audit`/`AuditReadOnly` as
non-site-scoped roles (`Component-AuditLog.md` does not list site scoping for the audit
permissions, and the LDAP role mapping UI does not currently surface site scope rules
for those roles). But (a) the `RoleMapper` will silently honour scope rules attached to
any role, including `Audit`, so an operator who *does* configure them gets a UI that
says "scoped" and an endpoint that ignores the scope — a contract violation; (b) the
`Admin` role's `PermittedSiteIds` are always empty (system-wide), so enforcing for the
other roles is cheap. The asymmetry with the `/management` endpoint — which routes every
site-targeted command through `EnforceSiteScope` — is also a maintenance hazard.
**Recommendation**
Decide explicitly whether the audit endpoints honour site scope. Two options:
1. **Honour scope** — in `HandleQuery` / `HandleExport`, after the role check, intersect
the caller-supplied `filter.SourceSiteIds` with `user.PermittedSiteIds`. If the
caller supplied no `sourceSiteId` and `PermittedSiteIds` is non-empty, restrict to
`PermittedSiteIds`. If the intersection is empty, return an empty page (or a 403 if
the caller explicitly asked for an out-of-scope site).
2. **Document the intentional bypass** — drop the `PermittedSiteIds` field from the
`AuthenticatedUser` constructed in `AuthenticateAsync` (or comment it as "ignored —
audit roles are not site-scoped") so the code stops carrying a value it does not
read, and add an XML doc note on the endpoint class that audit roles are always
system-wide by design.
Recommended: option 1, mirroring the `ManagementActor` pattern — same security posture
across both surfaces. Add a regression test that a site-scoped `AuditReadOnly` user
filtering on an out-of-scope site gets a 403 (or an empty page).
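
A minimal sketch of the option 1 intersection rules follows. Everything here is an assumption for illustration: the `AuditSiteScopeSketch` class, the `ScopeOutcome` enum, and the use of string site identifiers are not taken from the codebase, and an empty permitted list is treated as system-wide because that is how the finding describes `Admin`.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public enum ScopeOutcome { Unrestricted, Restricted, Forbidden }

public static class AuditSiteScopeSketch
{
    // requested: site IDs from the caller's sourceSiteId filter (may be empty).
    // permitted: the caller's PermittedSiteIds (empty is assumed to mean system-wide).
    public static (ScopeOutcome Outcome, IReadOnlyList<string> SiteIds) Resolve(
        IReadOnlyCollection<string> requested,
        IReadOnlyCollection<string> permitted)
    {
        if (permitted.Count == 0)
        {
            // System-wide caller: honour the supplied filter unchanged.
            if (requested.Count == 0)
                return (ScopeOutcome.Unrestricted, Array.Empty<string>());
            return (ScopeOutcome.Restricted, requested.ToList());
        }

        // Scoped caller with no filter: restrict to the permitted sites.
        if (requested.Count == 0)
            return (ScopeOutcome.Restricted, permitted.ToList());

        // Scoped caller with a filter: keep only the sites that are in scope.
        var allowed = requested.Intersect(permitted, StringComparer.Ordinal).ToList();
        if (allowed.Count == 0)
            return (ScopeOutcome.Forbidden, allowed); // caller asked only for out-of-scope sites
        return (ScopeOutcome.Restricted, allowed);
    }
}
```

Whether `Forbidden` maps to a 403 or to an empty page is the open decision named in the recommendation; the sketch only separates the cases so that either response can be chosen in one place.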
### ManagementService-020 — UpdateSmtpConfig returns and audits the SMTP Credentials field verbatim
| | |
|--|--|
| Severity | Medium |
| Category | Security |
| Status | Open |
| Location | `src/ScadaLink.ManagementService/ManagementActor.cs:1136`–`:1153` |
**Description**
`HandleUpdateSmtpConfig` reads the existing `SmtpConfiguration` entity, applies the
incoming command, and then **(a)** passes the full `config` object as the `afterState`
to `AuditAsync` (line 1151) — meaning the SMTP credential string is persisted in the
audit log — and **(b)** returns the full `config` to the caller (line 1152), which is
serialized via `SerializeResult` and sent back over HTTP. `SmtpConfiguration.Credentials`
carries the SMTP-Auth password (for `Basic`) or the OAuth2 client secret (for
`OAuth2ClientCredentials`); `SmtpConfiguration` has no `[JsonIgnore]` on this field
and `SerializeResult`'s `JsonSerializerOptions` does not exclude it. The pattern
parallels what ConfigurationDatabase-012 fixed for inbound API keys: a credential
artifact must not be echoed back through every read/audit path.
The credential is supplied by the operator in `UpdateSmtpConfigCommand.Credentials`,
so the caller already has it. But (1) anyone with read access to the audit log
(`OperationalAuditRoles`) can now retrieve every SMTP credential change verbatim — a
strictly larger blast radius than `Admin`-only `UpdateSmtpConfig`. (2) The serialized
`config` echo means the credential moves over the wire in the response even though the
caller has no need for it. (3) Any future read path that returns
`SmtpConfiguration` (`ListSmtpConfigsCommand` already does at line 1130) will leak
the stored credential too.
**Recommendation**
Three changes, in order of priority:
1. In `HandleUpdateSmtpConfig` and `HandleListSmtpConfigs`, project to a credential-free
shape before returning — e.g. `new { config.Id, config.Host, config.Port,
config.AuthType, config.FromAddress, config.TlsMode }`. Match the
`HandleListApiKeys` pattern.
2. In `AuditAsync` for the SMTP path, pass a credential-free `afterState` (the same
anonymous shape). The fact that *something* changed is auditable; the secret value
is not.
3. Tag `SmtpConfiguration.Credentials` with `[JsonIgnore]` in Commons (out-of-scope edit
for this module, but worth a follow-up). Alternatively, configure
`ResultSerializerOptions` with a property name policy that skips a known set of
credential field names — but a per-entity projection is cleaner.
Add regression tests: `UpdateSmtpConfig_DoesNotEchoCredentialsInResponse` and
`UpdateSmtpConfig_DoesNotPersistCredentialsInAuditLog`.
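
The projection in changes 1 and 2 might be sketched as below. The `SmtpConfiguration` class here is a hypothetical stand-in whose property names follow the projection quoted above (the property types are guesses), and `SmtpConfigView` and `SmtpProjectionSketch` are illustrative names, not existing types.

```csharp
using System.Text.Json;

// Hypothetical stand-in for the Commons entity; property types are assumptions.
public sealed class SmtpConfiguration
{
    public int Id { get; set; }
    public string Host { get; set; } = "";
    public int Port { get; set; }
    public string AuthType { get; set; } = "";
    public string FromAddress { get; set; } = "";
    public string TlsMode { get; set; } = "";
    public string Credentials { get; set; } = ""; // secret: must not leave the handler
}

// Credential-free shape for both the HTTP response and the audit afterState.
public sealed record SmtpConfigView(
    int Id, string Host, int Port, string AuthType, string FromAddress, string TlsMode);

public static class SmtpProjectionSketch
{
    // Copies every field except Credentials.
    public static SmtpConfigView ToView(SmtpConfiguration config) =>
        new(config.Id, config.Host, config.Port, config.AuthType, config.FromAddress, config.TlsMode);

    // Serializes the projection, so the secret cannot appear in the output.
    public static string Serialize(SmtpConfiguration config) =>
        JsonSerializer.Serialize(ToView(config));
}
```

Using one named view for both the response and the audit row keeps the two paths from drifting apart; an anonymous type, as in the recommendation, would work equally well inside a single handler.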
### ManagementService-021 — Transport bundle handlers have zero test coverage
| | |
|--|--|
| Severity | Medium |
| Category | Testing coverage |
| Status | Open |
| Location | `tests/ScadaLink.ManagementService.Tests/ManagementActorTests.cs:1`; `src/ScadaLink.ManagementService/ManagementActor.cs:1717`–`:1897` |
**Description**
The three Transport (#24) bundle handlers — `HandleExportBundle`, `HandlePreviewBundle`,
`HandleImportBundle` (~180 lines of handler logic at the bottom of `ManagementActor.cs`)
— have **no tests** in `ManagementActorTests`. Specifically untested:
1. **Role gating.** `ExportBundleCommand` requires `Design`; `PreviewBundleCommand` and
`ImportBundleCommand` require `Admin`. No test asserts that the wrong role gets
`ManagementUnauthorized`. CLI-017 / CLI-018 just landed around bundle plumbing — a
future refactor that moves these commands between role groups in `GetRequiredRole`
would silently regress the gate.
2. **Name resolution in `HandleExportBundle`.** The inner `ResolveIds<T>` helper raises
`ManagementCommandException` for unknown names. The "all entity types" branch
(`cmd.All == true`) and the "missing name" branch are both untested.
3. **`HandleImportBundle` blocker rejection.** The handler aborts before `ApplyAsync`
when any `ConflictKind.Blocker` row is present; the produced error message is
curated and surfaced to the caller, but no test asserts the abort path or that the
importer's `ApplyAsync` was not called.
4. **Resolution dedupe.** `HandleImportBundle` dedupes `(EntityType, Name)` keys
last-write-wins — the dedupe is critical (CLI-014 was about it on the CLI side) but
has no actor-side regression test.
5. **`DecodeBundle` failure modes** (empty/non-base64 input) — both branches return
curated `ManagementCommandException` but neither is exercised.
6. **`ParseConflictPolicy`** for `"skip"`, `"overwrite"`, `"rename"`, and the
invalid-value branch: all untested.
Given the size and reach of the bundle path (cross-cutting central configuration
import), this gap is materially larger than usual for new handler code.
**Recommendation**
Add an `ImportBundleHandlerTests` suite covering:
- role gating for all three commands (`Design`/`Admin` mismatch -> `ManagementUnauthorized`),
- `ExportBundleCommand(All: true)` happy-path,
- `ExportBundleCommand` with an unknown name -> `ManagementError`,
- `ImportBundleCommand` with a `Blocker` row -> `ManagementError` and `ApplyAsync` not called,
- `ImportBundleCommand` with duplicate preview items -> dedupe to one resolution per (type, name),
- `DecodeBundle` empty/invalid base64,
- `ParseConflictPolicy` all four branches.
Use NSubstitute for `IBundleImporter` / `IBundleExporter` (no need for a real bundle in
the actor tests; the bundle round-trip belongs in `Transport` tests).
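
For the dedupe case, the behaviour a regression test would pin down is last-write-wins per `(EntityType, Name)` key. The sketch below shows that rule in isolation; the `Resolution` record and `ResolutionDedupeSketch` class are hypothetical, and case-sensitive key comparison is an assumption, not something the finding states.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical resolution shape; the real type lives in the Transport module.
public sealed record Resolution(string EntityType, string Name, string Action);

public static class ResolutionDedupeSketch
{
    // Keeps one resolution per (EntityType, Name); a later entry replaces an earlier one.
    public static IReadOnlyList<Resolution> Dedupe(IEnumerable<Resolution> items)
    {
        var byKey = new Dictionary<(string EntityType, string Name), Resolution>();
        foreach (var item in items)
        {
            byKey[(item.EntityType, item.Name)] = item; // overwrite implements last-write-wins
        }
        return byKey.Values.ToList();
    }
}
```

An actor-side test would feed `ImportBundleCommand` two entries with the same key and different actions, then assert that the substituted importer received exactly one resolution carrying the later action.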
### ManagementService-022 — Design doc is stale on Transport bundle commands, /api/audit/* endpoints, and CommandTimeout
| | |
|--|--|
| Severity | Low |
| Category | Design-document adherence |
| Status | Open |
| Location | `docs/requirements/Component-ManagementService.md:77`–`:175`, `:205`–`:209` |
**Description**
`Component-ManagementService.md` does not mention three pieces of shipped functionality:
1. **Transport (#24) bundle commands.** `ExportBundleCommand`, `PreviewBundleCommand`,
and `ImportBundleCommand` are dispatched at `ManagementActor.cs:350`–`:352` and
role-gated in `GetRequiredRole` (Design for Export; Admin for Preview/Import). The
design doc's "Message Groups" section enumerates Templates, Instances, Sites, Data
Connections, Deployments, External Systems, Notifications, Security, Audit Log,
Shared Scripts, Database Connections, Inbound API Methods, Health, and Remote
Queries — but has no "Transport" / "Bundles" group. The CLI now offers `bundle
export`/`preview`/`import` (per the recent CLI-017/018 work) and points
at these commands.
2. **`/api/audit/*` endpoints.** The doc's "HTTP Management API" section (line 52)
describes only `POST /management`. `AuditEndpoints.MapAuditAPI()` adds
`GET /api/audit/query` and `GET /api/audit/export` with their own auth-and-role
path mirroring `ManagementEndpoints` (intentionally — see the `AuditEndpoints` XML
docs), but the design doc gives no signal that the module exposes more than one
route group, no per-endpoint role mapping table, and no mention that the response
shape differs (keyset cursor vs. opaque page).
3. **`CommandTimeout`.** Line 209 still says "Reserved for future configuration —
e.g., command timeout overrides", but ManagementService-010 wired the option through
`ResolveAskTimeout`. The doc is stale.
**Recommendation**
Update `Component-ManagementService.md`:
- Add a "Transport" entry to "Message Groups" listing `ExportBundle`,
`PreviewBundle`, `ImportBundle` with their per-command roles. Cross-reference
`Component-Transport.md`.
- Add an "Audit Log HTTP API" subsection under "HTTP Management API" describing
`GET /api/audit/query` (keyset cursor, `OperationalAuditRoles`) and
`GET /api/audit/export` (csv/jsonl streaming, `AuditExportRoles`, parquet 501).
Note the deliberate divergence in the source-site query-string key
(`sourceSiteId` vs CentralUI's `site`).
- In the "Configuration" table, replace "Reserved for future configuration" with the
actual `CommandTimeout` semantics: "Max time the HTTP endpoint will Ask the
ManagementActor before returning HTTP 504; falls back to 30 s when unset or
non-positive."
### ManagementService-023 — HandleQueryDeployments unfiltered branch is N+1 on instance lookup
| | |
|--|--|
| Severity | Low |
| Category | Performance & resource management |
| Status | Open |
| Location | `src/ScadaLink.ManagementService/ManagementActor.cs:1276`–`:1295` |
**Description**
The site-scoped unfiltered branch of `HandleQueryDeployments` (added under
ManagementService-014) reads every `DeploymentRecord` via `GetAllDeploymentRecordsAsync`,
then for each *unique* `record.InstanceId` calls
`ITemplateEngineRepository.GetInstanceByIdAsync` to resolve the instance's
`SiteId`. The handler caches results in `instanceSiteCache` so each instance is loaded
at most once per call, but for a fleet with N distinct instances having deployment
history, the handler still issues N round-trips to the configuration database to
authorize a single query. With a large deployment history the cumulative DB hit can be
material; it also runs every time a site-scoped user opens the deployments page.
This is acceptable in steady state today (sites tend to have small fleets and few
deployments) but is a textbook N+1 read pattern, and on a busy day for a site-scoped
operator the cost will dominate the request. Admin and system-wide Deployment users
correctly skip the loop (they hit only `GetAllDeploymentRecordsAsync`).
**Recommendation**
Add a batch-resolve method to `ITemplateEngineRepository`, e.g.
`Task<IDictionary<int, int>> GetInstanceSiteIdsAsync(IEnumerable<int> instanceIds)`,
backed by a single EF query
(`Instances.Where(i => instanceIds.Contains(i.Id)).Select(i => new { i.Id, i.SiteId })`).
`HandleQueryDeployments` would then issue exactly two queries on the unfiltered branch
(records + sites) regardless of fleet size. The change is additive to
`ITemplateEngineRepository` and out-of-module for the actual implementation, but the
handler change is local; a quick interim alternative is to project deployment records
to include the instance's `SiteId` at the repo level, which removes the second query
entirely.
Defer until a noticeable hot path emerges, but track it: this is the only N+1 in
`ManagementActor` once 002 / 014 are folded in.
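
The batch-resolve shape can be sketched with in-memory LINQ standing in for the EF query. The `Instance` and `DeploymentRecord` records and the `DeploymentScopeSketch` class are hypothetical, the method is synchronous for brevity, and dropping records whose instance cannot be resolved is an assumed fail-closed choice rather than documented behaviour.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical minimal shapes; the real entities carry many more fields.
public sealed record Instance(int Id, int SiteId);
public sealed record DeploymentRecord(int Id, int InstanceId);

public static class DeploymentScopeSketch
{
    // Stand-in for the proposed GetInstanceSiteIdsAsync: one pass resolves all IDs at once.
    public static IDictionary<int, int> GetInstanceSiteIds(
        IEnumerable<Instance> instances, IEnumerable<int> instanceIds)
    {
        var wanted = instanceIds.ToHashSet();
        return instances
            .Where(i => wanted.Contains(i.Id))
            .ToDictionary(i => i.Id, i => i.SiteId);
    }

    // Filters records to the caller's sites using a single batch lookup
    // instead of one lookup per distinct instance.
    public static List<DeploymentRecord> FilterToPermittedSites(
        IReadOnlyCollection<DeploymentRecord> records,
        IEnumerable<Instance> instances,
        ISet<int> permittedSiteIds)
    {
        var siteByInstance = GetInstanceSiteIds(
            instances, records.Select(r => r.InstanceId).Distinct());

        return records
            .Where(r => siteByInstance.TryGetValue(r.InstanceId, out var siteId)
                        && permittedSiteIds.Contains(siteId))
            .ToList();
    }
}
```

With a real repository the first method becomes the single `Where(...Contains...)` query quoted above, so the handler issues two queries on the unfiltered branch however many instances appear in the deployment history.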