d39089f4ed
Full per-module re-review of the 16 stale modules (last seen1eb6e97/ 2026-05-28) plus first-ever reviews of KpiHistory (#26) and ScriptAnalysis (#25), at HEAD4307c381. 67 new findings (0 Critical, 6 High, 27 Medium, 34 Low). Remediation in commitfd618cf1closed 5 of the 6 Highs and ~33 Medium/Low; the rest are Deferred/Won't Fix with rationale. Remaining pending (4) are all InboundAPI's Database-helper findings (IA-026 High .. IA-029), left to the active feat/ipsen-movein effort per owner decision. Highlights: caught a central-only-delivery security drift (SMTP creds broadcast to sites — DM-025/SR-031), a never-committed 'Resolved' fix (SiteEventLogging-016 → -024), an unguarded KPI recorder tick (KH-001), a trust-analyzer fallback weakening (SA-001), and a native-alarm subscribe-path leak (DCL-023). ScriptAnalysis verdict: trust boundary is semantically sound (symbol-based) in the production cluster config. README regenerated; regen-readme.py --check passes (4 pending / 567 total).
1230 lines
74 KiB
Markdown
1230 lines
74 KiB
Markdown
# Code Review — Security
|
|
|
|
| Field | Value |
|
|
|-------|-------|
|
|
| Module | `src/ZB.MOM.WW.ScadaBridge.Security` |
|
|
| Design doc | `docs/requirements/Component-Security.md` |
|
|
| Status | Reviewed |
|
|
| Last reviewed | 2026-06-20 |
|
|
| Reviewer | claude-agent |
|
|
| Commit reviewed | `4307c381` |
|
|
| Open findings | 0 (2 deferred — Security-008, Security-022) |
|
|
|
|
## Summary
|
|
|
|
The Security module is small and reasonably structured: a stateless `LdapAuthService`
|
|
for search-then-bind authentication, a `JwtTokenService` for HMAC-signed cookie tokens,
|
|
a `RoleMapper` that resolves LDAP groups to roles, and ASP.NET Core authorization
|
|
policies plus a site-scope handler. Unit-test coverage of the happy paths is decent.
|
|
However, the review surfaced several real security weaknesses, the most serious being
|
|
that **StartTLS is dead code** (the design's "LDAPS or StartTLS" requirement is only
|
|
half met), that **the authentication cookie is not marked `Secure`** despite the design
|
|
mandating it, and that **the JWT signing key is never length-validated** so a weak or
|
|
empty key is silently accepted. There is also a genuine **DN-injection** gap in the
|
|
no-service-account fallback path, a filter/DN attribute mismatch (`uid=` vs `cn=`) that
|
|
makes that fallback path internally inconsistent, and an N+1 query in `RoleMapper`.
|
|
JWT validation also disables issuer/audience checks and the idle-timeout claim is reset
|
|
on every refresh, weakening the documented 30-minute idle policy. None of these are
|
|
crash/data-loss bugs, but the TLS, cookie, and key-validation items are security
|
|
defects that should be fixed before any production deployment.
|
|
|
|
#### Re-review 2026-05-17 (commit `39d737e`)
|
|
|
|
Re-reviewed the module after the batch-resolution of Security-001..007, 009, 010, 011.
|
|
Those fixes were verified in place and remain Resolved: the `LdapTransport` enum makes
|
|
StartTLS reachable, the auth cookie is `SecurePolicy.Always` with a sliding window, the
|
|
JWT signing key is length-validated at construction, issuer/audience are bound and
|
|
checked, the DN-injection fallback is RFC 4514-escaped, and the idle-timeout anchor is
|
|
carried across refreshes. Security-008 (N+1 scope-rule query) remains correctly
|
|
**Deferred** — `ISecurityRepository` still exposes no per-set scope-rule query, so the
|
|
N+1 cannot be removed from within `RoleMapper.cs` alone; the deferral condition is
|
|
unchanged. This pass surfaced **4 new findings** (Security-012..015): two Medium and
|
|
two Low. The most notable is that a *partial* LDAP outage during login (bind succeeds,
|
|
group search fails) silently returns an authenticated session with **zero roles**
|
|
instead of failing the login as the design's LDAP-failure rule requires
|
|
(Security-012); and that `RefreshToken` re-issues a token without ever checking
|
|
`IsIdleTimedOut`, so an idle-expired session can be renewed indefinitely if the caller
|
|
omits the separate idle check (Security-014). The two Low findings concern fragile
|
|
DN parsing of group names containing escaped commas and an un-trimmed username flowing
|
|
into the LDAP filter, fallback DN, and JWT claims.
|
|
|
|
#### Re-review 2026-05-28 (commit `1eb6e97`)
|
|
|
|
Re-reviewed the module on a fresh baseline. All Security-001..007, 009..015 fixes remain
|
|
in place; the only Open carry-over is Security-008 (still correctly **Deferred** —
|
|
`ISecurityRepository` still exposes no per-set scope-rule query, so the N+1 in
|
|
`RoleMapper` cannot be removed from within this module). The original
|
|
Security-014 fix is now load-bearing: `RefreshToken` calls `IsIdleTimedOut` before
|
|
re-issuing, and the new cookie sliding-expiry tests in `SecurityReviewRegressionTests`
|
|
pin CentralUI-005's Security-side contract. This pass surfaced **6 new findings**
|
|
(Security-016..021): one Medium correctness/security defect, one Medium design-adherence
|
|
defect, and four Low. The most consequential is **Security-016** — when a user is
|
|
mapped to *both* a system-wide Deployment LDAP group (e.g. `SCADA-Deploy-All`) and a
|
|
site-scoped Deployment LDAP group (e.g. `SCADA-Deploy-SiteA`), `RoleMapper` silently
|
|
treats the union as site-scoped (the system-wide grant is dropped); the design's
|
|
"multiple groups grant multiple independent roles" intent is not honoured for this
|
|
mix-and-match case. **Security-017** is the cross-module partner of CentralUI-028:
|
|
`SiteScopeRequirement` / `SiteScopeAuthorizationHandler` are declared and registered
|
|
but no production caller ever instantiates them — `[Authorize(Policy = RequireDeployment)]`
|
|
*does not* enforce the documented site scoping, callers must remember to inject
|
|
`SiteScopeService` and re-check `IsSiteAllowedAsync` themselves (which the two new
|
|
report pages flagged by CentralUI-028 forgot to do). The remaining Lows are: role names
|
|
are magic strings duplicated across `RoleMapper`, `SiteScopeAuthorizationHandler`, and
|
|
`AuthorizationPolicies` (Security-018); a service-account-rebind failure is reported
|
|
to the user as "Invalid username or password" — masking a misconfiguration as a
|
|
user-credential error (Security-019); required `SecurityOptions` fields
|
|
(`LdapServer`, `LdapSearchBase`) have no `IValidateOptions` startup check, so empty
|
|
values silently surface only on first login (Security-020); and the
|
|
`RequireHttpsCookie=false` dev opt-out emits no warning, so an HTTP production
|
|
deployment silently transmits the JWT bearer credential in cleartext (Security-021).
|
|
|
|
#### Re-review 2026-06-20 (commit `4307c381`) — full review
|
|
|
|
Major architectural change since `1eb6e97`: the project was renamed (ScadaLink →
|
|
ZB.MOM.WW.ScadaBridge) and the bespoke `LdapAuthService`/`LdapTransport`/`LdapAuthResult`
|
|
were **cut over to the external `ZB.MOM.WW.Auth.Ldap` library** (commit `ac34dac4`); the
|
|
prior `SiteScopeAuthorizationHandler` was deleted (Security-017 resolution), and the role
|
|
model was canonicalised + collapsed (`Admin→Administrator`, `Design→Designer`,
|
|
`Deployment→Deployer`, `Audit→Administrator`, `AuditReadOnly→Viewer`; commit `b104760b`)
|
|
with M7 `Operator`/`Verifier` added (`a0ce8b6c`). The interactive cookie session now signs
|
|
in with **bare claims** via the shared `SessionClaimBuilder` and is policed by
|
|
`CookieSessionValidator` (idle-timeout + LDAP-free DB role-refresh, fail-closed, well
|
|
tested). All prior fixes that survived the cutover remain in place (key-length guard,
|
|
issuer/audience binding, `RoleMapper` union semantics, `Roles` constants, cookie hardening
|
|
via `ZbCookieDefaults`, the `RequireHttpsCookie` warning). Security-008 (N+1 scope-rule
|
|
query) is still correctly **Deferred**. This pass surfaced **4 new findings**
|
|
(Security-022..025): three Medium and one Low. The most consequential is **Security-022** —
|
|
the documented "cookie + JWT hybrid" no longer holds at HEAD: `/auth/token` still *mints* a
|
|
bearer JWT but **no code anywhere validates it** (CLI uses HTTP Basic Auth, Inbound API uses
|
|
the `sbk_` ApiKeys verifier, the cookie path uses bare claims), so `JwtTokenService`'s entire
|
|
`ValidateToken`/`RefreshToken`/`RecordActivity`/`ShouldRefresh`/`IsIdleTimedOut` surface is
|
|
dead code minting an orphaned credential. **Security-023** — `LdapGroupMapping.Role` accepts
|
|
arbitrary free strings via CLI/API with no canonical-set validation, and the Central UI
|
|
`RequireClaim` policies are case-sensitive while the ManagementActor authz check is
|
|
case-insensitive, so a mis-cased/typo'd mapping silently grants inconsistent access across
|
|
the two surfaces with no operator feedback. **Security-024** — the M7 `RequireOperator`/
|
|
`RequireVerifier` SoD policies have no functional `AuthorizeAsync` grant/deny test (only
|
|
constant-name assertions), unlike every other policy. **Security-025** (Low) — a unit test
|
|
makes a live network connect to `nonexistent.invalid:9999`.
|
|
|
|
_Re-review (2026-06-20, `4307c381`):_
|
|
|
|
| # | Category | Examined | Notes |
|
|
|---|----------|----------|-------|
|
|
| 1 | Correctness & logic bugs | ☑ | `RoleMapper` union semantics (Security-016 fix) hold and read correctly. `LdapGroupMapping.Role` is an unvalidated free string + case-sensitive UI policy vs case-insensitive ManagementActor authz (Security-023). |
|
|
| 2 | Akka.NET conventions | ☑ | No actors. `AddSecurityActors` is still a placeholder. No issues. |
|
|
| 3 | Concurrency & thread safety | ☑ | Services stateless/DI-scoped; `CookieSessionValidator` reads injected `TimeProvider`. LDAP now in the external library. No shared mutable state. No issues. |
|
|
| 4 | Error handling & resilience | ☑ | `CookieSessionValidator` fails closed on idle (missing anchor = timed out) and swallows refresh faults keeping current roles (mirrors "LDAP failure: active sessions continue") — correct and well tested. `LdapAuthFailureMessages` maps the library's structured failure enum to stable user text. No issues. |
|
|
| 5 | Security | ☑ | Cookie hardening (HttpOnly/SameSite=Strict/Secure-when-required/sliding) comes from `ZbCookieDefaults` — verified correct. JWT mint binds issuer/audience, pins `MapInbound/OutboundClaims=false`, fails fast on short key. BUT the minted bearer JWT is validated by no consumer (Security-022). DisableLogin auto-login is gated at registration + loud warning + doc'd (acceptable). |
|
|
| 6 | Performance & resource management | ☑ | Security-008 N+1 remains correctly Deferred (gated on `ISecurityRepository`). `CookieSessionValidator` deliberately avoids re-minting on the no-refresh path. No new perf issues. |
|
|
| 7 | Design-document adherence | ☑ | The cookie+JWT-hybrid design is now cookie-with-bare-claims; the JWT half is orphaned/unvalidated (Security-022) — the M2.19 doc note acknowledges bare-claim cookies but the doc still presents JWT-in-cookie as the model and does not flag that `/auth/token`'s JWT is unconsumed. Operator/Verifier policies match the design. |
|
|
| 8 | Code organization & conventions | ☑ | `Roles`/`SecurityOptions` owned by the component; repository interface in Commons; additive `IGroupRoleMapper<string>` adapter. Dead JWT methods (Security-022) are the main org smell. |
|
|
| 9 | Testing coverage | ☑ | `CookieSessionValidator`, `SessionClaimBuilder`, `RoleMapper` union, cookie wiring, `SecurityOptionsValidator`, DisableLogin are well covered. Gaps: no functional authz test for `RequireOperator`/`RequireVerifier` (Security-024); a unit test does live network I/O (Security-025); no test for a non-canonical/mis-cased mapping role (Security-023). |
|
|
| 10 | Documentation & comments | ☑ | XML docs are thorough and current. `SecurityOptions` and `JwtTokenService` docs still describe an active JWT-refresh lifecycle that no longer has a caller (Security-022). |
|
|
|
|
## Checklist coverage
|
|
|
|
| # | Category | Examined | Notes |
|
|
|---|----------|----------|-------|
|
|
| 1 | Correctness & logic bugs | ☑ | `uid=`/`cn=` attribute mismatch between search filter and fallback DN construction (Security-004); StartTLS branch is unreachable (Security-001). |
|
|
| 2 | Akka.NET conventions | ☑ | No actors in this module — `AddSecurityActors` is an empty placeholder. Nothing to assess. |
|
|
| 3 | Concurrency & thread safety | ☑ | Services are stateless and DI-scoped; LDAP sync calls wrapped in `Task.Run`. No shared mutable state. No issues found. |
|
|
| 4 | Error handling & resilience | ☑ | LDAP failure paths return structured `LdapAuthResult`; group-lookup failure is tolerated per design. `ct` not honored inside `Task.Run` bodies (Security-009). |
|
|
| 5 | Security | ☑ | StartTLS dead code (Security-001), cookie not `Secure` (Security-002), JWT key unvalidated (Security-003), DN injection (Security-005), no issuer/audience validation (Security-006), idle-timeout reset on refresh (Security-007). |
|
|
| 6 | Performance & resource management | ☑ | N+1 scope-rule query in `RoleMapper` (Security-008). `LdapConnection` correctly disposed via `using`. |
|
|
| 7 | Design-document adherence | ☑ | StartTLS unsupported and Secure cookie missing both contradict the design doc; design also says "Windows Integrated Authentication" in Responsibilities, contradicting its own Authentication section (Security-010). |
|
|
| 8 | Code organization & conventions | ☑ | `SecurityOptions` correctly owned by the component; repository interface in Commons. No issues found. |
|
|
| 9 | Testing coverage | ☑ | No tests for `RoleMapper` N+1 behavior, DN-injection inputs, StartTLS path, or idle-timeout-after-refresh. Insecure-config combinations under-tested (Security-011). |
|
|
| 10 | Documentation & comments | ☑ | `SecurityOptions` XML docs say direct bind uses `cn={username}` while the search filter uses `uid=` — comment is misleading (covered under Security-004). |
|
|
|
|
_Re-review (2026-05-28, `1eb6e97`):_
|
|
|
|
| # | Category | Examined | Notes |
|
|
|---|----------|----------|-------|
|
|
| 1 | Correctness & logic bugs | ☑ | `RoleMapper` drops a system-wide Deployment grant when the user is also in any site-scoped Deployment group (Security-016); hard-coded role-name string `"Deployment"` in two separate places allows a refactor to silently break site scoping (Security-018). |
|
|
| 2 | Akka.NET conventions | ☑ | No actors. `AddSecurityActors` is still a registration placeholder. No issues. |
|
|
| 3 | Concurrency & thread safety | ☑ | Services stateless; LDAP sync calls wrapped in `Task.Run` with the now-bounded timeout (Security-009 resolution holds). No issues found. |
|
|
| 4 | Error handling & resilience | ☑ | A service-account-rebind failure inside `AuthenticateAsync` is reported as "Invalid username or password", masking a misconfiguration as a user-credential error (Security-019). LDAP-failure rule + partial-outage path remain correctly enforced post-Security-012. |
|
|
| 5 | Security | ☑ | `SiteScopeRequirement` / `SiteScopeAuthorizationHandler` are dead code — no policy is registered that uses them and no production caller instantiates them, so declarative `[Authorize]` does not enforce site scoping (Security-017, cross-module partner of CentralUI-028). `RequireHttpsCookie=false` dev opt-out has no warning path — a production misconfiguration silently transmits the JWT bearer credential over HTTP (Security-021). |
|
|
| 6 | Performance & resource management | ☑ | Security-008 N+1 remains correctly Deferred (still gated on `ISecurityRepository`). No new perf issues. |
|
|
| 7 | Design-document adherence | ☑ | `RoleMapper`'s drop-system-wide-on-any-scoped behaviour (Security-016) contradicts the design's "A user can hold multiple roles simultaneously … roles are independent — there is no implied hierarchy" rule for the union case; `SiteScopeRequirement` advertises a site-scope authorization pattern the implementation does not actually wire up (Security-017). |
|
|
| 8 | Code organization & conventions | ☑ | Role-name strings are duplicated as magic literals across `RoleMapper.cs`, `SiteScopeAuthorizationHandler.cs`, and `AuthorizationPolicies.cs` — only the audit roles have a single source of truth via `OperationalAuditRoles` / `AuditExportRoles` (Security-018). `SecurityOptions` defaults pass through to runtime with no `IValidateOptions` for required fields like `LdapServer` / `LdapSearchBase` (Security-020). |
|
|
| 9 | Testing coverage | ☑ | No test covers a user mapped to both a system-wide AND a site-scoped Deployment LDAP group (the Security-016 case). No test covers the `SiteScopeRequirement` cross-page integration — tests evaluate the handler in isolation, not the absence of a policy that uses it (Security-017). |
|
|
| 10 | Documentation & comments | ☑ | `SiteScopeAuthorizationHandler` XML doc describes a permission model no caller actually invokes (Security-017). Otherwise stable. |
|
|
|
|
## Findings
|
|
|
|
### Security-001 — StartTLS upgrade path is unreachable dead code
|
|
|
|
| | |
|
|
|--|--|
|
|
| Severity | High |
|
|
| Category | Security |
|
|
| Status | Resolved |
|
|
| Location | `src/ZB.MOM.WW.ScadaBridge.Security/LdapAuthService.cs:37-47` |
|
|
|
|
**Description**
|
|
|
|
When `LdapUseTls` is true the code sets `connection.SecureSocketLayer = true` (LDAPS).
|
|
The subsequent StartTLS block is guarded by `if (_options.LdapUseTls && !connection.SecureSocketLayer)`.
|
|
Because `SecureSocketLayer` was just set to `true`, the second condition `!connection.SecureSocketLayer`
|
|
is always false, so `connection.StartTls()` is never called. The design doc explicitly
|
|
states LDAP connections must use **"LDAPS (port 636) or StartTLS"** — StartTLS is in
|
|
practice unsupported. A deployment that intends to use StartTLS on port 389 would get a
|
|
plaintext LDAPS-mode connection attempt that fails, or worse, an operator may disable
|
|
TLS entirely to make it work, sending credentials in cleartext.
|
|
|
|
**Recommendation**
|
|
|
|
Introduce an explicit transport mode (e.g. `LdapTransport { Ldaps, StartTls, None }`)
|
|
or a separate `LdapUseStartTls` flag. For StartTLS, leave `SecureSocketLayer` false,
|
|
call `connection.Connect`, then call `connection.StartTls()` and verify the negotiated
|
|
session is encrypted before binding. Remove the unreachable conditional.
|
|
|
|
**Resolution**
|
|
|
|
Resolved 2026-05-16 (commit `<pending>`). Added an explicit `LdapTransport` enum
|
|
(`Ldaps`/`StartTls`/`None`); `SecureSocketLayer` is set only for LDAPS, and the
|
|
StartTLS branch now connects in plaintext, calls `StartTls()`, and verifies
|
|
`connection.Tls` before binding. `LdapUseTls` is retained as a compatibility shim
|
|
mapping onto the enum. Regression tests `AuthenticateAsync_StartTlsTransport_AttemptsConnection`
|
|
and `AuthenticateAsync_NoTlsTransport_RejectedWithoutAllowInsecure`.
|
|
|
|
### Security-002 — Authentication cookie is not marked `Secure`
|
|
|
|
| | |
|
|
|--|--|
|
|
| Severity | High |
|
|
| Category | Security |
|
|
| Status | Resolved |
|
|
| Location | `src/ZB.MOM.WW.ScadaBridge.Security/ServiceCollectionExtensions.cs:16-23` |
|
|
|
|
**Description**
|
|
|
|
`AddCookie` sets `HttpOnly = true` and `SameSite = Strict` but never sets
|
|
`options.Cookie.SecurePolicy`. The ASP.NET Core default is `CookieSecurePolicy.SameAsRequest`,
|
|
which permits the cookie (carrying the embedded JWT — a bearer credential) to be sent
|
|
over plain HTTP. The design doc states the cookie is **"HttpOnly and Secure (requires
|
|
HTTPS)"**. As written, the module does not enforce that requirement; a misconfigured or
|
|
HTTP-fronted deployment would transmit the session token in cleartext.
|
|
|
|
**Recommendation**
|
|
|
|
Set `options.Cookie.SecurePolicy = CookieSecurePolicy.Always` in `AddCookie`. Consider
|
|
also setting `ExpireTimeSpan` and `SlidingExpiration` to align the cookie lifetime with
|
|
the documented 15-minute JWT / 30-minute idle policy.
|
|
|
|
**Resolution**
|
|
|
|
Resolved 2026-05-16 (commit `<pending>`). Confirmed the cookie is configured in this
|
|
module (`ServiceCollectionExtensions.AddSecurity`), not CentralUI — no misattribution.
|
|
Added `options.Cookie.SecurePolicy = CookieSecurePolicy.Always` so the JWT-bearing
|
|
cookie is never sent over plain HTTP. Regression test
|
|
`AddSecurity_AuthCookie_IsMarkedSecureAlways`. (`ExpireTimeSpan`/`SlidingExpiration`
|
|
tuning left as a separate, lower-priority improvement.)
|
|
|
|
### Security-003 — JWT signing key length is never validated
|
|
|
|
| | |
|
|
|--|--|
|
|
| Severity | High |
|
|
| Category | Security |
|
|
| Status | Resolved |
|
|
| Location | `src/ZB.MOM.WW.ScadaBridge.Security/JwtTokenService.cs:33`, `src/ZB.MOM.WW.ScadaBridge.Security/SecurityOptions.cs:42` |
|
|
|
|
**Description**
|
|
|
|
`SecurityOptions.JwtSigningKey` defaults to `string.Empty` and is fed directly into
|
|
`new SymmetricSecurityKey(Encoding.UTF8.GetBytes(_options.JwtSigningKey))` with no
|
|
validation. HMAC-SHA256 requires a key of at least 256 bits (32 bytes); a short or empty
|
|
key produces a trivially forgeable token. The `SecurityHardeningTests` comment claims a
|
|
minimum length is "enforced", but no code in this module enforces it — the test only
|
|
asserts that a 32+ char key works. A deployment with a missing or short `JwtSigningKey`
|
|
would start successfully and issue weakly-signed tokens.
|
|
|
|
**Recommendation**
|
|
|
|
Validate `JwtSigningKey` at startup — fail fast if it is empty or shorter than 32 bytes.
|
|
Use an `IValidateOptions<SecurityOptions>` validator or guard in the `JwtTokenService`
|
|
constructor so a weak key is rejected before any token is issued.
|
|
|
|
**Resolution**
|
|
|
|
Resolved 2026-05-16 (commit `<pending>`). The `JwtTokenService` constructor now
|
|
fails fast with an `InvalidOperationException` when `JwtSigningKey` is empty or
|
|
shorter than 32 bytes (`SecurityOptions.MinJwtSigningKeyBytes`), so a weak key is
|
|
rejected before any token is issued. The misleading `SecurityOptions` XML doc was
|
|
corrected to state the requirement. Regression tests
|
|
`JwtTokenService_EmptySigningKey_ThrowsAtConstruction`,
|
|
`JwtTokenService_ShortSigningKey_ThrowsAtConstruction`, and
|
|
`JwtTokenService_AdequateSigningKey_ConstructsSuccessfully`.
|
|
|
|
### Security-004 — Search filter uses `uid=` while fallback DN construction uses `cn=`
|
|
|
|
| | |
|
|
|--|--|
|
|
| Severity | Medium |
|
|
| Category | Correctness & logic bugs |
|
|
| Status | Resolved |
|
|
| Location | `src/ZB.MOM.WW.ScadaBridge.Security/LdapAuthService.cs:66`, `:138`, `:157-159` |
|
|
|
|
**Description**
|
|
|
|
`AuthenticateAsync` and `ResolveUserDnAsync` build the search filter as
|
|
`(uid={username})`, but the no-service-account fallback in `ResolveUserDnAsync`
|
|
constructs the bind DN as `cn={username},{LdapSearchBase}`. The `SecurityOptions.LdapServiceAccountDn`
|
|
XML comment also documents the fallback as `cn={username},{LdapSearchBase}`. A directory
|
|
keyed on `uid` will succeed via search-then-bind but fail via the direct-bind fallback
|
|
(and vice versa). The attribute used for lookup is hard-coded and inconsistent across
|
|
the two code paths, so the two configuration modes are not interchangeable.
|
|
|
|
**Recommendation**
|
|
|
|
Introduce a single configurable `LdapUserIdAttribute` (default `uid`) and use it
|
|
consistently in both the search filter and the fallback DN. Update the XML doc to match.
|
|
|
|
**Resolution**
|
|
|
|
Resolved 2026-05-16 (commit `pending`). Confirmed: the search filter was hard-coded
|
|
`(uid={username})` (both in `AuthenticateAsync` and `ResolveUserDnAsync`) while the
|
|
fallback DN used `cn={username}` — the two auth modes were not interchangeable. Added
|
|
a configurable `SecurityOptions.LdapUserIdAttribute` (default `uid`) used for both the
|
|
search filter and the fallback DN via the new `BuildFallbackUserDn` helper, and
|
|
corrected the `LdapServiceAccountDn` XML doc to reference `{LdapUserIdAttribute}`.
|
|
Regression tests `BuildFallbackUserDn_UsesConfiguredUserIdAttribute`,
|
|
`BuildFallbackUserDn_HonoursNonDefaultUserIdAttribute`,
|
|
`SecurityOptions_LdapUserIdAttribute_DefaultsToUid`.
|
|
|
|
### Security-005 — DN injection in the no-service-account bind fallback
|
|
|
|
| | |
|
|
|--|--|
|
|
| Severity | Medium |
|
|
| Category | Security |
|
|
| Status | Resolved |
|
|
| Location | `src/ZB.MOM.WW.ScadaBridge.Security/LdapAuthService.cs:157-159` |
|
|
|
|
**Description**
|
|
|
|
When no service account is configured, the user-supplied `username` is interpolated
|
|
directly into a distinguished name: `$"cn={username},{LdapSearchBase}"`. `EscapeLdapFilter`
|
|
escapes *search-filter* metacharacters, but DN construction requires a different
|
|
escaping scheme (RFC 4514 — `,`, `+`, `"`, `\`, `<`, `>`, `;`, leading/trailing spaces).
|
|
No DN escaping is applied here. A username such as `victim,ou=admins` alters the DN
|
|
structure, allowing a caller to attempt a bind as a different DN than intended. Combined
|
|
with the `username.Contains('=')` shortcut at line 129 — which lets a caller supply a
|
|
full arbitrary DN — the fallback path gives the client undue control over the bind
|
|
identity.
|
|
|
|
**Recommendation**
|
|
|
|
Apply RFC 4514 DN-component escaping to `username` before interpolation, or use the
|
|
LDAP library's DN-builder API. Reconsider the `Contains('=')` shortcut — accepting a
|
|
raw DN from untrusted input is risky; restrict it or remove it.
|
|
|
|
**Resolution**
|
|
|
|
Resolved 2026-05-16 (commit `pending`). Confirmed: the fallback path interpolated
|
|
the raw `username` straight into `cn={username},{LdapSearchBase}` with no DN escaping,
|
|
and the `username.Contains('=')` shortcut let a caller supply an arbitrary bind DN.
|
|
Added an RFC 4514 `EscapeLdapDn` helper (escapes `, + " \ < > ;`, leading/trailing
|
|
space, leading `#`, NUL) applied in `BuildFallbackUserDn`, so a username such as
|
|
`victim,ou=admins` can no longer alter the DN structure. The `Contains('=')` raw-DN
|
|
shortcut was removed entirely — untrusted input no longer controls the bind identity.
|
|
Regression tests `BuildFallbackUserDn_EscapesDnMetacharacters`,
|
|
`EscapeLdapDn_EscapesAllRfc4514Specials`, `EscapeLdapDn_EscapesLeadingAndTrailingSpaces`.
|
|
|
|
### Security-006 — JWT validation disables issuer and audience checks
|
|
|
|
| | |
|
|
|--|--|
|
|
| Severity | Medium |
|
|
| Category | Security |
|
|
| Status | Resolved |
|
|
| Location | `src/ZB.MOM.WW.ScadaBridge.Security/JwtTokenService.cs:67-75`, `:56-59` |
|
|
|
|
**Description**
|
|
|
|
`ValidateToken` sets `ValidateIssuer = false` and `ValidateAudience = false`, and
|
|
`GenerateToken` never sets an `iss` or `aud`. With a shared symmetric HMAC key, any
|
|
other system or component that signs JWTs with the same key would produce tokens this
|
|
service accepts. While the design states the key is shared only between the two central
|
|
nodes, omitting issuer/audience binding removes a cheap defense-in-depth control and
|
|
makes accidental key reuse (e.g. the same secret used for another internal token)
|
|
silently exploitable.
|
|
|
|
**Recommendation**
|
|
|
|
Set a fixed `Issuer` and `Audience` (e.g. `"scadabridge-central"`) when generating tokens
|
|
and enable `ValidateIssuer`/`ValidateAudience` with the matching expected values during
|
|
validation.
|
|
|
|
**Resolution**
|
|
|
|
Resolved 2026-05-16 (commit `pending`). Confirmed: `GenerateToken` set neither `iss`
|
|
nor `aud` and `ValidateToken` had `ValidateIssuer = false`/`ValidateAudience = false`.
|
|
`GenerateToken` now binds `JwtTokenService.TokenIssuer`/`TokenAudience`
|
|
(both `"scadabridge-central"`) into every token, and `ValidateToken` enables
|
|
`ValidateIssuer`/`ValidateAudience` against those fixed values — a token signed with
|
|
the shared key but a foreign issuer is now rejected. Regression tests
|
|
`GenerateToken_SetsIssuerAndAudience`, `ValidateToken_RejectsTokenWithWrongIssuer`,
|
|
`ValidateToken_AcceptsOwnIssuerAndAudience`.
|
|
|
|
### Security-007 — Idle-timeout claim is reset on every token refresh
|
|
|
|
| | |
|
|
|--|--|
|
|
| Severity | Medium |
|
|
| Category | Correctness & logic bugs |
|
|
| Status | Resolved |
|
|
| Location | `src/ZB.MOM.WW.ScadaBridge.Security/JwtTokenService.cs:40`, `:111-123` |
|
|
|
|
**Description**
|
|
|
|
The design states the 30-minute idle timeout is tracked via a "last-activity timestamp
|
|
in the token", and `IsIdleTimedOut` reads the `LastActivity` claim. But `RefreshToken`
|
|
calls `GenerateToken`, which unconditionally writes `LastActivity = DateTimeOffset.UtcNow`.
|
|
Token refresh fires whenever a request arrives within ~5 minutes of expiry. The result
|
|
is that `LastActivity` reflects *token issuance time*, not genuine user activity — and
|
|
since refresh itself is a request, the timestamp keeps moving forward. A more subtle
|
|
consequence: the idle window is effectively measured from the last refresh, not the
|
|
last real interaction, so the documented "no requests within the idle window" semantics
|
|
are not faithfully implemented. The claim name `LastActivity` is also misleading.
|
|
|
|
**Recommendation**
|
|
|
|
Decide explicitly how activity is tracked. Either (a) carry the original `LastActivity`
|
|
forward across refreshes and update it only on real request handling in the middleware,
|
|
or (b) rename the claim to `IssuedAt`/`TokenCreated` and document that the idle window
|
|
is measured from issuance. Whichever is chosen, ensure `IsIdleTimedOut` and the refresh
|
|
path agree on the semantics.
|
|
|
|
**Resolution**
|
|
|
|
Resolved 2026-05-16 (commit `pending`). Confirmed: `RefreshToken` → `GenerateToken`
|
|
unconditionally wrote `LastActivity = UtcNow`, so the idle clock reset on every
|
|
refresh and the documented 30-minute idle timeout could never fire for a client that
|
|
polls in the background. Implemented option (a) — the Security-side half of the
|
|
documented "15-min sliding + 30-min idle" policy (the cross-module partner of
|
|
CentralUI-005): `GenerateToken` now takes an optional `lastActivity` anchor;
|
|
`RefreshToken` carries the **existing** `LastActivity` claim forward unchanged, and a
|
|
new explicit `RecordActivity` method advances the anchor to now — to be called by the
|
|
CentralUI request pipeline on genuine user interaction (not on a background refresh).
|
|
`IsIdleTimedOut` is unchanged and now agrees with the refresh path. The remaining
|
|
CentralUI-side wiring (calling `RecordActivity` from the middleware, setting
|
|
`SlidingExpiration`/`ExpireTimeSpan`) stays tracked under CentralUI-005; this finding's
|
|
Security-side defect — the reset-on-refresh bug — is fully fixed here. Regression tests
|
|
`RefreshToken_PreservesOriginalLastActivityClaim`,
|
|
`RefreshToken_DoesNotResetIdleTimeoutWhenUserIsActuallyIdle`,
|
|
`RecordActivity_UpdatesLastActivityToNow`.
|
|
|
|
### Security-008 — N+1 query loading site-scope rules in `RoleMapper`
|
|
|
|
| | |
|
|
|--|--|
|
|
| Severity | Low |
|
|
| Category | Performance & resource management |
|
|
| Status | Deferred |
|
|
| Location | `src/ZB.MOM.WW.ScadaBridge.Security/RoleMapper.cs:25-48` |
|
|
|
|
**Description**
|
|
|
|
`MapGroupsToRolesAsync` first calls `GetAllMappingsAsync`, then inside the per-mapping
|
|
loop calls `GetScopeRulesForMappingAsync(mapping.Id, ct)` once for every matched
|
|
Deployment mapping. This is an N+1 query pattern executed on the login hot path and on
|
|
every 15-minute token refresh. With multiple site-scoped Deployment groups it issues a
|
|
round-trip per group.
|
|
|
|
**Recommendation**
|
|
|
|
Add a repository method that loads scope rules for a set of mapping IDs in one query
|
|
(or eager-loads them with the mappings), and resolve all scope rules with a single call.
|
|
|
|
**Resolution**
|
|
|
|
Deferred 2026-05-16 (commit `pending commit`). Finding confirmed accurate: `RoleMapper.MapGroupsToRolesAsync`
|
|
issues one `GetScopeRulesForMappingAsync` round-trip per matched Deployment mapping — a
|
|
genuine N+1 on the login / 15-minute-refresh path. However, the only correct fix
|
|
(a batch `GetScopeRulesForMappingsAsync(IEnumerable<int>)` repository method, or an
|
|
eager-load navigation property) requires changes to `ISecurityRepository`
|
|
(`src/ZB.MOM.WW.ScadaBridge.Commons`) and `SecurityRepository` (`src/ZB.MOM.WW.ScadaBridge.ConfigurationDatabase`).
|
|
Both are outside the Security module's permitted edit scope for this review pass, and the
|
|
existing `ISecurityRepository` surface offers no per-set scope-rule query, so the N+1
|
|
cannot be removed from within `RoleMapper.cs` alone. Severity is Low (bounded by the
|
|
number of site-scoped Deployment groups, typically small). Deferred to a cross-module
|
|
change that adds the batch repository method; `RoleMapper` should then resolve all scope
|
|
rules in a single call.
|
|
|
|
### Security-009 — CancellationToken not honored inside `Task.Run` LDAP calls
|
|
|
|
| | |
|
|
|--|--|
|
|
| Severity | Low |
|
|
| Category | Error handling & resilience |
|
|
| Status | Resolved |
|
|
| Location | `src/ZB.MOM.WW.ScadaBridge.Security/LdapAuthService.cs:42`, `:46`, `:51`, `:56-57`, `:67-73`, `:135`, `:139-145` |
|
|
|
|
**Description**
|
|
|
|
The synchronous Novell LDAP calls are wrapped in `Task.Run(() => ..., ct)`. The `ct`
|
|
argument only prevents the work item from *starting* if cancellation is already
|
|
signaled; once a `connection.Connect`/`Bind`/`Search` call is in progress it cannot be
|
|
cancelled. A cancelled or timed-out login request will continue to occupy a thread-pool
|
|
thread and an LDAP connection until the blocking call returns on its own. There is also
|
|
no explicit network/operation timeout configured on the `LdapConnection`.
|
|
|
|
**Recommendation**
|
|
|
|
Configure `LdapConnection.ConnectionTimeout` and search/operation time limits so a
|
|
hung LDAP server cannot pin a thread indefinitely. Document that `ct` only guards
|
|
work-item scheduling, or implement a timeout-with-disconnect fallback.
|
|
|
|
**Resolution**
|
|
|
|
Resolved 2026-05-16 (commit `pending commit`). Confirmed: the synchronous Novell LDAP
|
|
calls were wrapped in `Task.Run(..., ct)` where `ct` only prevents the work item from
|
|
starting and cannot interrupt an in-progress blocking `Connect`/`Bind`/`Search`, and no
|
|
network/operation timeout was configured on the `LdapConnection`. Added a configurable
|
|
`SecurityOptions.LdapConnectionTimeoutMs` (default 10s) and a new `ApplyConnectionTimeout`
|
|
helper that sets both `LdapConnection.ConnectionTimeout` (socket connect) and
|
|
`LdapConstraints.TimeLimit` (per-operation limit) before connecting, so a hung LDAP
|
|
server can no longer pin a thread-pool thread indefinitely. The `ct`-only-guards-scheduling
|
|
limitation is now documented in the option's XML doc and an inline comment. Regression
|
|
tests `SecurityOptions_LdapConnectionTimeout_HasSaneDefault` and
|
|
`AuthenticateAsync_UnreachableHost_FailsWithinConfiguredTimeout`.
|
|
|
|
### Security-010 — Design doc contradicts itself on Windows Integrated Authentication
|
|
|
|
| | |
|
|
|--|--|
|
|
| Severity | Low |
|
|
| Category | Design-document adherence |
|
|
| Status | Resolved |
|
|
| Location | `docs/requirements/Component-Security.md:13` (vs. `:23`) |
|
|
|
|
**Description**
|
|
|
|
The Responsibilities section states the component authenticates "using Windows
|
|
Integrated Authentication", but the Authentication section (line 23) and CLAUDE.md
|
|
explicitly state **"No Windows Integrated Authentication ... authenticates directly
|
|
against LDAP/AD, not via Kerberos/NTLM"** — which is what the code actually does
|
|
(direct LDAP bind). The Responsibilities line is stale and contradicts both the rest of
|
|
the doc and the implementation.
|
|
|
|
**Recommendation**
|
|
|
|
Fix `Component-Security.md:13` to say "using a direct LDAP/Active Directory bind"
|
|
to match the implemented behavior and the rest of the document.
|
|
|
|
**Resolution**
|
|
|
|
Resolved 2026-05-16 (commit `pending commit`). Confirmed: the Responsibilities bullet at
|
|
line 13 said "using Windows Integrated Authentication", directly contradicting the
|
|
Authentication section's "No Windows Integrated Authentication ... authenticates directly
|
|
against LDAP/AD, not via Kerberos/NTLM", CLAUDE.md, and the implementation (`LdapAuthService`
|
|
performs a direct username/password bind). Documentation-only finding — no regression test
|
|
is meaningful. Reworded the Responsibilities bullet to "Authenticate users against
|
|
LDAP/Active Directory using a direct LDAP/AD bind (username/password)", matching the rest
|
|
of the document and the code.
|
|
|
|
### Security-011 — Missing tests for security-critical paths
|
|
|
|
| | |
|
|
|--|--|
|
|
| Severity | Low |
|
|
| Category | Testing coverage |
|
|
| Status | Resolved |
|
|
| Location | `tests/ZB.MOM.WW.ScadaBridge.Security.Tests/UnitTest1.cs` |
|
|
|
|
**Description**
|
|
|
|
The test suite covers happy paths well but omits several security-relevant cases:
|
|
no test exercises the StartTLS path (Security-001), the DN-injection / `Contains('=')`
|
|
fallback inputs (Security-005), JWT validation with a too-short or empty signing key
|
|
(Security-003), `IsIdleTimedOut` returning true after a token has been refreshed
|
|
(Security-007), or the `uid`/`cn` mismatch in the no-service-account path (Security-004).
|
|
The integration `SecurityHardeningTests` only asserts default option values, not
|
|
enforcement. The test file is still named `UnitTest1.cs`.
|
|
|
|
**Recommendation**
|
|
|
|
Add negative/edge-case tests for the items above, particularly key-length rejection,
|
|
DN-escaping of hostile usernames, and idle-timeout behavior across a refresh. Rename
|
|
`UnitTest1.cs` to a descriptive name.
|
|
|
|
**Resolution**
|
|
|
|
Resolved 2026-05-16 (commit `pending commit`). Re-triage: most of the listed gaps were
|
|
already closed by the Security-001..007 fixes — the regression classes
|
|
`SecurityReviewRegressionTests` (StartTLS path, JWT empty/short-key rejection) and
|
|
`SecurityReviewRegressionTests2` (DN-injection / hostile-username escaping, `uid`/`cn`
|
|
fallback consistency, idle-timeout preserved across refresh) cover them. The finding's
|
|
reference to a `SecurityHardeningTests` class is stale — no such class exists in the
|
|
suite. Remaining gaps closed here: renamed the test file `UnitTest1.cs` →
|
|
`SecurityTests.cs`, and added a new `SecurityReviewRegressionTests3` class with the
|
|
LDAP-timeout coverage (Security-009) plus extra no-service-account / DN-path edge cases
|
|
(`BuildFallbackUserDn_NoSearchBase_ReturnsBareRdn`, `EscapeLdapDn_LeadingHash_IsEscaped`,
|
|
`EscapeLdapDn_NullOrEmpty_ReturnedUnchanged`). Full module suite: 54 tests, all green.
|
|
|
|
### Security-012 — Partial LDAP failure during login yields a roleless authenticated session
|
|
|
|
| | |
|
|
|--|--|
|
|
| Severity | Medium |
|
|
| Category | Error handling & resilience |
|
|
| Status | Resolved |
|
|
| Location | `src/ZB.MOM.WW.ScadaBridge.Security/LdapAuthService.cs:78-118` |
|
|
|
|
**Description**
|
|
|
|
In `AuthenticateAsync`, after the user bind succeeds, the group/attribute `Search` is
|
|
wrapped in a `try`/`catch (LdapException)` that logs a warning and continues — the
|
|
comment states "Auth succeeded even if attribute lookup failed". The method then returns
|
|
`new LdapAuthResult(true, displayName, username, groups, null)` with an **empty
|
|
`groups`** list. Because `RoleMapper.MapGroupsToRolesAsync` derives all roles from that
|
|
group list, a login during a *partial* LDAP outage (bind reachable, search/group server
|
|
unavailable, or a transient `LdapException` mid-search) produces a fully authenticated
|
|
session with **zero roles** — the user is logged in but silently stripped of all
|
|
permissions. The design's LDAP Connection Failure rule states new logins must **fail**
|
|
when LDAP is unavailable; this path instead admits the user. The caller cannot
|
|
distinguish "user genuinely belongs to no mapped groups" from "the group query failed",
|
|
so it cannot make the documented fail-the-login decision. The `while` loop's inner
|
|
`catch (LdapException) { break; }` (line 107-111) compounds this: a real error mid-enumeration
|
|
is indistinguishable from end-of-results and silently truncates the group list.
|
|
|
|
**Recommendation**
|
|
|
|
Treat a failed group/attribute lookup on the *initial login* as an authentication
|
|
failure: return `Success = false` with a "directory unavailable, try again" message, or
|
|
add an explicit flag on `LdapAuthResult` (e.g. `GroupLookupSucceeded`) so the caller can
|
|
apply the design's fail-the-login rule. The inner `while`-loop `catch` should not treat
|
|
an arbitrary `LdapException` as a benign end-of-results sentinel — only the specific
|
|
"no more results" condition should break the loop; any other exception should propagate
|
|
or be surfaced.
|
|
|
|
**Resolution**
|
|
|
|
Resolved 2026-05-17 (commit `pending`). Confirmed: the group/attribute `Search` was
|
|
wrapped in a `catch (LdapException)` that logged a warning and returned
|
|
`new LdapAuthResult(true, …, groups: [])` — a partial LDAP outage produced an
|
|
authenticated, zero-role session — and the inner `while`-loop `catch { break; }` masked
|
|
real mid-enumeration errors as end-of-results. `AuthenticateAsync` now tracks a
|
|
`groupLookupSucceeded` flag (false when any `LdapException` is thrown by the search or
|
|
`Next()`); the inner swallowing catch was removed so such errors propagate; the new
|
|
`BuildAuthResultFromGroupLookup` helper returns a FAILED `LdapAuthResult`
|
|
("directory temporarily unavailable, please try again") on lookup failure while still
|
|
treating a genuine empty-groups result as a successful login. Regression tests
|
|
`BuildAuthResultFromGroupLookup_LookupFailed_FailsTheLogin`,
|
|
`BuildAuthResultFromGroupLookup_LookupSucceededNoGroups_IsAuthenticated`,
|
|
`BuildAuthResultFromGroupLookup_LookupSucceededWithGroups_CarriesGroups`.
|
|
|
|
### Security-013 — `ExtractFirstRdnValue` mis-parses group DNs containing escaped commas
|
|
|
|
| | |
|
|
|--|--|
|
|
| Severity | Low |
|
|
| Category | Correctness & logic bugs |
|
|
| Status | Resolved |
|
|
| Location | `src/ZB.MOM.WW.ScadaBridge.Security/LdapAuthService.cs:258-269` |
|
|
|
|
**Description**
|
|
|
|
`ExtractFirstRdnValue` extracts a group name from a `memberOf` DN by taking the substring
|
|
between the first `=` and the first `,`. RFC 4514 permits a `,` inside an RDN value when
|
|
it is backslash-escaped — e.g. `cn=Acme\, Inc Operators,ou=groups,dc=example,dc=com`. The
|
|
method splits on the *escaped* comma and returns `Acme\` as the group name. `RoleMapper`
|
|
then matches that mangled name verbatim against configured `LdapGroupMapping.LdapGroupName`
|
|
values, so a group whose CN legitimately contains a comma silently fails to map to its
|
|
role — the user loses that role with no error. The same naive parsing also does not strip
|
|
the trailing backslash escape.
|
|
|
|
**Recommendation**
|
|
|
|
Parse the first RDN with an escape-aware scan: ignore a `,` preceded by an unescaped
|
|
`\`, and unescape RFC 4514 sequences in the extracted value. Alternatively use the LDAP
|
|
library's DN-parsing API (`LdapDN` / RDN accessors) rather than hand-rolled `IndexOf`.
|
|
|
|
**Resolution**
|
|
|
|
Resolved 2026-05-17 (commit `pending`). Confirmed: `ExtractFirstRdnValue` took the
|
|
substring between the first `=` and the first `,`, splitting on an RFC 4514
|
|
backslash-escaped comma so a CN like `Acme\, Inc Operators` became `Acme\`. Replaced the
|
|
naive `IndexOf` scan with an escape-aware character scan that ignores a backslash-escaped
|
|
`,`, unescapes single-character and `\XX` hex escape sequences, and only terminates the
|
|
value on an unescaped comma — so a comma-containing group CN is returned intact and
|
|
matches its configured `LdapGroupName`. Method was also made `public` for direct
|
|
testing. Regression tests `ExtractFirstRdnValue_EscapedComma_KeepsWholeGroupName`,
|
|
`ExtractFirstRdnValue_PlainDn_ReturnsFirstRdnValue`,
|
|
`ExtractFirstRdnValue_SingleRdn_ReturnsValue`,
|
|
`ExtractFirstRdnValue_EscapedSpecials_AreUnescaped`.
|
|
|
|
### Security-014 — `RefreshToken` re-issues a token without checking the idle timeout
|
|
|
|
| | |
|
|
|--|--|
|
|
| Severity | Medium |
|
|
| Category | Security |
|
|
| Status | Resolved |
|
|
| Location | `src/ZB.MOM.WW.ScadaBridge.Security/JwtTokenService.cs:156-169` |
|
|
|
|
**Description**
|
|
|
|
`RefreshToken` carries the existing `LastActivity` claim forward (correct, per the
|
|
Security-007 fix) and unconditionally issues a fresh token with a new 15-minute expiry.
|
|
It never calls `IsIdleTimedOut`. The documented 30-minute idle policy therefore depends
|
|
entirely on the CentralUI request pipeline calling `IsIdleTimedOut` *before* calling
|
|
`RefreshToken` — two independent, order-dependent checks with no enforcement that they
|
|
are used together. If a caller refreshes without first checking idle state (an easy
|
|
mistake, and there is no compiler or API signal preventing it), an idle-expired session
|
|
is silently renewed: each refresh resets the JWT expiry while `LastActivity` keeps
|
|
ageing, so the session can be kept alive indefinitely by background refreshes even
|
|
though the user has been idle for hours. `RefreshToken` is the natural place to enforce
|
|
the invariant but provides no safety net.
|
|
|
|
**Recommendation**
|
|
|
|
Have `RefreshToken` itself reject a principal that is already past the idle timeout —
|
|
return `null` (the same "cannot refresh" signal it already uses for missing claims) when
|
|
`IsIdleTimedOut(currentPrincipal)` is true — so the idle policy holds regardless of
|
|
caller discipline. Document that an idle-expired principal cannot be refreshed and add a
|
|
regression test covering refresh of a 31-minute-idle token.
|
|
|
|
**Resolution**
|
|
|
|
Resolved 2026-05-17 (commit `pending`). Confirmed: `RefreshToken` carried `LastActivity`
|
|
forward and minted a fresh 15-minute token without ever calling `IsIdleTimedOut`, so the
|
|
idle policy depended entirely on caller discipline and a background refresh could keep an
|
|
idle-expired session alive indefinitely. `RefreshToken` now calls
|
|
`IsIdleTimedOut(currentPrincipal)` after the missing-claims check and returns `null`
|
|
(its existing "cannot refresh" signal) when the session is past the idle window, so the
|
|
30-minute idle policy holds regardless of caller order; XML doc updated to state the
|
|
invariant. Regression tests `RefreshToken_IdleExpiredPrincipal_ReturnsNull`,
|
|
`RefreshToken_ActiveSessionWithinIdleWindow_StillRefreshes`,
|
|
`RefreshToken_MissingLastActivityClaim_ReturnsNull`.
|
|
|
|
### Security-015 — Username is not trimmed before use in the LDAP filter, fallback DN, and JWT claims
|
|
|
|
| | |
|
|
|--|--|
|
|
| Severity | Low |
|
|
| Category | Correctness & logic bugs |
|
|
| Status | Resolved |
|
|
| Location | `src/ZB.MOM.WW.ScadaBridge.Security/LdapAuthService.cs:20-21`, `:80`, `:122`, `:169`, `:193` |
|
|
|
|
**Description**
|
|
|
|
`AuthenticateAsync` rejects null/whitespace-only usernames via `IsNullOrWhiteSpace`, but
|
|
an otherwise-valid username with leading or trailing whitespace (`" alice"`, copy-paste
|
|
artefacts, mobile keyboards) is passed through verbatim. It flows into the LDAP search
|
|
filter `({attr}={EscapeLdapFilter(username)})`, into the fallback bind DN via
|
|
`BuildFallbackUserDn`, and — most consequentially — into the returned `LdapAuthResult`
|
|
and from there into the JWT `Username` claim. Most LDAP directories ignore surrounding
|
|
whitespace on a search term, so `" alice"` and `"alice"` may both authenticate but yield
|
|
JWT `Username` claims that differ by whitespace. Any downstream identity comparison
|
|
(audit-log `who`, site-scope lookups keyed by username, session de-duplication) then
|
|
sees two distinct identities for the same person.
|
|
|
|
**Recommendation**
|
|
|
|
Trim the username once at the top of `AuthenticateAsync` (after the `IsNullOrWhiteSpace`
|
|
guard) and use the trimmed value consistently for the filter, the DN, and the
|
|
`LdapAuthResult`. This normalises the identity that ends up in the JWT claim and audit
|
|
trail.
|
|
|
|
**Resolution**
|
|
|
|
Resolved 2026-05-17 (commit `pending`). Confirmed: an otherwise-valid username with
|
|
leading/trailing whitespace passed the `IsNullOrWhiteSpace` guard and flowed verbatim
|
|
into the LDAP filter, the fallback DN, and the returned `LdapAuthResult` (and thus the
|
|
JWT `Username` claim). Added a `NormalizeUsername` helper (trims whitespace) called once
|
|
at the top of `AuthenticateAsync` immediately after the guard; the local `username`
|
|
variable is reassigned to the trimmed value so the filter, fallback DN, and result all
|
|
use the single canonical identity. Regression tests
|
|
`NormalizeUsername_TrimsLeadingAndTrailingWhitespace`,
|
|
`BuildFallbackUserDn_TrimmedUsername_NoLeadingTrailingSpace`,
|
|
`AuthenticateAsync_UsernameWithSurroundingWhitespace_StillRejectedForInsecure`.
|
|
|
|
### Security-016 — `RoleMapper` silently drops the system-wide Deployment grant when a user is also in any site-scoped Deployment group
|
|
|
|
| | |
|
|
|--|--|
|
|
| Severity | Medium |
|
|
| Category | Correctness & logic bugs |
|
|
| Status | Resolved |
|
|
| Location | `src/ZB.MOM.WW.ScadaBridge.Security/RoleMapper.cs:30-31`, `:41-55`, `:59` |
|
|
|
|
**Description**
|
|
|
|
`MapGroupsToRolesAsync` resolves the Deployment role's site scope as a single
|
|
`isSystemWide = hasDeploymentRole && !hasDeploymentWithScopeRules` flag computed across
|
|
ALL matched Deployment mappings. If a user is a member of both a system-wide Deployment
|
|
group (e.g. `SCADA-Deploy-All`, no scope rules) AND a site-scoped Deployment group
|
|
(e.g. `SCADA-Deploy-SiteA`, one scope rule for Site A), the second mapping sets
|
|
`hasDeploymentWithScopeRules = true`, so the final `isSystemWide` becomes `false` and
|
|
the returned `PermittedSiteIds` is just `[SiteA]`. The system-wide grant from
|
|
`SCADA-Deploy-All` is silently dropped — the user loses access to every other site, even
|
|
though one of their LDAP groups was intended to grant them system-wide reach. This
|
|
contradicts the design's "A user can hold multiple roles simultaneously … roles are
|
|
independent — there is no implied hierarchy" intent: the union of grants should be the
|
|
broadest grant in the set, not the narrowest. The mistake is also non-obvious to an
|
|
operator: from the Admin → LDAP Mappings page nothing flags that adding a site-scoped
|
|
Deployment mapping for a user already in `SCADA-Deploy-All` *removes* sites from their
|
|
effective grant. The downstream `SiteScopeService.IsSystemWideAsync` / `FilterSitesAsync`
|
|
faithfully reproduce this narrowing, so the user can no longer see or act on sites
|
|
outside `[SiteA]`.
|
|
|
|
**Recommendation**
|
|
|
|
Track the union semantics explicitly: if any matched Deployment mapping has no scope
|
|
rules, the user is system-wide regardless of what other mappings have. The simplest
|
|
change is to set `hasDeploymentWithScopeRules` only when the mapping has scope rules
|
|
AND another flag `hasUnscopedDeploymentMapping` is false; then compute
|
|
`isSystemWide = hasUnscopedDeploymentMapping || (hasDeploymentRole && !hasDeploymentWithScopeRules)`.
|
|
Equivalently: collect per-mapping `(hasRules, scopedSiteIds)` first, then
|
|
`isSystemWide = any mapping has hasRules==false`, and `permittedSiteIds = union of all
|
|
scopedSiteIds` (left empty for system-wide users). Add a regression test
|
|
`MapGroupsToRoles_UserInBothSystemWideAndScopedDeploymentGroup_IsSystemWide` covering
|
|
the design's example pair `SCADA-Deploy-All` + `SCADA-Deploy-SiteA`.
|
|
|
|
**Resolution**
|
|
|
|
Resolved 2026-05-28 (commit pending). `RoleMapper.MapGroupsToRolesAsync` now tracks two
|
|
independent flags per matched Deployment mapping: `hasUnscopedDeploymentMapping` (any
|
|
matched mapping with no scope rules) and `hasScopedDeploymentMapping` (any matched
|
|
mapping with scope rules). `isSystemWide` is `hasUnscopedDeploymentMapping ||
|
|
(hasDeploymentRole && !hasScopedDeploymentMapping)` — so a user in both
|
|
`SCADA-Deploy-All` and `SCADA-Deploy-SiteA` is now correctly system-wide, with the
|
|
accumulated scope ids cleared. Magic string `"Deployment"` was replaced with the new
|
|
`Roles.Deployment` constant (Security-018). Regression test
|
|
`MapGroupsToRoles_UserInBothSystemWideAndScopedDeploymentGroup_IsSystemWide`
|
|
(seeds Site A, attaches a scope rule to the seeded `SCADA-Deploy-SiteA` mapping, and
|
|
asserts a user mapped via both `SCADA-Deploy-All` and `SCADA-Deploy-SiteA` resolves
|
|
to system-wide with empty `PermittedSiteIds`) fails on the pre-fix code and passes
|
|
after.
|
|
|
|
### Security-017 — `SiteScopeRequirement` / `SiteScopeAuthorizationHandler` are dead code from production callers — `[Authorize(Policy = RequireDeployment)]` does NOT enforce site scoping
|
|
|
|
| | |
|
|
|--|--|
|
|
| Severity | Medium |
|
|
| Category | Design-document adherence |
|
|
| Status | Resolved |
|
|
| Location | `src/ZB.MOM.WW.ScadaBridge.Security/SiteScopeAuthorizationHandler.cs:8-58`; `src/ZB.MOM.WW.ScadaBridge.Security/AuthorizationPolicies.cs:113-143` |
|
|
|
|
**Description**
|
|
|
|
The module declares `SiteScopeRequirement` (an `IAuthorizationRequirement` carrying a
|
|
`TargetSiteId`) and the matching `SiteScopeAuthorizationHandler` that combines the
|
|
Deployment role claim with the `SiteId` claims to enforce the design's site-scoping
|
|
rule. The handler is registered in `AddScadaBridgeAuthorization`
|
|
(`services.AddSingleton<IAuthorizationHandler, SiteScopeAuthorizationHandler>()`). But
|
|
no `AddPolicy` call ever wires the requirement to a named policy, and a grep across
|
|
`src/ZB.MOM.WW.ScadaBridge.CentralUI` and `src/ZB.MOM.WW.ScadaBridge.ManagementService` confirms that **no
|
|
production code ever instantiates `new SiteScopeRequirement(...)` or calls
|
|
`AuthorizeAsync(...)` with one** — the only callers are the unit tests in
|
|
`SecurityTests.cs:1146,1166,1185,1203`. The design + CLAUDE.md state that "Deployment
|
|
and Monitoring pages must filter every site/instance list through `FilterSitesAsync`
|
|
and re-check `IsSiteAllowedAsync` before any cross-site command", and the
|
|
CentralUI-028 finding (High, Open) confirms this is exactly the contract two new
|
|
report pages forgot — because there is no declarative `[Authorize(Policy = ...)]`
|
|
shortcut, callers must remember to inject `SiteScopeService` and write the check by
|
|
hand, and any new page that forgets is a silent regression with no compile-time or
|
|
test-time signal. The module's published surface advertises an authorization-handler
|
|
pattern that is, in practice, unwired plumbing.
|
|
|
|
**Recommendation**
|
|
|
|
Either (a) **delete** `SiteScopeRequirement` and `SiteScopeAuthorizationHandler` (and
|
|
the dead `IAuthorizationHandler` registration) and document `SiteScopeService` as the
|
|
sole site-scoping mechanism — this is the smaller change and matches what the codebase
|
|
actually does today; or, preferably, (b) **finish the wiring**: add a `RequireSiteScope`
|
|
policy that uses `SiteScopeRequirement` and provide a small helper / source generator
|
|
or analyzer that flags Deployment-policy-attributed pages without a site-scope check.
|
|
Either way, address the cross-module gap: CentralUI-028 stays open until production
|
|
pages reliably enforce the rule. If (b) is chosen, a route-parameter-aware
|
|
`IAuthorizationPolicyProvider` is needed so the policy can read the target site id from
|
|
the request — that is a meaningful design extension and would need to be planned
|
|
alongside the Central UI's existing `SiteScopeService` usage rather than replacing it
|
|
piecemeal.
|
|
|
|
**Resolution**
|
|
|
|
Resolved 2026-05-28 (commit pending) per recommendation option (a): deleted
|
|
`SiteScopeRequirement` and `SiteScopeAuthorizationHandler` outright, along with the
|
|
unwired `services.AddSingleton<IAuthorizationHandler, SiteScopeAuthorizationHandler>()`
|
|
registration in `AuthorizationPolicies.AddScadaBridgeAuthorization` and the four
|
|
isolation tests in `SecurityTests.cs`. `SiteScopeService` (CentralUI-002) is now
|
|
documented as the sole site-scoping mechanism — the stale "mirrors
|
|
SiteScopeAuthorizationHandler" comment in `SiteScopeService.ResolveAsync` was rewritten
|
|
to say so. The cross-module partner CentralUI-028 is fixed in the same batch (the two
|
|
new report pages now consume `SiteScopeService`).
|
|
|
|
### Security-018 — Role names are hard-coded magic strings duplicated across `RoleMapper`, `SiteScopeAuthorizationHandler`, and `AuthorizationPolicies`
|
|
|
|
| | |
|
|
|--|--|
|
|
| Severity | Low |
|
|
| Category | Code organization & conventions |
|
|
| Status | Resolved |
|
|
| Location | `src/ZB.MOM.WW.ScadaBridge.Security/RoleMapper.cs:41`; `src/ZB.MOM.WW.ScadaBridge.Security/SiteScopeAuthorizationHandler.cs:36`; `src/ZB.MOM.WW.ScadaBridge.Security/AuthorizationPolicies.cs:118,121,124,95,107` |
|
|
|
|
**Description**
|
|
|
|
The role-name literals `"Admin"`, `"Design"`, `"Deployment"`, `"Audit"`, and
|
|
`"AuditReadOnly"` are duplicated as magic strings across three separate files:
|
|
`RoleMapper.cs:41` hard-codes `"Deployment"` to detect the site-scope branch;
|
|
`SiteScopeAuthorizationHandler.cs:36` independently hard-codes `"Deployment"` to gate
|
|
the handler; and `AuthorizationPolicies.cs:118,121,124` hard-code the four role names
|
|
as the policy `RequireClaim` values. Only the audit roles have a single source of truth
|
|
(via the `OperationalAuditRoles` / `AuditExportRoles` arrays on
|
|
`AuthorizationPolicies`). A future rename or addition of a role that misses any one of
|
|
these call sites silently breaks the system: e.g. renaming "Deployment" → "Deployer"
|
|
in `RoleMapper` alone would leave the policy still requiring `"Deployment"` (logins
|
|
get the new role name but the policy never matches), while changing it in the policy
|
|
alone would leave `RoleMapper` failing to populate scope rules for the renamed role.
|
|
The bug class is "string drift" — exactly the kind the `OperationalAuditRoles` constant
|
|
was introduced to prevent.
|
|
|
|
**Recommendation**
|
|
|
|
Introduce a `public static class Roles { public const string Admin = "Admin"; public const
|
|
string Design = "Design"; public const string Deployment = "Deployment"; public const string
|
|
Audit = "Audit"; public const string AuditReadOnly = "AuditReadOnly"; }` in the Security
|
|
project and replace every magic-string occurrence — including the elements of
|
|
`OperationalAuditRoles` and `AuditExportRoles` — with the constants. A single rename
|
|
will then either succeed everywhere or fail to compile.
|
|
|
|
**Resolution**
|
|
|
|
Resolved 2026-05-28 (commit pending). Added `src/ZB.MOM.WW.ScadaBridge.Security/Roles.cs` holding
|
|
`Admin`/`Design`/`Deployment`/`Audit`/`AuditReadOnly` as `public const string`
|
|
fields. Replaced every magic-string occurrence in this module:
|
|
`RoleMapper.MapGroupsToRolesAsync` now compares against `Roles.Deployment`;
|
|
`AuthorizationPolicies.AddScadaBridgeAuthorization` binds `RequireClaim(...)` to
|
|
`Roles.{Admin,Design,Deployment}`; `OperationalAuditRoles` /
|
|
`AuditExportRoles` are now built from `Roles.Admin`, `Roles.Audit`, `Roles.AuditReadOnly`.
|
|
`SiteScopeAuthorizationHandler.cs` was deleted under Security-017, so its
|
|
`"Deployment"` literal is gone with it. A future rename now propagates by a
|
|
single edit or fails to compile.
|
|
|
|
### Security-019 — Service-account rebind failure is reported as "Invalid username or password" — masks misconfiguration as a user-credential error
|
|
|
|
| | |
|
|
|--|--|
|
|
| Severity | Low |
|
|
| Category | Error handling & resilience |
|
|
| Status | Resolved |
|
|
| Location | `src/ZB.MOM.WW.ScadaBridge.Security/LdapAuthService.cs:85-89`, `:147-151` |
|
|
|
|
**Description**
|
|
|
|
After the user's credentials bind successfully, `AuthenticateAsync` re-binds as the
|
|
configured service account to perform the group/attribute search
|
|
(`connection.Bind(_options.LdapServiceAccountDn, _options.LdapServiceAccountPassword)`).
|
|
A failure of this second bind — wrong service-account password, deleted/disabled
|
|
service-account, locked-out service-account — throws `LdapException` which is caught by
|
|
the broad outer `catch (LdapException)` and returned as
|
|
`new LdapAuthResult(false, null, username, null, "Invalid username or password.")`.
|
|
The user sees an "invalid credentials" message for *their* credentials even though
|
|
their bind succeeded and the failure was in the system's own service-account
|
|
configuration. Worse, every user attempting to log in sees the same incorrect message
|
|
during a service-account outage, which routes operators down the wrong incident path
|
|
(reset the user's password) instead of the right one (check the service-account
|
|
credentials). The successful user bind itself is also not auditable as a discrete
|
|
event because the result is "Invalid username or password" — indistinguishable from a
|
|
genuine bad-password attempt.
|
|
|
|
**Recommendation**
|
|
|
|
Wrap the service-account rebind in its own `try`/`catch (LdapException)` and surface a
|
|
distinct error: log `_logger.LogError(ex, "Service-account rebind failed; check
|
|
LdapServiceAccountDn / LdapServiceAccountPassword configuration")` and return
|
|
`new LdapAuthResult(false, null, username, null, "Authentication service is misconfigured. Contact an administrator.")`.
|
|
Add a regression test that exercises the service-account-bind failure path (a mocked
|
|
or seamed `LdapConnection.Bind` that throws on the second call) and asserts the
|
|
distinct error message.
|
|
|
|
**Resolution**
|
|
|
|
Resolved 2026-05-28 (commit pending). Added a new `ServiceAccountBindException`
|
|
(`src/ZB.MOM.WW.ScadaBridge.Security/ServiceAccountBindException.cs`) — deliberately NOT an
|
|
`LdapException` subtype so it short-circuits the generic LDAP catch — and a private
|
|
`BindServiceAccountAsync` helper on `LdapAuthService` that wraps both service-account
|
|
rebind sites (the post-user-bind group-lookup rebind AND the `ResolveUserDnAsync`
|
|
DN-search rebind). On `LdapException`, the helper logs Error
|
|
("Service-account rebind failed; check LdapServiceAccountDn /
|
|
LdapServiceAccountPassword configuration") and rethrows as `ServiceAccountBindException`,
|
|
which the outer `AuthenticateAsync` catch chain maps to the distinct user-facing message
|
|
"Authentication service is misconfigured. Contact an administrator." Service-account
|
|
faults no longer surface as "Invalid username or password". Regression test
|
|
`ServiceAccountBindException_DoesNotInheritLdapException_SoCatchOrderIsCorrect` pins the
|
|
load-bearing inheritance contract; a full end-to-end auth test that exercises the
|
|
service-account-bind failure path is not feasible without an LDAP seam in
|
|
`LdapAuthService` (the `LdapConnection` is constructed in-method), so the structural test
|
|
is the closest meaningful unit-level coverage.
|
|
|
|
### Security-020 — `SecurityOptions` has no startup validation for required fields (`LdapServer`, `LdapSearchBase`)
|
|
|
|
| | |
|
|
|--|--|
|
|
| Severity | Low |
|
|
| Category | Error handling & resilience |
|
|
| Status | Resolved |
|
|
| Location | `src/ZB.MOM.WW.ScadaBridge.Security/SecurityOptions.cs:6-7`, `:36-37`; `src/ZB.MOM.WW.ScadaBridge.Security/ServiceCollectionExtensions.cs:13-30` |
|
|
|
|
**Resolution (2026-05-28):** added `SecurityOptionsValidator`
|
|
(`IValidateOptions<SecurityOptions>`) that rejects empty/whitespace
|
|
`LdapServer` and `LdapSearchBase` with messages naming the full
|
|
`Security:Field` key the operator would edit. `AddSecurity` registers it via
|
|
`services.AddOptions<SecurityOptions>().ValidateOnStart()` +
|
|
`TryAddEnumerable(... SecurityOptionsValidator)` so a misconfigured
|
|
`Security` section fails fast at boot rather than minutes later on the first
|
|
login. `JwtSigningKey` is deliberately left to `JwtTokenService`'s existing
|
|
length-aware constructor guard (Security-003). Regression tests in
|
|
`SecurityOptionsValidatorTests`: valid-options succeed; empty/whitespace
|
|
`LdapServer` and `LdapSearchBase` each fail with the key-naming message
|
|
(theory); both-empty reports both keys; `AddSecurity_RegistersSecurityOptionsValidator`
|
|
pins the DI wiring.
|
|
|
|
**Description**
|
|
|
|
`SecurityOptions.JwtSigningKey` correctly fails fast at `JwtTokenService` construction
|
|
(Security-003 fix), but the LDAP-side required fields — `LdapServer` (default
|
|
`string.Empty`) and `LdapSearchBase` (default `string.Empty`) — have no equivalent
|
|
guard. `AddSecurity` does not register an `IValidateOptions<SecurityOptions>`. A
|
|
deployment that fails to set `LdapServer` (a typo in the appsettings.json section name,
|
|
a missing environment-variable substitution, a misconfigured Docker compose file)
|
|
starts cleanly — the Central UI comes up, the login page loads, and only the first
|
|
authentication attempt fails with `LdapConnection.Connect("")` throwing a low-level
|
|
exception that bubbles up as the generic "An unexpected error occurred during
|
|
authentication." message. The misconfiguration surfaces minutes or hours into the
|
|
deploy, on the first real user login, rather than at startup where it is cheap to
|
|
diagnose.
|
|
|
|
**Recommendation**
|
|
|
|
Add an `IValidateOptions<SecurityOptions>` registered via
|
|
`services.AddOptions<SecurityOptions>().ValidateOnStart()` that fails when
|
|
`LdapServer` is null/whitespace, `LdapSearchBase` is null/whitespace, or
|
|
`LdapPort <= 0`. Combine with the existing `JwtTokenService` constructor check so
|
|
every required `SecurityOptions` field is enforced at startup, not at first use.
|
|
|
|
### Security-021 — `RequireHttpsCookie=false` dev opt-out has no warning path — an HTTP production deployment silently transmits the JWT bearer credential in cleartext
|
|
|
|
| | |
|
|
|--|--|
|
|
| Severity | Low |
|
|
| Category | Security |
|
|
| Status | Resolved |
|
|
| Location | `src/ZB.MOM.WW.ScadaBridge.Security/SecurityOptions.cs:100-108`; `src/ZB.MOM.WW.ScadaBridge.Security/ServiceCollectionExtensions.cs:54-59` |
|
|
|
|
**Resolution (2026-05-28):** Added `ILoggerFactory` to the cookie-options
|
|
`Configure` callback in `AddSecurity` so an explicit `RequireHttpsCookie=false`
|
|
opt-out now emits a startup `LogWarning` ("auth cookie SecurePolicy is
|
|
SameAsRequest. The cookie-embedded JWT will be transmitted in cleartext
|
|
over plain HTTP. This setting is intended for local dev only — set
|
|
SecurityOptions:RequireHttpsCookie=true in production."). Default is still
|
|
`true`, so production deployments unchanged. The "fail startup when
|
|
RequireHttpsCookie=false AND ASPNETCORE_ENVIRONMENT=Production" hard-stop
|
|
option was not implemented (the dev Docker cluster intentionally runs with
|
|
the flag false, and the env-var name varies across deploy mechanisms);
|
|
the warning is the right ergonomic floor.
|
|
|
|
**Description**
|
|
|
|
The Security-002 fix added `RequireHttpsCookie` (default `true`) so the auth cookie's
|
|
`SecurePolicy` is `Always` in production. The current Docker dev cluster sets
|
|
`RequireHttpsCookie=false` in both central nodes' `appsettings.Central.json`, downgrading
|
|
to `SameAsRequest` so the local HTTP cluster works. The downgrade is documented in the
|
|
XML doc but is silent at runtime: no log line warns that the cookie carrying the JWT
|
|
bearer credential is being sent over an HTTP-only path. A production deployment that
|
|
inherits a dev-derived appsettings — or that copy-pastes the docker config and forgets
|
|
to flip the flag — transmits the session token in cleartext with no diagnostic signal.
|
|
The default is correct; the gap is that the unsafe override has no operational guard.
|
|
|
|
**Recommendation**
|
|
|
|
In the `PostConfigure` block in `AddSecurity`, when `RequireHttpsCookie == false`, log
|
|
a single startup warning along the lines of `_logger.LogWarning("RequireHttpsCookie is
|
|
DISABLED — auth cookie SecurePolicy is SameAsRequest. The cookie-embedded JWT will be
|
|
transmitted over plain HTTP. This setting is intended for local dev only — set
|
|
SecurityOptions:RequireHttpsCookie=true in production.")`. Optionally, also fail
|
|
startup when `RequireHttpsCookie=false` AND `ASPNETCORE_ENVIRONMENT=Production`. Add a
|
|
regression test that asserts the warning is emitted when the flag is disabled and not
|
|
when it is enabled.
|
|
|
|
### Security-022 — `/auth/token` mints a bearer JWT that no consumer validates; the entire `JwtTokenService` validate/refresh surface is dead code
|
|
|
|
| | |
|
|
|--|--|
|
|
| Severity | Medium |
|
|
| Category | Design-document adherence |
|
|
| Status | Deferred |
|
|
| Location | `src/ZB.MOM.WW.ScadaBridge.Security/JwtTokenService.cs:149`, `:194`, `:209`, `:239`, `:272`; `src/ZB.MOM.WW.ScadaBridge.CentralUI/Auth/AuthEndpoints.cs:99-148` |
|
|
|
|
**Description**
|
|
|
|
The design (Component-Security.md "Cookie + JWT Hybrid", CLAUDE.md "cookie+JWT hybrid …
|
|
HMAC-SHA256, 15-minute expiry w/ sliding refresh") presents a cookie-embedded JWT as the
|
|
session model. At HEAD this is no longer how the interactive session works (the M2.19 note
|
|
acknowledges the cookie now carries **bare claims** policed by `CookieSessionValidator`),
|
|
and — more importantly — the JWT half is now an **orphaned, unconsumed credential**. A grep
|
|
across `src/` shows `JwtTokenService.ValidateToken`, `RefreshToken`, `RecordActivity`, and
|
|
`ShouldRefresh` have **zero production callers**; only `GenerateToken` is called, solely by
|
|
`POST /auth/token` (`AuthEndpoints.cs:134`). That endpoint returns an `access_token` to the
|
|
caller, but **no endpoint in the codebase accepts or validates that JWT** as a credential:
|
|
the CLI authenticates to the Management API with HTTP Basic Auth
|
|
(`CLI/ManagementHttpClient.cs:36`), the Inbound API authenticates with the external
|
|
`ZB.MOM.WW.Auth.ApiKeys` `sbk_<keyId>_<secret>` verifier, and the cookie pipeline uses
|
|
`SessionClaimBuilder` bare claims (never `ValidateToken`). The result: `/auth/token` issues
|
|
HMAC-signed bearer tokens that grant access to nothing, and ~120 lines of carefully-hardened
|
|
issuer/audience/clock-skew/idle/refresh logic (the subject of resolved findings
|
|
Security-006/007/014) are dead code. The hazard is twofold: (a) the design doc materially
|
|
overstates the live session mechanism, misleading any future reader/auditor; and (b) an
|
|
orphaned token mint is a latent footgun — a future endpoint added under the assumption that
|
|
`/auth/token` produces a *usable* credential would wire a second, parallel auth path with
|
|
no test coverage proving it actually validates.
|
|
|
|
**Recommendation**
|
|
|
|
Decide the intent explicitly. Either (a) **wire the bearer path**: add
|
|
`AddAuthentication().AddJwtBearer(...)` (or a custom handler delegating to
|
|
`JwtTokenService.ValidateToken`) on whatever surface is meant to accept the token, and add
|
|
an end-to-end test that a `/auth/token`-issued JWT authorizes a protected request — then the
|
|
hybrid model the doc describes is real; or (b) **remove the dead code**: delete
|
|
`/auth/token` and `JwtTokenService.ValidateToken`/`RefreshToken`/`RecordActivity`/
|
|
`ShouldRefresh`/`IsIdleTimedOut` (keep only what the bare-claim cookie path needs, which is
|
|
none of `JwtTokenService`), and rewrite Component-Security.md's "Cookie + JWT Hybrid" /
|
|
"Token Lifecycle" sections to describe the actual bare-claim cookie + `CookieSessionValidator`
|
|
mechanism as the primary model rather than a footnote. Whichever is chosen, the design doc
|
|
and the code must agree on whether a usable JWT exists.
|
|
|
|
**Resolution**
|
|
|
|
Deferred 2026-06-20: this needs a product decision — either wire a JWT-bearer auth handler so the documented cookie+JWT hybrid is real, or delete `/auth/token` + the orphaned `JwtTokenService` validate/refresh surface and rewrite the design doc to describe the actual bare-claim cookie model. Recorded for that decision.
|
|
|
|
### Security-023 — `LdapGroupMapping.Role` is an unvalidated free string; case-sensitive UI policy vs case-insensitive server authz grants inconsistent access
|
|
|
|
| | |
|
|
|--|--|
|
|
| Severity | Medium |
|
|
| Category | Correctness & logic bugs |
|
|
| Status | Resolved |
|
|
| Location | `src/ZB.MOM.WW.ScadaBridge.Security/AuthorizationPolicies.cs:144-157`; `src/ZB.MOM.WW.ScadaBridge.Security/RoleMapper.cs:40-42`; `src/ZB.MOM.WW.ScadaBridge.Commons/Entities/Security/LdapGroupMapping.cs:10`; `src/ZB.MOM.WW.ScadaBridge.ManagementService/ManagementActor.cs:106`, `:1893` |
|
|
|
|
**Description**
|
|
|
|
The role string an operator maps an LDAP group to (`LdapGroupMapping.Role`) is a free
|
|
`string` with no validation against the canonical `Roles.*` set anywhere on the write path:
|
|
`ManagementActor` (`:1893`) does `new LdapGroupMapping(cmd.LdapGroupName, cmd.Role)` verbatim,
|
|
and the CLI `security role-mapping create --role <value>`
|
|
(`CLI/Commands/SecurityCommands.cs:157`) accepts any string. The Central UI dropdown
|
|
(`LdapMappingForm.razor`) constrains the *UI* path to the canonical constants, but the CLI/API
|
|
path does not. Two enforcement surfaces then disagree on casing:
|
|
|
|
- **Central UI authorization** uses `policy.RequireClaim(JwtTokenService.RoleClaimType, Roles.Deployer)`
|
|
(`AuthorizationPolicies.cs:150`). ASP.NET Core's `RequireClaim` compares claim *values*
|
|
with ordinal (case-**sensitive**) equality.
|
|
- **ManagementActor authorization** uses `user.Roles.Contains(requiredRole, StringComparer.OrdinalIgnoreCase)`
|
|
(`ManagementActor.cs:106`) — case-**insensitive**.
|
|
- **`RoleMapper`** matches the Deployer scope branch with `StringComparison.OrdinalIgnoreCase`
|
|
(`RoleMapper.cs:42`) but stamps the **DB row's verbatim casing** into the role claim.
|
|
|
|
So a mapping created via CLI/API as `--role deployer` (lowercase) or a typo like `Deploy`
|
|
produces a principal whose role claim is `"deployer"`: the **ManagementActor/CLI surface
|
|
authorizes it** (case-insensitive), the **Central UI `RequireDeployment` policy denies it**
|
|
(case-sensitive), and for the Deployer case `RoleMapper` still resolves site scope (matching
|
|
case-insensitively) — a user who can deploy via CLI but is locked out of the equivalent UI
|
|
pages, or vice versa, with no error explaining why. A pure typo (`"Deploer"`) silently grants
|
|
a role claim that *no* policy and no ManagementActor check matches — the user appears to hold
|
|
a role yet is denied everywhere, with no validation feedback at mapping-creation time.
|
|
|
|
**Recommendation**
|
|
|
|
Validate `cmd.Role` against the canonical `Roles.All` set (case-insensitively, normalising to
|
|
the canonical casing) when creating/updating a mapping — in `ManagementActor` (the single
|
|
server-side write path) and reject non-canonical values with a clear error; optionally also
|
|
guard in the CLI for a fast local failure. Separately, make the two authorization surfaces
|
|
agree on casing: either normalise the role claim to canonical casing in `SessionClaimBuilder`/
|
|
`RoleMapper` before it is stamped, or make the ManagementActor check case-sensitive to match
|
|
`RequireClaim` (the safer direction once values are validated). Add a regression test for a
|
|
mis-cased mapping role.
|
|
|
|
**Resolution**
|
|
|
|
Resolved 2026-06-20 (commit `fd618cf1`): added membership validation rejecting any `cmd.Role` not in `Roles.All` (canonical set, case-insensitive) at LDAP-group-mapping create AND update, before any DB write — closing the unvalidated-free-string gap. The casing/comparison-asymmetry half is a larger change and remains a recorded follow-up (membership check is safe: a non-canonical role never worked). No existing test regressed.
|
|
|
|
### Security-024 — `RequireOperator` / `RequireVerifier` policies have no functional authorization test
|
|
|
|
| | |
|
|
|--|--|
|
|
| Severity | Medium |
|
|
| Category | Testing coverage |
|
|
| Status | Resolved |
|
|
| Location | `src/ZB.MOM.WW.ScadaBridge.Security/AuthorizationPolicies.cs:153-157`; `tests/ZB.MOM.WW.ScadaBridge.Security.Tests/SecurityTests.cs:1045-1186` (AuthorizationPolicyTests); `tests/ZB.MOM.WW.ScadaBridge.Security.Tests/RolesTests.cs:44-45` |
|
|
|
|
**Description**
|
|
|
|
The M7 two-person Secured Write feature added the `RequireOperator` and `RequireVerifier`
|
|
policies (`AuthorizationPolicies.cs:153-157`), each a single-role `RequireClaim`. These gate a
|
|
SCADA control-surface write workflow whose entire safety argument is separation of duties, so
|
|
correct grant/deny behaviour is security-critical. But `AuthorizationPolicyTests` — which
|
|
exercises every *other* policy with real `IAuthorizationService.AuthorizeAsync` evaluations
|
|
(Admin/Design/Deployment/OperationalAudit/AuditExport, including the load-bearing
|
|
"Viewer reads but cannot export" SoD case) — has **no** Operator/Verifier test. The only
|
|
coverage is `RolesTests.cs:44-45`, which asserts the *constant string values*
|
|
(`"RequireOperator"`, `"RequireVerifier"`) — not that an `Operator` principal satisfies
|
|
`RequireOperator`, that a `Verifier` does not, or that the two policies are mutually distinct.
|
|
A regression that, say, mapped `RequireOperator` to `Roles.Verifier` (or to
|
|
`OperationalAuditRoles`) would compile and pass the existing tests while silently collapsing
|
|
the separation of duties.
|
|
|
|
**Recommendation**
|
|
|
|
Add `AuthorizeAsync`-based `[Theory]` cases to `AuthorizationPolicyTests` mirroring the
|
|
existing policy tests: `RequireOperator` succeeds for `Roles.Operator` and fails for
|
|
`Roles.Verifier`/`Administrator`/empty; `RequireVerifier` succeeds for `Roles.Verifier` and
|
|
fails for `Roles.Operator`; and a combined case asserting an Operator-only principal cannot
|
|
satisfy `RequireVerifier` (the SoD invariant at the policy layer).
|
|
|
|
**Resolution**
|
|
|
|
Resolved 2026-06-20 (commit `fd618cf1`): added functional `AuthorizeAsync` grant/deny tests for `RequireOperator`/`RequireVerifier` (and a combined SoD-distinctness test), guarding the separation-of-duties policy wiring against regression.
|
|
|
|
### Security-025 — Unit test performs a live network connection to `nonexistent.invalid:9999`
|
|
|
|
| | |
|
|
|--|--|
|
|
| Severity | Low |
|
|
| Category | Testing coverage |
|
|
| Status | Resolved |
|
|
| Location | `tests/ZB.MOM.WW.ScadaBridge.Security.Tests/SecurityTests.cs:71-89` (`AuthenticateAsync_ConnectionFailure_FailsClosed_NeverThrows`) |
|
|
|
|
**Description**
|
|
|
|
`LdapAuthServiceTests.AuthenticateAsync_ConnectionFailure_FailsClosed_NeverThrows` constructs
|
|
the external library's `LdapAuthService` pointed at `Server = "nonexistent.invalid"`,
|
|
`Port = 9999` with a 2-second `ConnectionTimeoutMs`, then asserts the result fails closed.
|
|
Although the test relies on the connection *failing*, it still performs a real DNS resolution
|
|
and TCP connect attempt from the unit-test process. In an offline or network-sandboxed CI
|
|
environment the resolution/connect can behave differently (immediate failure vs. the full
|
|
2-second timeout vs. a captive-portal redirect), making the test both **slow** (up to ~2s of
|
|
wall-clock dead time) and **environment-dependent**. Unit tests in this suite are otherwise
|
|
network-free (the sibling cases are explicitly named `_NoNetwork`). The behaviour under test
|
|
(fail-closed mapping of an unreachable directory) is really the external library's
|
|
responsibility and is, per the file's own header comment, already covered by the library's own
|
|
test suite through its internal connection seam.
|
|
|
|
**Recommendation**
|
|
|
|
Drop this case from the ScadaBridge unit suite (the library owns the fail-closed contract via
|
|
its internal `ILdapConnection` seam, as the region comment notes), or move it to an explicitly
|
|
opt-in integration-test category so it does not run in the default network-free unit pass. If
|
|
kept as a smoke test, point it at a guaranteed-unroutable loopback address (e.g.
|
|
`127.0.0.1` on a closed port) to bound the failure deterministically rather than relying on
|
|
DNS resolution of `.invalid`.
|
|
|
|
**Resolution**
|
|
|
|
Resolved 2026-06-20 (commit `fd618cf1`): the fail-closed connection test now targets `127.0.0.1` on a closed port instead of `nonexistent.invalid:9999` — deterministic loopback connection-refused, no external DNS/timeout dependency. Same assertion preserved.
|