# Code Review — Security | Field | Value | |-------|-------| | Module | `src/ZB.MOM.WW.ScadaBridge.Security` | | Design doc | `docs/requirements/Component-Security.md` | | Status | Reviewed | | Last reviewed | 2026-05-28 | | Reviewer | claude-agent | | Commit reviewed | `1eb6e97` | | Open findings | 0 (1 deferred — Security-008) | ## Summary The Security module is small and reasonably structured: a stateless `LdapAuthService` for search-then-bind authentication, a `JwtTokenService` for HMAC-signed cookie tokens, a `RoleMapper` that resolves LDAP groups to roles, and ASP.NET Core authorization policies plus a site-scope handler. Unit-test coverage of the happy paths is decent. However, the review surfaced several real security weaknesses, the most serious being that **StartTLS is dead code** (the design's "LDAPS or StartTLS" requirement is only half met), that **the authentication cookie is not marked `Secure`** despite the design mandating it, and that **the JWT signing key is never length-validated** so a weak or empty key is silently accepted. There is also a genuine **DN-injection** gap in the no-service-account fallback path, a filter/DN attribute mismatch (`uid=` vs `cn=`) that makes that fallback path internally inconsistent, and an N+1 query in `RoleMapper`. JWT validation also disables issuer/audience checks and the idle-timeout claim is reset on every refresh, weakening the documented 30-minute idle policy. None of these are crash/data-loss bugs, but the TLS, cookie, and key-validation items are security defects that should be fixed before any production deployment. #### Re-review 2026-05-17 (commit `39d737e`) Re-reviewed the module after the batch-resolution of Security-001..007, 009, 010, 011. Those fixes were verified in place and remain Resolved: the `LdapTransport` enum makes StartTLS reachable, the auth cookie is `SecurePolicy.Always` with a sliding window, the JWT signing key is length-validated at construction, issuer/audience are bound and checked, the DN-injection fallback is RFC 4514-escaped, and the idle-timeout anchor is carried across refreshes. Security-008 (N+1 scope-rule query) remains correctly **Deferred** — `ISecurityRepository` still exposes no per-set scope-rule query, so the N+1 cannot be removed from within `RoleMapper.cs` alone; the deferral condition is unchanged. This pass surfaced **4 new findings** (Security-012..015): two Medium and two Low. The most notable is that a *partial* LDAP outage during login (bind succeeds, group search fails) silently returns an authenticated session with **zero roles** instead of failing the login as the design's LDAP-failure rule requires (Security-012); and that `RefreshToken` re-issues a token without ever checking `IsIdleTimedOut`, so an idle-expired session can be renewed indefinitely if the caller omits the separate idle check (Security-014). The two Low findings concern fragile DN parsing of group names containing escaped commas and an un-trimmed username flowing into the LDAP filter, fallback DN, and JWT claims. #### Re-review 2026-05-28 (commit `1eb6e97`) Re-reviewed the module on a fresh baseline. All Security-001..007, 009..015 fixes remain in place; the only Open carry-over is Security-008 (still correctly **Deferred** — `ISecurityRepository` still exposes no per-set scope-rule query, so the N+1 in `RoleMapper` cannot be removed from within this module). The original Security-014 fix is now load-bearing: `RefreshToken` calls `IsIdleTimedOut` before re-issuing, and the new cookie sliding-expiry tests in `SecurityReviewRegressionTests` pin CentralUI-005's Security-side contract. This pass surfaced **6 new findings** (Security-016..021): one Medium correctness/security defect, one Medium design-adherence defect, and four Low. The most consequential is **Security-016** — when a user is mapped to *both* a system-wide Deployment LDAP group (e.g. `SCADA-Deploy-All`) and a site-scoped Deployment LDAP group (e.g. `SCADA-Deploy-SiteA`), `RoleMapper` silently treats the union as site-scoped (the system-wide grant is dropped); the design's "multiple groups grant multiple independent roles" intent is not honoured for this mix-and-match case. **Security-017** is the cross-module partner of CentralUI-028: `SiteScopeRequirement` / `SiteScopeAuthorizationHandler` are declared and registered but no production caller ever instantiates them — `[Authorize(Policy = RequireDeployment)]` *does not* enforce the documented site scoping, callers must remember to inject `SiteScopeService` and re-check `IsSiteAllowedAsync` themselves (which the two new report pages flagged by CentralUI-028 forgot to do). The remaining Lows are: role names are magic strings duplicated across `RoleMapper`, `SiteScopeAuthorizationHandler`, and `AuthorizationPolicies` (Security-018); a service-account-rebind failure is reported to the user as "Invalid username or password" — masking a misconfiguration as a user-credential error (Security-019); required `SecurityOptions` fields (`LdapServer`, `LdapSearchBase`) have no `IValidateOptions` startup check, so empty values silently surface only on first login (Security-020); and the `RequireHttpsCookie=false` dev opt-out emits no warning, so an HTTP production deployment silently transmits the JWT bearer credential in cleartext (Security-021). ## Checklist coverage | # | Category | Examined | Notes | |---|----------|----------|-------| | 1 | Correctness & logic bugs | ☑ | `uid=`/`cn=` attribute mismatch between search filter and fallback DN construction (Security-004); StartTLS branch is unreachable (Security-001). | | 2 | Akka.NET conventions | ☑ | No actors in this module — `AddSecurityActors` is an empty placeholder. Nothing to assess. | | 3 | Concurrency & thread safety | ☑ | Services are stateless and DI-scoped; LDAP sync calls wrapped in `Task.Run`. No shared mutable state. No issues found. | | 4 | Error handling & resilience | ☑ | LDAP failure paths return structured `LdapAuthResult`; group-lookup failure is tolerated per design. `ct` not honored inside `Task.Run` bodies (Security-009). | | 5 | Security | ☑ | StartTLS dead code (Security-001), cookie not `Secure` (Security-002), JWT key unvalidated (Security-003), DN injection (Security-005), no issuer/audience validation (Security-006), idle-timeout reset on refresh (Security-007). | | 6 | Performance & resource management | ☑ | N+1 scope-rule query in `RoleMapper` (Security-008). `LdapConnection` correctly disposed via `using`. | | 7 | Design-document adherence | ☑ | StartTLS unsupported and Secure cookie missing both contradict the design doc; design also says "Windows Integrated Authentication" in Responsibilities, contradicting its own Authentication section (Security-010). | | 8 | Code organization & conventions | ☑ | `SecurityOptions` correctly owned by the component; repository interface in Commons. No issues found. | | 9 | Testing coverage | ☑ | No tests for `RoleMapper` N+1 behavior, DN-injection inputs, StartTLS path, or idle-timeout-after-refresh. Insecure-config combinations under-tested (Security-011). | | 10 | Documentation & comments | ☑ | `SecurityOptions` XML docs say direct bind uses `cn={username}` while the search filter uses `uid=` — comment is misleading (covered under Security-004). | _Re-review (2026-05-28, `1eb6e97`):_ | # | Category | Examined | Notes | |---|----------|----------|-------| | 1 | Correctness & logic bugs | ☑ | `RoleMapper` drops a system-wide Deployment grant when the user is also in any site-scoped Deployment group (Security-016); hard-coded role-name string `"Deployment"` in two separate places allows a refactor to silently break site scoping (Security-018). | | 2 | Akka.NET conventions | ☑ | No actors. `AddSecurityActors` is still a registration placeholder. No issues. | | 3 | Concurrency & thread safety | ☑ | Services stateless; LDAP sync calls wrapped in `Task.Run` with the now-bounded timeout (Security-009 resolution holds). No issues found. | | 4 | Error handling & resilience | ☑ | A service-account-rebind failure inside `AuthenticateAsync` is reported as "Invalid username or password", masking a misconfiguration as a user-credential error (Security-019). LDAP-failure rule + partial-outage path remain correctly enforced post-Security-012. | | 5 | Security | ☑ | `SiteScopeRequirement` / `SiteScopeAuthorizationHandler` are dead code — no policy is registered that uses them and no production caller instantiates them, so declarative `[Authorize]` does not enforce site scoping (Security-017, cross-module partner of CentralUI-028). `RequireHttpsCookie=false` dev opt-out has no warning path — a production misconfiguration silently transmits the JWT bearer credential over HTTP (Security-021). | | 6 | Performance & resource management | ☑ | Security-008 N+1 remains correctly Deferred (still gated on `ISecurityRepository`). No new perf issues. | | 7 | Design-document adherence | ☑ | `RoleMapper`'s drop-system-wide-on-any-scoped behaviour (Security-016) contradicts the design's "A user can hold multiple roles simultaneously … roles are independent — there is no implied hierarchy" rule for the union case; `SiteScopeRequirement` advertises a site-scope authorization pattern the implementation does not actually wire up (Security-017). | | 8 | Code organization & conventions | ☑ | Role-name strings are duplicated as magic literals across `RoleMapper.cs`, `SiteScopeAuthorizationHandler.cs`, and `AuthorizationPolicies.cs` — only the audit roles have a single source of truth via `OperationalAuditRoles` / `AuditExportRoles` (Security-018). `SecurityOptions` defaults pass through to runtime with no `IValidateOptions` for required fields like `LdapServer` / `LdapSearchBase` (Security-020). | | 9 | Testing coverage | ☑ | No test covers a user mapped to both a system-wide AND a site-scoped Deployment LDAP group (the Security-016 case). No test covers the `SiteScopeRequirement` cross-page integration — tests evaluate the handler in isolation, not the absence of a policy that uses it (Security-017). | | 10 | Documentation & comments | ☑ | `SiteScopeAuthorizationHandler` XML doc describes a permission model no caller actually invokes (Security-017). Otherwise stable. | ## Findings ### Security-001 — StartTLS upgrade path is unreachable dead code | | | |--|--| | Severity | High | | Category | Security | | Status | Resolved | | Location | `src/ZB.MOM.WW.ScadaBridge.Security/LdapAuthService.cs:37-47` | **Description** When `LdapUseTls` is true the code sets `connection.SecureSocketLayer = true` (LDAPS). The subsequent StartTLS block is guarded by `if (_options.LdapUseTls && !connection.SecureSocketLayer)`. Because `SecureSocketLayer` was just set to `true`, the second condition `!connection.SecureSocketLayer` is always false, so `connection.StartTls()` is never called. The design doc explicitly states LDAP connections must use **"LDAPS (port 636) or StartTLS"** — StartTLS is in practice unsupported. A deployment that intends to use StartTLS on port 389 would get a plaintext LDAPS-mode connection attempt that fails, or worse, an operator may disable TLS entirely to make it work, sending credentials in cleartext. **Recommendation** Introduce an explicit transport mode (e.g. `LdapTransport { Ldaps, StartTls, None }`) or a separate `LdapUseStartTls` flag. For StartTLS, leave `SecureSocketLayer` false, call `connection.Connect`, then call `connection.StartTls()` and verify the negotiated session is encrypted before binding. Remove the unreachable conditional. **Resolution** Resolved 2026-05-16 (commit ``). Added an explicit `LdapTransport` enum (`Ldaps`/`StartTls`/`None`); `SecureSocketLayer` is set only for LDAPS, and the StartTLS branch now connects in plaintext, calls `StartTls()`, and verifies `connection.Tls` before binding. `LdapUseTls` is retained as a compatibility shim mapping onto the enum. Regression tests `AuthenticateAsync_StartTlsTransport_AttemptsConnection` and `AuthenticateAsync_NoTlsTransport_RejectedWithoutAllowInsecure`. ### Security-002 — Authentication cookie is not marked `Secure` | | | |--|--| | Severity | High | | Category | Security | | Status | Resolved | | Location | `src/ZB.MOM.WW.ScadaBridge.Security/ServiceCollectionExtensions.cs:16-23` | **Description** `AddCookie` sets `HttpOnly = true` and `SameSite = Strict` but never sets `options.Cookie.SecurePolicy`. The ASP.NET Core default is `CookieSecurePolicy.SameAsRequest`, which permits the cookie (carrying the embedded JWT — a bearer credential) to be sent over plain HTTP. The design doc states the cookie is **"HttpOnly and Secure (requires HTTPS)"**. As written, the module does not enforce that requirement; a misconfigured or HTTP-fronted deployment would transmit the session token in cleartext. **Recommendation** Set `options.Cookie.SecurePolicy = CookieSecurePolicy.Always` in `AddCookie`. Consider also setting `ExpireTimeSpan` and `SlidingExpiration` to align the cookie lifetime with the documented 15-minute JWT / 30-minute idle policy. **Resolution** Resolved 2026-05-16 (commit ``). Confirmed the cookie is configured in this module (`ServiceCollectionExtensions.AddSecurity`), not CentralUI — no misattribution. Added `options.Cookie.SecurePolicy = CookieSecurePolicy.Always` so the JWT-bearing cookie is never sent over plain HTTP. Regression test `AddSecurity_AuthCookie_IsMarkedSecureAlways`. (`ExpireTimeSpan`/`SlidingExpiration` tuning left as a separate, lower-priority improvement.) ### Security-003 — JWT signing key length is never validated | | | |--|--| | Severity | High | | Category | Security | | Status | Resolved | | Location | `src/ZB.MOM.WW.ScadaBridge.Security/JwtTokenService.cs:33`, `src/ZB.MOM.WW.ScadaBridge.Security/SecurityOptions.cs:42` | **Description** `SecurityOptions.JwtSigningKey` defaults to `string.Empty` and is fed directly into `new SymmetricSecurityKey(Encoding.UTF8.GetBytes(_options.JwtSigningKey))` with no validation. HMAC-SHA256 requires a key of at least 256 bits (32 bytes); a short or empty key produces a trivially forgeable token. The `SecurityHardeningTests` comment claims a minimum length is "enforced", but no code in this module enforces it — the test only asserts that a 32+ char key works. A deployment with a missing or short `JwtSigningKey` would start successfully and issue weakly-signed tokens. **Recommendation** Validate `JwtSigningKey` at startup — fail fast if it is empty or shorter than 32 bytes. Use an `IValidateOptions` validator or guard in the `JwtTokenService` constructor so a weak key is rejected before any token is issued. **Resolution** Resolved 2026-05-16 (commit ``). The `JwtTokenService` constructor now fails fast with an `InvalidOperationException` when `JwtSigningKey` is empty or shorter than 32 bytes (`SecurityOptions.MinJwtSigningKeyBytes`), so a weak key is rejected before any token is issued. The misleading `SecurityOptions` XML doc was corrected to state the requirement. Regression tests `JwtTokenService_EmptySigningKey_ThrowsAtConstruction`, `JwtTokenService_ShortSigningKey_ThrowsAtConstruction`, and `JwtTokenService_AdequateSigningKey_ConstructsSuccessfully`. ### Security-004 — Search filter uses `uid=` while fallback DN construction uses `cn=` | | | |--|--| | Severity | Medium | | Category | Correctness & logic bugs | | Status | Resolved | | Location | `src/ZB.MOM.WW.ScadaBridge.Security/LdapAuthService.cs:66`, `:138`, `:157-159` | **Description** `AuthenticateAsync` and `ResolveUserDnAsync` build the search filter as `(uid={username})`, but the no-service-account fallback in `ResolveUserDnAsync` constructs the bind DN as `cn={username},{LdapSearchBase}`. The `SecurityOptions.LdapServiceAccountDn` XML comment also documents the fallback as `cn={username},{LdapSearchBase}`. A directory keyed on `uid` will succeed via search-then-bind but fail via the direct-bind fallback (and vice versa). The attribute used for lookup is hard-coded and inconsistent across the two code paths, so the two configuration modes are not interchangeable. **Recommendation** Introduce a single configurable `LdapUserIdAttribute` (default `uid`) and use it consistently in both the search filter and the fallback DN. Update the XML doc to match. **Resolution** Resolved 2026-05-16 (commit `pending`). Confirmed: the search filter was hard-coded `(uid={username})` (both in `AuthenticateAsync` and `ResolveUserDnAsync`) while the fallback DN used `cn={username}` — the two auth modes were not interchangeable. Added a configurable `SecurityOptions.LdapUserIdAttribute` (default `uid`) used for both the search filter and the fallback DN via the new `BuildFallbackUserDn` helper, and corrected the `LdapServiceAccountDn` XML doc to reference `{LdapUserIdAttribute}`. Regression tests `BuildFallbackUserDn_UsesConfiguredUserIdAttribute`, `BuildFallbackUserDn_HonoursNonDefaultUserIdAttribute`, `SecurityOptions_LdapUserIdAttribute_DefaultsToUid`. ### Security-005 — DN injection in the no-service-account bind fallback | | | |--|--| | Severity | Medium | | Category | Security | | Status | Resolved | | Location | `src/ZB.MOM.WW.ScadaBridge.Security/LdapAuthService.cs:157-159` | **Description** When no service account is configured, the user-supplied `username` is interpolated directly into a distinguished name: `$"cn={username},{LdapSearchBase}"`. `EscapeLdapFilter` escapes *search-filter* metacharacters, but DN construction requires a different escaping scheme (RFC 4514 — `,`, `+`, `"`, `\`, `<`, `>`, `;`, leading/trailing spaces). No DN escaping is applied here. A username such as `victim,ou=admins` alters the DN structure, allowing a caller to attempt a bind as a different DN than intended. Combined with the `username.Contains('=')` shortcut at line 129 — which lets a caller supply a full arbitrary DN — the fallback path gives the client undue control over the bind identity. **Recommendation** Apply RFC 4514 DN-component escaping to `username` before interpolation, or use the LDAP library's DN-builder API. Reconsider the `Contains('=')` shortcut — accepting a raw DN from untrusted input is risky; restrict it or remove it. **Resolution** Resolved 2026-05-16 (commit `pending`). Confirmed: the fallback path interpolated the raw `username` straight into `cn={username},{LdapSearchBase}` with no DN escaping, and the `username.Contains('=')` shortcut let a caller supply an arbitrary bind DN. Added an RFC 4514 `EscapeLdapDn` helper (escapes `, + " \ < > ;`, leading/trailing space, leading `#`, NUL) applied in `BuildFallbackUserDn`, so a username such as `victim,ou=admins` can no longer alter the DN structure. The `Contains('=')` raw-DN shortcut was removed entirely — untrusted input no longer controls the bind identity. Regression tests `BuildFallbackUserDn_EscapesDnMetacharacters`, `EscapeLdapDn_EscapesAllRfc4514Specials`, `EscapeLdapDn_EscapesLeadingAndTrailingSpaces`. ### Security-006 — JWT validation disables issuer and audience checks | | | |--|--| | Severity | Medium | | Category | Security | | Status | Resolved | | Location | `src/ZB.MOM.WW.ScadaBridge.Security/JwtTokenService.cs:67-75`, `:56-59` | **Description** `ValidateToken` sets `ValidateIssuer = false` and `ValidateAudience = false`, and `GenerateToken` never sets an `iss` or `aud`. With a shared symmetric HMAC key, any other system or component that signs JWTs with the same key would produce tokens this service accepts. While the design states the key is shared only between the two central nodes, omitting issuer/audience binding removes a cheap defense-in-depth control and makes accidental key reuse (e.g. the same secret used for another internal token) silently exploitable. **Recommendation** Set a fixed `Issuer` and `Audience` (e.g. `"scadabridge-central"`) when generating tokens and enable `ValidateIssuer`/`ValidateAudience` with the matching expected values during validation. **Resolution** Resolved 2026-05-16 (commit `pending`). Confirmed: `GenerateToken` set neither `iss` nor `aud` and `ValidateToken` had `ValidateIssuer = false`/`ValidateAudience = false`. `GenerateToken` now binds `JwtTokenService.TokenIssuer`/`TokenAudience` (both `"scadabridge-central"`) into every token, and `ValidateToken` enables `ValidateIssuer`/`ValidateAudience` against those fixed values — a token signed with the shared key but a foreign issuer is now rejected. Regression tests `GenerateToken_SetsIssuerAndAudience`, `ValidateToken_RejectsTokenWithWrongIssuer`, `ValidateToken_AcceptsOwnIssuerAndAudience`. ### Security-007 — Idle-timeout claim is reset on every token refresh | | | |--|--| | Severity | Medium | | Category | Correctness & logic bugs | | Status | Resolved | | Location | `src/ZB.MOM.WW.ScadaBridge.Security/JwtTokenService.cs:40`, `:111-123` | **Description** The design states the 30-minute idle timeout is tracked via a "last-activity timestamp in the token", and `IsIdleTimedOut` reads the `LastActivity` claim. But `RefreshToken` calls `GenerateToken`, which unconditionally writes `LastActivity = DateTimeOffset.UtcNow`. Token refresh fires whenever a request arrives within ~5 minutes of expiry. The result is that `LastActivity` reflects *token issuance time*, not genuine user activity — and since refresh itself is a request, the timestamp keeps moving forward. A more subtle consequence: the idle window is effectively measured from the last refresh, not the last real interaction, so the documented "no requests within the idle window" semantics are not faithfully implemented. The claim name `LastActivity` is also misleading. **Recommendation** Decide explicitly how activity is tracked. Either (a) carry the original `LastActivity` forward across refreshes and update it only on real request handling in the middleware, or (b) rename the claim to `IssuedAt`/`TokenCreated` and document that the idle window is measured from issuance. Whichever is chosen, ensure `IsIdleTimedOut` and the refresh path agree on the semantics. **Resolution** Resolved 2026-05-16 (commit `pending`). Confirmed: `RefreshToken` → `GenerateToken` unconditionally wrote `LastActivity = UtcNow`, so the idle clock reset on every refresh and the documented 30-minute idle timeout could never fire for a client that polls in the background. Implemented option (a) — the Security-side half of the documented "15-min sliding + 30-min idle" policy (the cross-module partner of CentralUI-005): `GenerateToken` now takes an optional `lastActivity` anchor; `RefreshToken` carries the **existing** `LastActivity` claim forward unchanged, and a new explicit `RecordActivity` method advances the anchor to now — to be called by the CentralUI request pipeline on genuine user interaction (not on a background refresh). `IsIdleTimedOut` is unchanged and now agrees with the refresh path. The remaining CentralUI-side wiring (calling `RecordActivity` from the middleware, setting `SlidingExpiration`/`ExpireTimeSpan`) stays tracked under CentralUI-005; this finding's Security-side defect — the reset-on-refresh bug — is fully fixed here. Regression tests `RefreshToken_PreservesOriginalLastActivityClaim`, `RefreshToken_DoesNotResetIdleTimeoutWhenUserIsActuallyIdle`, `RecordActivity_UpdatesLastActivityToNow`. ### Security-008 — N+1 query loading site-scope rules in `RoleMapper` | | | |--|--| | Severity | Low | | Category | Performance & resource management | | Status | Deferred | | Location | `src/ZB.MOM.WW.ScadaBridge.Security/RoleMapper.cs:25-48` | **Description** `MapGroupsToRolesAsync` first calls `GetAllMappingsAsync`, then inside the per-mapping loop calls `GetScopeRulesForMappingAsync(mapping.Id, ct)` once for every matched Deployment mapping. This is an N+1 query pattern executed on the login hot path and on every 15-minute token refresh. With multiple site-scoped Deployment groups it issues a round-trip per group. **Recommendation** Add a repository method that loads scope rules for a set of mapping IDs in one query (or eager-loads them with the mappings), and resolve all scope rules with a single call. **Resolution** Deferred 2026-05-16 (commit `pending commit`). Finding confirmed accurate: `RoleMapper.MapGroupsToRolesAsync` issues one `GetScopeRulesForMappingAsync` round-trip per matched Deployment mapping — a genuine N+1 on the login / 15-minute-refresh path. However, the only correct fix (a batch `GetScopeRulesForMappingsAsync(IEnumerable)` repository method, or an eager-load navigation property) requires changes to `ISecurityRepository` (`src/ZB.MOM.WW.ScadaBridge.Commons`) and `SecurityRepository` (`src/ZB.MOM.WW.ScadaBridge.ConfigurationDatabase`). Both are outside the Security module's permitted edit scope for this review pass, and the existing `ISecurityRepository` surface offers no per-set scope-rule query, so the N+1 cannot be removed from within `RoleMapper.cs` alone. Severity is Low (bounded by the number of site-scoped Deployment groups, typically small). Deferred to a cross-module change that adds the batch repository method; `RoleMapper` should then resolve all scope rules in a single call. ### Security-009 — CancellationToken not honored inside `Task.Run` LDAP calls | | | |--|--| | Severity | Low | | Category | Error handling & resilience | | Status | Resolved | | Location | `src/ZB.MOM.WW.ScadaBridge.Security/LdapAuthService.cs:42`, `:46`, `:51`, `:56-57`, `:67-73`, `:135`, `:139-145` | **Description** The synchronous Novell LDAP calls are wrapped in `Task.Run(() => ..., ct)`. The `ct` argument only prevents the work item from *starting* if cancellation is already signaled; once a `connection.Connect`/`Bind`/`Search` call is in progress it cannot be cancelled. A cancelled or timed-out login request will continue to occupy a thread-pool thread and an LDAP connection until the blocking call returns on its own. There is also no explicit network/operation timeout configured on the `LdapConnection`. **Recommendation** Configure `LdapConnection.ConnectionTimeout` and search/operation time limits so a hung LDAP server cannot pin a thread indefinitely. Document that `ct` only guards work-item scheduling, or implement a timeout-with-disconnect fallback. **Resolution** Resolved 2026-05-16 (commit `pending commit`). Confirmed: the synchronous Novell LDAP calls were wrapped in `Task.Run(..., ct)` where `ct` only prevents the work item from starting and cannot interrupt an in-progress blocking `Connect`/`Bind`/`Search`, and no network/operation timeout was configured on the `LdapConnection`. Added a configurable `SecurityOptions.LdapConnectionTimeoutMs` (default 10s) and a new `ApplyConnectionTimeout` helper that sets both `LdapConnection.ConnectionTimeout` (socket connect) and `LdapConstraints.TimeLimit` (per-operation limit) before connecting, so a hung LDAP server can no longer pin a thread-pool thread indefinitely. The `ct`-only-guards-scheduling limitation is now documented in the option's XML doc and an inline comment. Regression tests `SecurityOptions_LdapConnectionTimeout_HasSaneDefault` and `AuthenticateAsync_UnreachableHost_FailsWithinConfiguredTimeout`. ### Security-010 — Design doc contradicts itself on Windows Integrated Authentication | | | |--|--| | Severity | Low | | Category | Design-document adherence | | Status | Resolved | | Location | `docs/requirements/Component-Security.md:13` (vs. `:23`) | **Description** The Responsibilities section states the component authenticates "using Windows Integrated Authentication", but the Authentication section (line 23) and CLAUDE.md explicitly state **"No Windows Integrated Authentication ... authenticates directly against LDAP/AD, not via Kerberos/NTLM"** — which is what the code actually does (direct LDAP bind). The Responsibilities line is stale and contradicts both the rest of the doc and the implementation. **Recommendation** Fix `Component-Security.md:13` to say "using a direct LDAP/Active Directory bind" to match the implemented behavior and the rest of the document. **Resolution** Resolved 2026-05-16 (commit `pending commit`). Confirmed: the Responsibilities bullet at line 13 said "using Windows Integrated Authentication", directly contradicting the Authentication section's "No Windows Integrated Authentication ... authenticates directly against LDAP/AD, not via Kerberos/NTLM", CLAUDE.md, and the implementation (`LdapAuthService` performs a direct username/password bind). Documentation-only finding — no regression test is meaningful. Reworded the Responsibilities bullet to "Authenticate users against LDAP/Active Directory using a direct LDAP/AD bind (username/password)", matching the rest of the document and the code. ### Security-011 — Missing tests for security-critical paths | | | |--|--| | Severity | Low | | Category | Testing coverage | | Status | Resolved | | Location | `tests/ZB.MOM.WW.ScadaBridge.Security.Tests/UnitTest1.cs` | **Description** The test suite covers happy paths well but omits several security-relevant cases: no test exercises the StartTLS path (Security-001), the DN-injection / `Contains('=')` fallback inputs (Security-005), JWT validation with a too-short or empty signing key (Security-003), `IsIdleTimedOut` returning true after a token has been refreshed (Security-007), or the `uid`/`cn` mismatch in the no-service-account path (Security-004). The integration `SecurityHardeningTests` only asserts default option values, not enforcement. The test file is still named `UnitTest1.cs`. **Recommendation** Add negative/edge-case tests for the items above, particularly key-length rejection, DN-escaping of hostile usernames, and idle-timeout behavior across a refresh. Rename `UnitTest1.cs` to a descriptive name. **Resolution** Resolved 2026-05-16 (commit `pending commit`). Re-triage: most of the listed gaps were already closed by the Security-001..007 fixes — the regression classes `SecurityReviewRegressionTests` (StartTLS path, JWT empty/short-key rejection) and `SecurityReviewRegressionTests2` (DN-injection / hostile-username escaping, `uid`/`cn` fallback consistency, idle-timeout preserved across refresh) cover them. The finding's reference to a `SecurityHardeningTests` class is stale — no such class exists in the suite. Remaining gaps closed here: renamed the test file `UnitTest1.cs` → `SecurityTests.cs`, and added a new `SecurityReviewRegressionTests3` class with the LDAP-timeout coverage (Security-009) plus extra no-service-account / DN-path edge cases (`BuildFallbackUserDn_NoSearchBase_ReturnsBareRdn`, `EscapeLdapDn_LeadingHash_IsEscaped`, `EscapeLdapDn_NullOrEmpty_ReturnedUnchanged`). Full module suite: 54 tests, all green. ### Security-012 — Partial LDAP failure during login yields a roleless authenticated session | | | |--|--| | Severity | Medium | | Category | Error handling & resilience | | Status | Resolved | | Location | `src/ZB.MOM.WW.ScadaBridge.Security/LdapAuthService.cs:78-118` | **Description** In `AuthenticateAsync`, after the user bind succeeds, the group/attribute `Search` is wrapped in a `try`/`catch (LdapException)` that logs a warning and continues — the comment states "Auth succeeded even if attribute lookup failed". The method then returns `new LdapAuthResult(true, displayName, username, groups, null)` with an **empty `groups`** list. Because `RoleMapper.MapGroupsToRolesAsync` derives all roles from that group list, a login during a *partial* LDAP outage (bind reachable, search/group server unavailable, or a transient `LdapException` mid-search) produces a fully authenticated session with **zero roles** — the user is logged in but silently stripped of all permissions. The design's LDAP Connection Failure rule states new logins must **fail** when LDAP is unavailable; this path instead admits the user. The caller cannot distinguish "user genuinely belongs to no mapped groups" from "the group query failed", so it cannot make the documented fail-the-login decision. The `while` loop's inner `catch (LdapException) { break; }` (line 107-111) compounds this: a real error mid-enumeration is indistinguishable from end-of-results and silently truncates the group list. **Recommendation** Treat a failed group/attribute lookup on the *initial login* as an authentication failure: return `Success = false` with a "directory unavailable, try again" message, or add an explicit flag on `LdapAuthResult` (e.g. `GroupLookupSucceeded`) so the caller can apply the design's fail-the-login rule. The inner `while`-loop `catch` should not treat an arbitrary `LdapException` as a benign end-of-results sentinel — only the specific "no more results" condition should break the loop; any other exception should propagate or be surfaced. **Resolution** Resolved 2026-05-17 (commit `pending`). Confirmed: the group/attribute `Search` was wrapped in a `catch (LdapException)` that logged a warning and returned `new LdapAuthResult(true, …, groups: [])` — a partial LDAP outage produced an authenticated, zero-role session — and the inner `while`-loop `catch { break; }` masked real mid-enumeration errors as end-of-results. `AuthenticateAsync` now tracks a `groupLookupSucceeded` flag (false when any `LdapException` is thrown by the search or `Next()`); the inner swallowing catch was removed so such errors propagate; the new `BuildAuthResultFromGroupLookup` helper returns a FAILED `LdapAuthResult` ("directory temporarily unavailable, please try again") on lookup failure while still treating a genuine empty-groups result as a successful login. Regression tests `BuildAuthResultFromGroupLookup_LookupFailed_FailsTheLogin`, `BuildAuthResultFromGroupLookup_LookupSucceededNoGroups_IsAuthenticated`, `BuildAuthResultFromGroupLookup_LookupSucceededWithGroups_CarriesGroups`. ### Security-013 — `ExtractFirstRdnValue` mis-parses group DNs containing escaped commas | | | |--|--| | Severity | Low | | Category | Correctness & logic bugs | | Status | Resolved | | Location | `src/ZB.MOM.WW.ScadaBridge.Security/LdapAuthService.cs:258-269` | **Description** `ExtractFirstRdnValue` extracts a group name from a `memberOf` DN by taking the substring between the first `=` and the first `,`. RFC 4514 permits a `,` inside an RDN value when it is backslash-escaped — e.g. `cn=Acme\, Inc Operators,ou=groups,dc=example,dc=com`. The method splits on the *escaped* comma and returns `Acme\` as the group name. `RoleMapper` then matches that mangled name verbatim against configured `LdapGroupMapping.LdapGroupName` values, so a group whose CN legitimately contains a comma silently fails to map to its role — the user loses that role with no error. The same naive parsing also does not strip the trailing backslash escape. **Recommendation** Parse the first RDN with an escape-aware scan: ignore a `,` preceded by an unescaped `\`, and unescape RFC 4514 sequences in the extracted value. Alternatively use the LDAP library's DN-parsing API (`LdapDN` / RDN accessors) rather than hand-rolled `IndexOf`. **Resolution** Resolved 2026-05-17 (commit `pending`). Confirmed: `ExtractFirstRdnValue` took the substring between the first `=` and the first `,`, splitting on an RFC 4514 backslash-escaped comma so a CN like `Acme\, Inc Operators` became `Acme\`. Replaced the naive `IndexOf` scan with an escape-aware character scan that ignores a backslash-escaped `,`, unescapes single-character and `\XX` hex escape sequences, and only terminates the value on an unescaped comma — so a comma-containing group CN is returned intact and matches its configured `LdapGroupName`. Method was also made `public` for direct testing. Regression tests `ExtractFirstRdnValue_EscapedComma_KeepsWholeGroupName`, `ExtractFirstRdnValue_PlainDn_ReturnsFirstRdnValue`, `ExtractFirstRdnValue_SingleRdn_ReturnsValue`, `ExtractFirstRdnValue_EscapedSpecials_AreUnescaped`. ### Security-014 — `RefreshToken` re-issues a token without checking the idle timeout | | | |--|--| | Severity | Medium | | Category | Security | | Status | Resolved | | Location | `src/ZB.MOM.WW.ScadaBridge.Security/JwtTokenService.cs:156-169` | **Description** `RefreshToken` carries the existing `LastActivity` claim forward (correct, per the Security-007 fix) and unconditionally issues a fresh token with a new 15-minute expiry. It never calls `IsIdleTimedOut`. The documented 30-minute idle policy therefore depends entirely on the CentralUI request pipeline calling `IsIdleTimedOut` *before* calling `RefreshToken` — two independent, order-dependent checks with no enforcement that they are used together. If a caller refreshes without first checking idle state (an easy mistake, and there is no compiler or API signal preventing it), an idle-expired session is silently renewed: each refresh resets the JWT expiry while `LastActivity` keeps ageing, so the session can be kept alive indefinitely by background refreshes even though the user has been idle for hours. `RefreshToken` is the natural place to enforce the invariant but provides no safety net. **Recommendation** Have `RefreshToken` itself reject a principal that is already past the idle timeout — return `null` (the same "cannot refresh" signal it already uses for missing claims) when `IsIdleTimedOut(currentPrincipal)` is true — so the idle policy holds regardless of caller discipline. Document that an idle-expired principal cannot be refreshed and add a regression test covering refresh of a 31-minute-idle token. **Resolution** Resolved 2026-05-17 (commit `pending`). Confirmed: `RefreshToken` carried `LastActivity` forward and minted a fresh 15-minute token without ever calling `IsIdleTimedOut`, so the idle policy depended entirely on caller discipline and a background refresh could keep an idle-expired session alive indefinitely. `RefreshToken` now calls `IsIdleTimedOut(currentPrincipal)` after the missing-claims check and returns `null` (its existing "cannot refresh" signal) when the session is past the idle window, so the 30-minute idle policy holds regardless of caller order; XML doc updated to state the invariant. Regression tests `RefreshToken_IdleExpiredPrincipal_ReturnsNull`, `RefreshToken_ActiveSessionWithinIdleWindow_StillRefreshes`, `RefreshToken_MissingLastActivityClaim_ReturnsNull`. ### Security-015 — Username is not trimmed before use in the LDAP filter, fallback DN, and JWT claims | | | |--|--| | Severity | Low | | Category | Correctness & logic bugs | | Status | Resolved | | Location | `src/ZB.MOM.WW.ScadaBridge.Security/LdapAuthService.cs:20-21`, `:80`, `:122`, `:169`, `:193` | **Description** `AuthenticateAsync` rejects null/whitespace-only usernames via `IsNullOrWhiteSpace`, but an otherwise-valid username with leading or trailing whitespace (`" alice"`, copy-paste artefacts, mobile keyboards) is passed through verbatim. It flows into the LDAP search filter `({attr}={EscapeLdapFilter(username)})`, into the fallback bind DN via `BuildFallbackUserDn`, and — most consequentially — into the returned `LdapAuthResult` and from there into the JWT `Username` claim. Most LDAP directories ignore surrounding whitespace on a search term, so `" alice"` and `"alice"` may both authenticate but yield JWT `Username` claims that differ by whitespace. Any downstream identity comparison (audit-log `who`, site-scope lookups keyed by username, session de-duplication) then sees two distinct identities for the same person. **Recommendation** Trim the username once at the top of `AuthenticateAsync` (after the `IsNullOrWhiteSpace` guard) and use the trimmed value consistently for the filter, the DN, and the `LdapAuthResult`. This normalises the identity that ends up in the JWT claim and audit trail. **Resolution** Resolved 2026-05-17 (commit `pending`). Confirmed: an otherwise-valid username with leading/trailing whitespace passed the `IsNullOrWhiteSpace` guard and flowed verbatim into the LDAP filter, the fallback DN, and the returned `LdapAuthResult` (and thus the JWT `Username` claim). Added a `NormalizeUsername` helper (trims whitespace) called once at the top of `AuthenticateAsync` immediately after the guard; the local `username` variable is reassigned to the trimmed value so the filter, fallback DN, and result all use the single canonical identity. Regression tests `NormalizeUsername_TrimsLeadingAndTrailingWhitespace`, `BuildFallbackUserDn_TrimmedUsername_NoLeadingTrailingSpace`, `AuthenticateAsync_UsernameWithSurroundingWhitespace_StillRejectedForInsecure`. ### Security-016 — `RoleMapper` silently drops the system-wide Deployment grant when a user is also in any site-scoped Deployment group | | | |--|--| | Severity | Medium | | Category | Correctness & logic bugs | | Status | Resolved | | Location | `src/ZB.MOM.WW.ScadaBridge.Security/RoleMapper.cs:30-31`, `:41-55`, `:59` | **Description** `MapGroupsToRolesAsync` resolves the Deployment role's site scope as a single `isSystemWide = hasDeploymentRole && !hasDeploymentWithScopeRules` flag computed across ALL matched Deployment mappings. If a user is a member of both a system-wide Deployment group (e.g. `SCADA-Deploy-All`, no scope rules) AND a site-scoped Deployment group (e.g. `SCADA-Deploy-SiteA`, one scope rule for Site A), the second mapping sets `hasDeploymentWithScopeRules = true`, so the final `isSystemWide` becomes `false` and the returned `PermittedSiteIds` is just `[SiteA]`. The system-wide grant from `SCADA-Deploy-All` is silently dropped — the user loses access to every other site, even though one of their LDAP groups was intended to grant them system-wide reach. This contradicts the design's "A user can hold multiple roles simultaneously … roles are independent — there is no implied hierarchy" intent: the union of grants should be the broadest grant in the set, not the narrowest. The mistake is also non-obvious to an operator: from the Admin → LDAP Mappings page nothing flags that adding a site-scoped Deployment mapping for a user already in `SCADA-Deploy-All` *removes* sites from their effective grant. The downstream `SiteScopeService.IsSystemWideAsync` / `FilterSitesAsync` faithfully reproduce this narrowing, so the user can no longer see or act on sites outside `[SiteA]`. **Recommendation** Track the union semantics explicitly: if any matched Deployment mapping has no scope rules, the user is system-wide regardless of what other mappings have. The simplest change is to set `hasDeploymentWithScopeRules` only when the mapping has scope rules AND another flag `hasUnscopedDeploymentMapping` is false; then compute `isSystemWide = hasUnscopedDeploymentMapping || (hasDeploymentRole && !hasDeploymentWithScopeRules)`. Equivalently: collect per-mapping `(hasRules, scopedSiteIds)` first, then `isSystemWide = any mapping has hasRules==false`, and `permittedSiteIds = union of all scopedSiteIds` (left empty for system-wide users). Add a regression test `MapGroupsToRoles_UserInBothSystemWideAndScopedDeploymentGroup_IsSystemWide` covering the design's example pair `SCADA-Deploy-All` + `SCADA-Deploy-SiteA`. **Resolution** Resolved 2026-05-28 (commit pending). `RoleMapper.MapGroupsToRolesAsync` now tracks two independent flags per matched Deployment mapping: `hasUnscopedDeploymentMapping` (any matched mapping with no scope rules) and `hasScopedDeploymentMapping` (any matched mapping with scope rules). `isSystemWide` is `hasUnscopedDeploymentMapping || (hasDeploymentRole && !hasScopedDeploymentMapping)` — so a user in both `SCADA-Deploy-All` and `SCADA-Deploy-SiteA` is now correctly system-wide, with the accumulated scope ids cleared. Magic string `"Deployment"` was replaced with the new `Roles.Deployment` constant (Security-018). Regression test `MapGroupsToRoles_UserInBothSystemWideAndScopedDeploymentGroup_IsSystemWide` (seeds Site A, attaches a scope rule to the seeded `SCADA-Deploy-SiteA` mapping, and asserts a user mapped via both `SCADA-Deploy-All` and `SCADA-Deploy-SiteA` resolves to system-wide with empty `PermittedSiteIds`) fails on the pre-fix code and passes after. ### Security-017 — `SiteScopeRequirement` / `SiteScopeAuthorizationHandler` are dead code from production callers — `[Authorize(Policy = RequireDeployment)]` does NOT enforce site scoping | | | |--|--| | Severity | Medium | | Category | Design-document adherence | | Status | Resolved | | Location | `src/ZB.MOM.WW.ScadaBridge.Security/SiteScopeAuthorizationHandler.cs:8-58`; `src/ZB.MOM.WW.ScadaBridge.Security/AuthorizationPolicies.cs:113-143` | **Description** The module declares `SiteScopeRequirement` (an `IAuthorizationRequirement` carrying a `TargetSiteId`) and the matching `SiteScopeAuthorizationHandler` that combines the Deployment role claim with the `SiteId` claims to enforce the design's site-scoping rule. The handler is registered in `AddScadaBridgeAuthorization` (`services.AddSingleton()`). But no `AddPolicy` call ever wires the requirement to a named policy, and a grep across `src/ZB.MOM.WW.ScadaBridge.CentralUI` and `src/ZB.MOM.WW.ScadaBridge.ManagementService` confirms that **no production code ever instantiates `new SiteScopeRequirement(...)` or calls `AuthorizeAsync(...)` with one** — the only callers are the unit tests in `SecurityTests.cs:1146,1166,1185,1203`. The design + CLAUDE.md state that "Deployment and Monitoring pages must filter every site/instance list through `FilterSitesAsync` and re-check `IsSiteAllowedAsync` before any cross-site command", and the CentralUI-028 finding (High, Open) confirms this is exactly the contract two new report pages forgot — because there is no declarative `[Authorize(Policy = ...)]` shortcut, callers must remember to inject `SiteScopeService` and write the check by hand, and any new page that forgets is a silent regression with no compile-time or test-time signal. The module's published surface advertises an authorization-handler pattern that is, in practice, unwired plumbing. **Recommendation** Either (a) **delete** `SiteScopeRequirement` and `SiteScopeAuthorizationHandler` (and the dead `IAuthorizationHandler` registration) and document `SiteScopeService` as the sole site-scoping mechanism — this is the smaller change and matches what the codebase actually does today; or, preferably, (b) **finish the wiring**: add a `RequireSiteScope` policy that uses `SiteScopeRequirement` and provide a small helper / source generator or analyzer that flags Deployment-policy-attributed pages without a site-scope check. Either way, address the cross-module gap: CentralUI-028 stays open until production pages reliably enforce the rule. If (b) is chosen, a route-parameter-aware `IAuthorizationPolicyProvider` is needed so the policy can read the target site id from the request — that is a meaningful design extension and would need to be planned alongside the Central UI's existing `SiteScopeService` usage rather than replacing it piecemeal. **Resolution** Resolved 2026-05-28 (commit pending) per recommendation option (a): deleted `SiteScopeRequirement` and `SiteScopeAuthorizationHandler` outright, along with the unwired `services.AddSingleton()` registration in `AuthorizationPolicies.AddScadaBridgeAuthorization` and the four isolation tests in `SecurityTests.cs`. `SiteScopeService` (CentralUI-002) is now documented as the sole site-scoping mechanism — the stale "mirrors SiteScopeAuthorizationHandler" comment in `SiteScopeService.ResolveAsync` was rewritten to say so. The cross-module partner CentralUI-028 is fixed in the same batch (the two new report pages now consume `SiteScopeService`). ### Security-018 — Role names are hard-coded magic strings duplicated across `RoleMapper`, `SiteScopeAuthorizationHandler`, and `AuthorizationPolicies` | | | |--|--| | Severity | Low | | Category | Code organization & conventions | | Status | Resolved | | Location | `src/ZB.MOM.WW.ScadaBridge.Security/RoleMapper.cs:41`; `src/ZB.MOM.WW.ScadaBridge.Security/SiteScopeAuthorizationHandler.cs:36`; `src/ZB.MOM.WW.ScadaBridge.Security/AuthorizationPolicies.cs:118,121,124,95,107` | **Description** The role-name literals `"Admin"`, `"Design"`, `"Deployment"`, `"Audit"`, and `"AuditReadOnly"` are duplicated as magic strings across three separate files: `RoleMapper.cs:41` hard-codes `"Deployment"` to detect the site-scope branch; `SiteScopeAuthorizationHandler.cs:36` independently hard-codes `"Deployment"` to gate the handler; and `AuthorizationPolicies.cs:118,121,124` hard-code the four role names as the policy `RequireClaim` values. Only the audit roles have a single source of truth (via the `OperationalAuditRoles` / `AuditExportRoles` arrays on `AuthorizationPolicies`). A future rename or addition of a role that misses any one of these call sites silently breaks the system: e.g. renaming "Deployment" → "Deployer" in `RoleMapper` alone would leave the policy still requiring `"Deployment"` (logins get the new role name but the policy never matches), while changing it in the policy alone would leave `RoleMapper` failing to populate scope rules for the renamed role. The bug class is "string drift" — exactly the kind the `OperationalAuditRoles` constant was introduced to prevent. **Recommendation** Introduce a `public static class Roles { public const string Admin = "Admin"; public const string Design = "Design"; public const string Deployment = "Deployment"; public const string Audit = "Audit"; public const string AuditReadOnly = "AuditReadOnly"; }` in the Security project and replace every magic-string occurrence — including the elements of `OperationalAuditRoles` and `AuditExportRoles` — with the constants. A single rename will then either succeed everywhere or fail to compile. **Resolution** Resolved 2026-05-28 (commit pending). Added `src/ZB.MOM.WW.ScadaBridge.Security/Roles.cs` holding `Admin`/`Design`/`Deployment`/`Audit`/`AuditReadOnly` as `public const string` fields. Replaced every magic-string occurrence in this module: `RoleMapper.MapGroupsToRolesAsync` now compares against `Roles.Deployment`; `AuthorizationPolicies.AddScadaBridgeAuthorization` binds `RequireClaim(...)` to `Roles.{Admin,Design,Deployment}`; `OperationalAuditRoles` / `AuditExportRoles` are now built from `Roles.Admin`, `Roles.Audit`, `Roles.AuditReadOnly`. `SiteScopeAuthorizationHandler.cs` was deleted under Security-017, so its `"Deployment"` literal is gone with it. A future rename now propagates by a single edit or fails to compile. ### Security-019 — Service-account rebind failure is reported as "Invalid username or password" — masks misconfiguration as a user-credential error | | | |--|--| | Severity | Low | | Category | Error handling & resilience | | Status | Resolved | | Location | `src/ZB.MOM.WW.ScadaBridge.Security/LdapAuthService.cs:85-89`, `:147-151` | **Description** After the user's credentials bind successfully, `AuthenticateAsync` re-binds as the configured service account to perform the group/attribute search (`connection.Bind(_options.LdapServiceAccountDn, _options.LdapServiceAccountPassword)`). A failure of this second bind — wrong service-account password, deleted/disabled service-account, locked-out service-account — throws `LdapException` which is caught by the broad outer `catch (LdapException)` and returned as `new LdapAuthResult(false, null, username, null, "Invalid username or password.")`. The user sees an "invalid credentials" message for *their* credentials even though their bind succeeded and the failure was in the system's own service-account configuration. Worse, every user attempting to log in sees the same incorrect message during a service-account outage, which routes operators down the wrong incident path (reset the user's password) instead of the right one (check the service-account credentials). The successful user bind itself is also not auditable as a discrete event because the result is "Invalid username or password" — indistinguishable from a genuine bad-password attempt. **Recommendation** Wrap the service-account rebind in its own `try`/`catch (LdapException)` and surface a distinct error: log `_logger.LogError(ex, "Service-account rebind failed; check LdapServiceAccountDn / LdapServiceAccountPassword configuration")` and return `new LdapAuthResult(false, null, username, null, "Authentication service is misconfigured. Contact an administrator.")`. Add a regression test that exercises the service-account-bind failure path (a mocked or seamed `LdapConnection.Bind` that throws on the second call) and asserts the distinct error message. **Resolution** Resolved 2026-05-28 (commit pending). Added a new `ServiceAccountBindException` (`src/ZB.MOM.WW.ScadaBridge.Security/ServiceAccountBindException.cs`) — deliberately NOT an `LdapException` subtype so it short-circuits the generic LDAP catch — and a private `BindServiceAccountAsync` helper on `LdapAuthService` that wraps both service-account rebind sites (the post-user-bind group-lookup rebind AND the `ResolveUserDnAsync` DN-search rebind). On `LdapException`, the helper logs Error ("Service-account rebind failed; check LdapServiceAccountDn / LdapServiceAccountPassword configuration") and rethrows as `ServiceAccountBindException`, which the outer `AuthenticateAsync` catch chain maps to the distinct user-facing message "Authentication service is misconfigured. Contact an administrator." Service-account faults no longer surface as "Invalid username or password". Regression test `ServiceAccountBindException_DoesNotInheritLdapException_SoCatchOrderIsCorrect` pins the load-bearing inheritance contract; a full end-to-end auth test that exercises the service-account-bind failure path is not feasible without an LDAP seam in `LdapAuthService` (the `LdapConnection` is constructed in-method), so the structural test is the closest meaningful unit-level coverage. ### Security-020 — `SecurityOptions` has no startup validation for required fields (`LdapServer`, `LdapSearchBase`) | | | |--|--| | Severity | Low | | Category | Error handling & resilience | | Status | Resolved | | Location | `src/ZB.MOM.WW.ScadaBridge.Security/SecurityOptions.cs:6-7`, `:36-37`; `src/ZB.MOM.WW.ScadaBridge.Security/ServiceCollectionExtensions.cs:13-30` | **Resolution (2026-05-28):** added `SecurityOptionsValidator` (`IValidateOptions`) that rejects empty/whitespace `LdapServer` and `LdapSearchBase` with messages naming the full `Security:Field` key the operator would edit. `AddSecurity` registers it via `services.AddOptions().ValidateOnStart()` + `TryAddEnumerable(... SecurityOptionsValidator)` so a misconfigured `Security` section fails fast at boot rather than minutes later on the first login. `JwtSigningKey` is deliberately left to `JwtTokenService`'s existing length-aware constructor guard (Security-003). Regression tests in `SecurityOptionsValidatorTests`: valid-options succeed; empty/whitespace `LdapServer` and `LdapSearchBase` each fail with the key-naming message (theory); both-empty reports both keys; `AddSecurity_RegistersSecurityOptionsValidator` pins the DI wiring. **Description** `SecurityOptions.JwtSigningKey` correctly fails fast at `JwtTokenService` construction (Security-003 fix), but the LDAP-side required fields — `LdapServer` (default `string.Empty`) and `LdapSearchBase` (default `string.Empty`) — have no equivalent guard. `AddSecurity` does not register an `IValidateOptions`. A deployment that fails to set `LdapServer` (a typo in the appsettings.json section name, a missing environment-variable substitution, a misconfigured Docker compose file) starts cleanly — the Central UI comes up, the login page loads, and only the first authentication attempt fails with `LdapConnection.Connect("")` throwing a low-level exception that bubbles up as the generic "An unexpected error occurred during authentication." message. The misconfiguration surfaces minutes or hours into the deploy, on the first real user login, rather than at startup where it is cheap to diagnose. **Recommendation** Add an `IValidateOptions` registered via `services.AddOptions().ValidateOnStart()` that fails when `LdapServer` is null/whitespace, `LdapSearchBase` is null/whitespace, or `LdapPort <= 0`. Combine with the existing `JwtTokenService` constructor check so every required `SecurityOptions` field is enforced at startup, not at first use. ### Security-021 — `RequireHttpsCookie=false` dev opt-out has no warning path — an HTTP production deployment silently transmits the JWT bearer credential in cleartext | | | |--|--| | Severity | Low | | Category | Security | | Status | Resolved | | Location | `src/ZB.MOM.WW.ScadaBridge.Security/SecurityOptions.cs:100-108`; `src/ZB.MOM.WW.ScadaBridge.Security/ServiceCollectionExtensions.cs:54-59` | **Resolution (2026-05-28):** Added `ILoggerFactory` to the cookie-options `Configure` callback in `AddSecurity` so an explicit `RequireHttpsCookie=false` opt-out now emits a startup `LogWarning` ("auth cookie SecurePolicy is SameAsRequest. The cookie-embedded JWT will be transmitted in cleartext over plain HTTP. This setting is intended for local dev only — set SecurityOptions:RequireHttpsCookie=true in production."). Default is still `true`, so production deployments unchanged. The "fail startup when RequireHttpsCookie=false AND ASPNETCORE_ENVIRONMENT=Production" hard-stop option was not implemented (the dev Docker cluster intentionally runs with the flag false, and the env-var name varies across deploy mechanisms); the warning is the right ergonomic floor. **Description** The Security-002 fix added `RequireHttpsCookie` (default `true`) so the auth cookie's `SecurePolicy` is `Always` in production. The current Docker dev cluster sets `RequireHttpsCookie=false` in both central nodes' `appsettings.Central.json`, downgrading to `SameAsRequest` so the local HTTP cluster works. The downgrade is documented in the XML doc but is silent at runtime: no log line warns that the cookie carrying the JWT bearer credential is being sent over an HTTP-only path. A production deployment that inherits a dev-derived appsettings — or that copy-pastes the docker config and forgets to flip the flag — transmits the session token in cleartext with no diagnostic signal. The default is correct; the gap is that the unsafe override has no operational guard. **Recommendation** In the `PostConfigure` block in `AddSecurity`, when `RequireHttpsCookie == false`, log a single startup warning along the lines of `_logger.LogWarning("RequireHttpsCookie is DISABLED — auth cookie SecurePolicy is SameAsRequest. The cookie-embedded JWT will be transmitted over plain HTTP. This setting is intended for local dev only — set SecurityOptions:RequireHttpsCookie=true in production.")`. Optionally, also fail startup when `RequireHttpsCookie=false` AND `ASPNETCORE_ENVIRONMENT=Production`. Add a regression test that asserts the warning is emitted when the flag is disabled and not when it is enabled.