Files
scadalink-design/code-reviews/Security/findings.md

657 lines
34 KiB
Markdown

# Code Review — Security
| Field | Value |
|-------|-------|
| Module | `src/ScadaLink.Security` |
| Design doc | `docs/requirements/Component-Security.md` |
| Status | Reviewed |
| Last reviewed | 2026-05-17 |
| Reviewer | claude-agent |
| Commit reviewed | `39d737e` |
| Open findings | 0 (1 deferred — Security-008) |
## Summary
The Security module is small and reasonably structured: a stateless `LdapAuthService`
for search-then-bind authentication, a `JwtTokenService` for HMAC-signed cookie tokens,
a `RoleMapper` that resolves LDAP groups to roles, and ASP.NET Core authorization
policies plus a site-scope handler. Unit-test coverage of the happy paths is decent.
However, the review surfaced several real security weaknesses, the most serious being
that **StartTLS is dead code** (the design's "LDAPS or StartTLS" requirement is only
half met), that **the authentication cookie is not marked `Secure`** despite the design
mandating it, and that **the JWT signing key is never length-validated** so a weak or
empty key is silently accepted. There is also a genuine **DN-injection** gap in the
no-service-account fallback path, a filter/DN attribute mismatch (`uid=` vs `cn=`) that
makes that fallback path internally inconsistent, and an N+1 query in `RoleMapper`.
JWT validation also disables issuer/audience checks and the idle-timeout claim is reset
on every refresh, weakening the documented 30-minute idle policy. None of these are
crash/data-loss bugs, but the TLS, cookie, and key-validation items are security
defects that should be fixed before any production deployment.
#### Re-review 2026-05-17 (commit `39d737e`)
Re-reviewed the module after the batch-resolution of Security-001..007, 009, 010, 011.
Those fixes were verified in place and remain Resolved: the `LdapTransport` enum makes
StartTLS reachable, the auth cookie is `SecurePolicy.Always` with a sliding window, the
JWT signing key is length-validated at construction, issuer/audience are bound and
checked, the DN-injection fallback is RFC 4514-escaped, and the idle-timeout anchor is
carried across refreshes. Security-008 (N+1 scope-rule query) remains correctly
**Deferred**`ISecurityRepository` still exposes no per-set scope-rule query, so the
N+1 cannot be removed from within `RoleMapper.cs` alone; the deferral condition is
unchanged. This pass surfaced **4 new findings** (Security-012..015): two Medium and
two Low. The most notable is that a *partial* LDAP outage during login (bind succeeds,
group search fails) silently returns an authenticated session with **zero roles**
instead of failing the login as the design's LDAP-failure rule requires
(Security-012); and that `RefreshToken` re-issues a token without ever checking
`IsIdleTimedOut`, so an idle-expired session can be renewed indefinitely if the caller
omits the separate idle check (Security-014). The two Low findings concern fragile
DN parsing of group names containing escaped commas and an un-trimmed username flowing
into the LDAP filter, fallback DN, and JWT claims.
## Checklist coverage
| # | Category | Examined | Notes |
|---|----------|----------|-------|
| 1 | Correctness & logic bugs | ☑ | `uid=`/`cn=` attribute mismatch between search filter and fallback DN construction (Security-004); StartTLS branch is unreachable (Security-001). |
| 2 | Akka.NET conventions | ☑ | No actors in this module — `AddSecurityActors` is an empty placeholder. Nothing to assess. |
| 3 | Concurrency & thread safety | ☑ | Services are stateless and DI-scoped; LDAP sync calls wrapped in `Task.Run`. No shared mutable state. No issues found. |
| 4 | Error handling & resilience | ☑ | LDAP failure paths return structured `LdapAuthResult`; group-lookup failure is tolerated per design. `ct` not honored inside `Task.Run` bodies (Security-009). |
| 5 | Security | ☑ | StartTLS dead code (Security-001), cookie not `Secure` (Security-002), JWT key unvalidated (Security-003), DN injection (Security-005), no issuer/audience validation (Security-006), idle-timeout reset on refresh (Security-007). |
| 6 | Performance & resource management | ☑ | N+1 scope-rule query in `RoleMapper` (Security-008). `LdapConnection` correctly disposed via `using`. |
| 7 | Design-document adherence | ☑ | StartTLS unsupported and Secure cookie missing both contradict the design doc; design also says "Windows Integrated Authentication" in Responsibilities, contradicting its own Authentication section (Security-010). |
| 8 | Code organization & conventions | ☑ | `SecurityOptions` correctly owned by the component; repository interface in Commons. No issues found. |
| 9 | Testing coverage | ☑ | No tests for `RoleMapper` N+1 behavior, DN-injection inputs, StartTLS path, or idle-timeout-after-refresh. Insecure-config combinations under-tested (Security-011). |
| 10 | Documentation & comments | ☑ | `SecurityOptions` XML docs say direct bind uses `cn={username}` while the search filter uses `uid=` — comment is misleading (covered under Security-004). |
## Findings
### Security-001 — StartTLS upgrade path is unreachable dead code
| | |
|--|--|
| Severity | High |
| Category | Security |
| Status | Resolved |
| Location | `src/ScadaLink.Security/LdapAuthService.cs:37-47` |
**Description**
When `LdapUseTls` is true the code sets `connection.SecureSocketLayer = true` (LDAPS).
The subsequent StartTLS block is guarded by `if (_options.LdapUseTls && !connection.SecureSocketLayer)`.
Because `SecureSocketLayer` was just set to `true`, the second condition `!connection.SecureSocketLayer`
is always false, so `connection.StartTls()` is never called. The design doc explicitly
states LDAP connections must use **"LDAPS (port 636) or StartTLS"** — StartTLS is in
practice unsupported. A deployment that intends to use StartTLS on port 389 would get a
plaintext LDAPS-mode connection attempt that fails, or worse, an operator may disable
TLS entirely to make it work, sending credentials in cleartext.
**Recommendation**
Introduce an explicit transport mode (e.g. `LdapTransport { Ldaps, StartTls, None }`)
or a separate `LdapUseStartTls` flag. For StartTLS, leave `SecureSocketLayer` false,
call `connection.Connect`, then call `connection.StartTls()` and verify the negotiated
session is encrypted before binding. Remove the unreachable conditional.
**Resolution**
Resolved 2026-05-16 (commit `<pending>`). Added an explicit `LdapTransport` enum
(`Ldaps`/`StartTls`/`None`); `SecureSocketLayer` is set only for LDAPS, and the
StartTLS branch now connects in plaintext, calls `StartTls()`, and verifies
`connection.Tls` before binding. `LdapUseTls` is retained as a compatibility shim
mapping onto the enum. Regression tests `AuthenticateAsync_StartTlsTransport_AttemptsConnection`
and `AuthenticateAsync_NoTlsTransport_RejectedWithoutAllowInsecure`.
### Security-002 — Authentication cookie is not marked `Secure`
| | |
|--|--|
| Severity | High |
| Category | Security |
| Status | Resolved |
| Location | `src/ScadaLink.Security/ServiceCollectionExtensions.cs:16-23` |
**Description**
`AddCookie` sets `HttpOnly = true` and `SameSite = Strict` but never sets
`options.Cookie.SecurePolicy`. The ASP.NET Core default is `CookieSecurePolicy.SameAsRequest`,
which permits the cookie (carrying the embedded JWT — a bearer credential) to be sent
over plain HTTP. The design doc states the cookie is **"HttpOnly and Secure (requires
HTTPS)"**. As written, the module does not enforce that requirement; a misconfigured or
HTTP-fronted deployment would transmit the session token in cleartext.
**Recommendation**
Set `options.Cookie.SecurePolicy = CookieSecurePolicy.Always` in `AddCookie`. Consider
also setting `ExpireTimeSpan` and `SlidingExpiration` to align the cookie lifetime with
the documented 15-minute JWT / 30-minute idle policy.
**Resolution**
Resolved 2026-05-16 (commit `<pending>`). Confirmed the cookie is configured in this
module (`ServiceCollectionExtensions.AddSecurity`), not CentralUI — no misattribution.
Added `options.Cookie.SecurePolicy = CookieSecurePolicy.Always` so the JWT-bearing
cookie is never sent over plain HTTP. Regression test
`AddSecurity_AuthCookie_IsMarkedSecureAlways`. (`ExpireTimeSpan`/`SlidingExpiration`
tuning left as a separate, lower-priority improvement.)
### Security-003 — JWT signing key length is never validated
| | |
|--|--|
| Severity | High |
| Category | Security |
| Status | Resolved |
| Location | `src/ScadaLink.Security/JwtTokenService.cs:33`, `src/ScadaLink.Security/SecurityOptions.cs:42` |
**Description**
`SecurityOptions.JwtSigningKey` defaults to `string.Empty` and is fed directly into
`new SymmetricSecurityKey(Encoding.UTF8.GetBytes(_options.JwtSigningKey))` with no
validation. HMAC-SHA256 requires a key of at least 256 bits (32 bytes); a short or empty
key produces a trivially forgeable token. The `SecurityHardeningTests` comment claims a
minimum length is "enforced", but no code in this module enforces it — the test only
asserts that a 32+ char key works. A deployment with a missing or short `JwtSigningKey`
would start successfully and issue weakly-signed tokens.
**Recommendation**
Validate `JwtSigningKey` at startup — fail fast if it is empty or shorter than 32 bytes.
Use an `IValidateOptions<SecurityOptions>` validator or guard in the `JwtTokenService`
constructor so a weak key is rejected before any token is issued.
**Resolution**
Resolved 2026-05-16 (commit `<pending>`). The `JwtTokenService` constructor now
fails fast with an `InvalidOperationException` when `JwtSigningKey` is empty or
shorter than 32 bytes (`SecurityOptions.MinJwtSigningKeyBytes`), so a weak key is
rejected before any token is issued. The misleading `SecurityOptions` XML doc was
corrected to state the requirement. Regression tests
`JwtTokenService_EmptySigningKey_ThrowsAtConstruction`,
`JwtTokenService_ShortSigningKey_ThrowsAtConstruction`, and
`JwtTokenService_AdequateSigningKey_ConstructsSuccessfully`.
### Security-004 — Search filter uses `uid=` while fallback DN construction uses `cn=`
| | |
|--|--|
| Severity | Medium |
| Category | Correctness & logic bugs |
| Status | Resolved |
| Location | `src/ScadaLink.Security/LdapAuthService.cs:66`, `:138`, `:157-159` |
**Description**
`AuthenticateAsync` and `ResolveUserDnAsync` build the search filter as
`(uid={username})`, but the no-service-account fallback in `ResolveUserDnAsync`
constructs the bind DN as `cn={username},{LdapSearchBase}`. The `SecurityOptions.LdapServiceAccountDn`
XML comment also documents the fallback as `cn={username},{LdapSearchBase}`. A directory
keyed on `uid` will succeed via search-then-bind but fail via the direct-bind fallback
(and vice versa). The attribute used for lookup is hard-coded and inconsistent across
the two code paths, so the two configuration modes are not interchangeable.
**Recommendation**
Introduce a single configurable `LdapUserIdAttribute` (default `uid`) and use it
consistently in both the search filter and the fallback DN. Update the XML doc to match.
**Resolution**
Resolved 2026-05-16 (commit `pending`). Confirmed: the search filter was hard-coded
`(uid={username})` (both in `AuthenticateAsync` and `ResolveUserDnAsync`) while the
fallback DN used `cn={username}` — the two auth modes were not interchangeable. Added
a configurable `SecurityOptions.LdapUserIdAttribute` (default `uid`) used for both the
search filter and the fallback DN via the new `BuildFallbackUserDn` helper, and
corrected the `LdapServiceAccountDn` XML doc to reference `{LdapUserIdAttribute}`.
Regression tests `BuildFallbackUserDn_UsesConfiguredUserIdAttribute`,
`BuildFallbackUserDn_HonoursNonDefaultUserIdAttribute`,
`SecurityOptions_LdapUserIdAttribute_DefaultsToUid`.
### Security-005 — DN injection in the no-service-account bind fallback
| | |
|--|--|
| Severity | Medium |
| Category | Security |
| Status | Resolved |
| Location | `src/ScadaLink.Security/LdapAuthService.cs:157-159` |
**Description**
When no service account is configured, the user-supplied `username` is interpolated
directly into a distinguished name: `$"cn={username},{LdapSearchBase}"`. `EscapeLdapFilter`
escapes *search-filter* metacharacters, but DN construction requires a different
escaping scheme (RFC 4514 — `,`, `+`, `"`, `\`, `<`, `>`, `;`, leading/trailing spaces).
No DN escaping is applied here. A username such as `victim,ou=admins` alters the DN
structure, allowing a caller to attempt a bind as a different DN than intended. Combined
with the `username.Contains('=')` shortcut at line 129 — which lets a caller supply a
full arbitrary DN — the fallback path gives the client undue control over the bind
identity.
**Recommendation**
Apply RFC 4514 DN-component escaping to `username` before interpolation, or use the
LDAP library's DN-builder API. Reconsider the `Contains('=')` shortcut — accepting a
raw DN from untrusted input is risky; restrict it or remove it.
**Resolution**
Resolved 2026-05-16 (commit `pending`). Confirmed: the fallback path interpolated
the raw `username` straight into `cn={username},{LdapSearchBase}` with no DN escaping,
and the `username.Contains('=')` shortcut let a caller supply an arbitrary bind DN.
Added an RFC 4514 `EscapeLdapDn` helper (escapes `, + " \ < > ;`, leading/trailing
space, leading `#`, NUL) applied in `BuildFallbackUserDn`, so a username such as
`victim,ou=admins` can no longer alter the DN structure. The `Contains('=')` raw-DN
shortcut was removed entirely — untrusted input no longer controls the bind identity.
Regression tests `BuildFallbackUserDn_EscapesDnMetacharacters`,
`EscapeLdapDn_EscapesAllRfc4514Specials`, `EscapeLdapDn_EscapesLeadingAndTrailingSpaces`.
### Security-006 — JWT validation disables issuer and audience checks
| | |
|--|--|
| Severity | Medium |
| Category | Security |
| Status | Resolved |
| Location | `src/ScadaLink.Security/JwtTokenService.cs:67-75`, `:56-59` |
**Description**
`ValidateToken` sets `ValidateIssuer = false` and `ValidateAudience = false`, and
`GenerateToken` never sets an `iss` or `aud`. With a shared symmetric HMAC key, any
other system or component that signs JWTs with the same key would produce tokens this
service accepts. While the design states the key is shared only between the two central
nodes, omitting issuer/audience binding removes a cheap defense-in-depth control and
makes accidental key reuse (e.g. the same secret used for another internal token)
silently exploitable.
**Recommendation**
Set a fixed `Issuer` and `Audience` (e.g. `"scadalink-central"`) when generating tokens
and enable `ValidateIssuer`/`ValidateAudience` with the matching expected values during
validation.
**Resolution**
Resolved 2026-05-16 (commit `pending`). Confirmed: `GenerateToken` set neither `iss`
nor `aud` and `ValidateToken` had `ValidateIssuer = false`/`ValidateAudience = false`.
`GenerateToken` now binds `JwtTokenService.TokenIssuer`/`TokenAudience`
(both `"scadalink-central"`) into every token, and `ValidateToken` enables
`ValidateIssuer`/`ValidateAudience` against those fixed values — a token signed with
the shared key but a foreign issuer is now rejected. Regression tests
`GenerateToken_SetsIssuerAndAudience`, `ValidateToken_RejectsTokenWithWrongIssuer`,
`ValidateToken_AcceptsOwnIssuerAndAudience`.
### Security-007 — Idle-timeout claim is reset on every token refresh
| | |
|--|--|
| Severity | Medium |
| Category | Correctness & logic bugs |
| Status | Resolved |
| Location | `src/ScadaLink.Security/JwtTokenService.cs:40`, `:111-123` |
**Description**
The design states the 30-minute idle timeout is tracked via a "last-activity timestamp
in the token", and `IsIdleTimedOut` reads the `LastActivity` claim. But `RefreshToken`
calls `GenerateToken`, which unconditionally writes `LastActivity = DateTimeOffset.UtcNow`.
Token refresh fires whenever a request arrives within ~5 minutes of expiry. The result
is that `LastActivity` reflects *token issuance time*, not genuine user activity — and
since refresh itself is a request, the timestamp keeps moving forward. A more subtle
consequence: the idle window is effectively measured from the last refresh, not the
last real interaction, so the documented "no requests within the idle window" semantics
are not faithfully implemented. The claim name `LastActivity` is also misleading.
**Recommendation**
Decide explicitly how activity is tracked. Either (a) carry the original `LastActivity`
forward across refreshes and update it only on real request handling in the middleware,
or (b) rename the claim to `IssuedAt`/`TokenCreated` and document that the idle window
is measured from issuance. Whichever is chosen, ensure `IsIdleTimedOut` and the refresh
path agree on the semantics.
**Resolution**
Resolved 2026-05-16 (commit `pending`). Confirmed: `RefreshToken``GenerateToken`
unconditionally wrote `LastActivity = UtcNow`, so the idle clock reset on every
refresh and the documented 30-minute idle timeout could never fire for a client that
polls in the background. Implemented option (a) — the Security-side half of the
documented "15-min sliding + 30-min idle" policy (the cross-module partner of
CentralUI-005): `GenerateToken` now takes an optional `lastActivity` anchor;
`RefreshToken` carries the **existing** `LastActivity` claim forward unchanged, and a
new explicit `RecordActivity` method advances the anchor to now — to be called by the
CentralUI request pipeline on genuine user interaction (not on a background refresh).
`IsIdleTimedOut` is unchanged and now agrees with the refresh path. The remaining
CentralUI-side wiring (calling `RecordActivity` from the middleware, setting
`SlidingExpiration`/`ExpireTimeSpan`) stays tracked under CentralUI-005; this finding's
Security-side defect — the reset-on-refresh bug — is fully fixed here. Regression tests
`RefreshToken_PreservesOriginalLastActivityClaim`,
`RefreshToken_DoesNotResetIdleTimeoutWhenUserIsActuallyIdle`,
`RecordActivity_UpdatesLastActivityToNow`.
### Security-008 — N+1 query loading site-scope rules in `RoleMapper`
| | |
|--|--|
| Severity | Low |
| Category | Performance & resource management |
| Status | Deferred |
| Location | `src/ScadaLink.Security/RoleMapper.cs:25-48` |
**Description**
`MapGroupsToRolesAsync` first calls `GetAllMappingsAsync`, then inside the per-mapping
loop calls `GetScopeRulesForMappingAsync(mapping.Id, ct)` once for every matched
Deployment mapping. This is an N+1 query pattern executed on the login hot path and on
every 15-minute token refresh. With multiple site-scoped Deployment groups it issues a
round-trip per group.
**Recommendation**
Add a repository method that loads scope rules for a set of mapping IDs in one query
(or eager-loads them with the mappings), and resolve all scope rules with a single call.
**Resolution**
Deferred 2026-05-16 (commit `pending commit`). Finding confirmed accurate: `RoleMapper.MapGroupsToRolesAsync`
issues one `GetScopeRulesForMappingAsync` round-trip per matched Deployment mapping — a
genuine N+1 on the login / 15-minute-refresh path. However, the only correct fix
(a batch `GetScopeRulesForMappingsAsync(IEnumerable<int>)` repository method, or an
eager-load navigation property) requires changes to `ISecurityRepository`
(`src/ScadaLink.Commons`) and `SecurityRepository` (`src/ScadaLink.ConfigurationDatabase`).
Both are outside the Security module's permitted edit scope for this review pass, and the
existing `ISecurityRepository` surface offers no per-set scope-rule query, so the N+1
cannot be removed from within `RoleMapper.cs` alone. Severity is Low (bounded by the
number of site-scoped Deployment groups, typically small). Deferred to a cross-module
change that adds the batch repository method; `RoleMapper` should then resolve all scope
rules in a single call.
### Security-009 — CancellationToken not honored inside `Task.Run` LDAP calls
| | |
|--|--|
| Severity | Low |
| Category | Error handling & resilience |
| Status | Resolved |
| Location | `src/ScadaLink.Security/LdapAuthService.cs:42`, `:46`, `:51`, `:56-57`, `:67-73`, `:135`, `:139-145` |
**Description**
The synchronous Novell LDAP calls are wrapped in `Task.Run(() => ..., ct)`. The `ct`
argument only prevents the work item from *starting* if cancellation is already
signaled; once a `connection.Connect`/`Bind`/`Search` call is in progress it cannot be
cancelled. A cancelled or timed-out login request will continue to occupy a thread-pool
thread and an LDAP connection until the blocking call returns on its own. There is also
no explicit network/operation timeout configured on the `LdapConnection`.
**Recommendation**
Configure `LdapConnection.ConnectionTimeout` and search/operation time limits so a
hung LDAP server cannot pin a thread indefinitely. Document that `ct` only guards
work-item scheduling, or implement a timeout-with-disconnect fallback.
**Resolution**
Resolved 2026-05-16 (commit `pending commit`). Confirmed: the synchronous Novell LDAP
calls were wrapped in `Task.Run(..., ct)` where `ct` only prevents the work item from
starting and cannot interrupt an in-progress blocking `Connect`/`Bind`/`Search`, and no
network/operation timeout was configured on the `LdapConnection`. Added a configurable
`SecurityOptions.LdapConnectionTimeoutMs` (default 10s) and a new `ApplyConnectionTimeout`
helper that sets both `LdapConnection.ConnectionTimeout` (socket connect) and
`LdapConstraints.TimeLimit` (per-operation limit) before connecting, so a hung LDAP
server can no longer pin a thread-pool thread indefinitely. The `ct`-only-guards-scheduling
limitation is now documented in the option's XML doc and an inline comment. Regression
tests `SecurityOptions_LdapConnectionTimeout_HasSaneDefault` and
`AuthenticateAsync_UnreachableHost_FailsWithinConfiguredTimeout`.
### Security-010 — Design doc contradicts itself on Windows Integrated Authentication
| | |
|--|--|
| Severity | Low |
| Category | Design-document adherence |
| Status | Resolved |
| Location | `docs/requirements/Component-Security.md:13` (vs. `:23`) |
**Description**
The Responsibilities section states the component authenticates "using Windows
Integrated Authentication", but the Authentication section (line 23) and CLAUDE.md
explicitly state **"No Windows Integrated Authentication ... authenticates directly
against LDAP/AD, not via Kerberos/NTLM"** — which is what the code actually does
(direct LDAP bind). The Responsibilities line is stale and contradicts both the rest of
the doc and the implementation.
**Recommendation**
Fix `Component-Security.md:13` to say "using a direct LDAP/Active Directory bind"
to match the implemented behavior and the rest of the document.
**Resolution**
Resolved 2026-05-16 (commit `pending commit`). Confirmed: the Responsibilities bullet at
line 13 said "using Windows Integrated Authentication", directly contradicting the
Authentication section's "No Windows Integrated Authentication ... authenticates directly
against LDAP/AD, not via Kerberos/NTLM", CLAUDE.md, and the implementation (`LdapAuthService`
performs a direct username/password bind). Documentation-only finding — no regression test
is meaningful. Reworded the Responsibilities bullet to "Authenticate users against
LDAP/Active Directory using a direct LDAP/AD bind (username/password)", matching the rest
of the document and the code.
### Security-011 — Missing tests for security-critical paths
| | |
|--|--|
| Severity | Low |
| Category | Testing coverage |
| Status | Resolved |
| Location | `tests/ScadaLink.Security.Tests/UnitTest1.cs` |
**Description**
The test suite covers happy paths well but omits several security-relevant cases:
no test exercises the StartTLS path (Security-001), the DN-injection / `Contains('=')`
fallback inputs (Security-005), JWT validation with a too-short or empty signing key
(Security-003), `IsIdleTimedOut` returning true after a token has been refreshed
(Security-007), or the `uid`/`cn` mismatch in the no-service-account path (Security-004).
The integration `SecurityHardeningTests` only asserts default option values, not
enforcement. The test file is still named `UnitTest1.cs`.
**Recommendation**
Add negative/edge-case tests for the items above, particularly key-length rejection,
DN-escaping of hostile usernames, and idle-timeout behavior across a refresh. Rename
`UnitTest1.cs` to a descriptive name.
**Resolution**
Resolved 2026-05-16 (commit `pending commit`). Re-triage: most of the listed gaps were
already closed by the Security-001..007 fixes — the regression classes
`SecurityReviewRegressionTests` (StartTLS path, JWT empty/short-key rejection) and
`SecurityReviewRegressionTests2` (DN-injection / hostile-username escaping, `uid`/`cn`
fallback consistency, idle-timeout preserved across refresh) cover them. The finding's
reference to a `SecurityHardeningTests` class is stale — no such class exists in the
suite. Remaining gaps closed here: renamed the test file `UnitTest1.cs`
`SecurityTests.cs`, and added a new `SecurityReviewRegressionTests3` class with the
LDAP-timeout coverage (Security-009) plus extra no-service-account / DN-path edge cases
(`BuildFallbackUserDn_NoSearchBase_ReturnsBareRdn`, `EscapeLdapDn_LeadingHash_IsEscaped`,
`EscapeLdapDn_NullOrEmpty_ReturnedUnchanged`). Full module suite: 54 tests, all green.
### Security-012 — Partial LDAP failure during login yields a roleless authenticated session
| | |
|--|--|
| Severity | Medium |
| Category | Error handling & resilience |
| Status | Resolved |
| Location | `src/ScadaLink.Security/LdapAuthService.cs:78-118` |
**Description**
In `AuthenticateAsync`, after the user bind succeeds, the group/attribute `Search` is
wrapped in a `try`/`catch (LdapException)` that logs a warning and continues — the
comment states "Auth succeeded even if attribute lookup failed". The method then returns
`new LdapAuthResult(true, displayName, username, groups, null)` with an **empty
`groups`** list. Because `RoleMapper.MapGroupsToRolesAsync` derives all roles from that
group list, a login during a *partial* LDAP outage (bind reachable, search/group server
unavailable, or a transient `LdapException` mid-search) produces a fully authenticated
session with **zero roles** — the user is logged in but silently stripped of all
permissions. The design's LDAP Connection Failure rule states new logins must **fail**
when LDAP is unavailable; this path instead admits the user. The caller cannot
distinguish "user genuinely belongs to no mapped groups" from "the group query failed",
so it cannot make the documented fail-the-login decision. The `while` loop's inner
`catch (LdapException) { break; }` (line 107-111) compounds this: a real error mid-enumeration
is indistinguishable from end-of-results and silently truncates the group list.
**Recommendation**
Treat a failed group/attribute lookup on the *initial login* as an authentication
failure: return `Success = false` with a "directory unavailable, try again" message, or
add an explicit flag on `LdapAuthResult` (e.g. `GroupLookupSucceeded`) so the caller can
apply the design's fail-the-login rule. The inner `while`-loop `catch` should not treat
an arbitrary `LdapException` as a benign end-of-results sentinel — only the specific
"no more results" condition should break the loop; any other exception should propagate
or be surfaced.
**Resolution**
Resolved 2026-05-17 (commit `pending`). Confirmed: the group/attribute `Search` was
wrapped in a `catch (LdapException)` that logged a warning and returned
`new LdapAuthResult(true, …, groups: [])` — a partial LDAP outage produced an
authenticated, zero-role session — and the inner `while`-loop `catch { break; }` masked
real mid-enumeration errors as end-of-results. `AuthenticateAsync` now tracks a
`groupLookupSucceeded` flag (false when any `LdapException` is thrown by the search or
`Next()`); the inner swallowing catch was removed so such errors propagate; the new
`BuildAuthResultFromGroupLookup` helper returns a FAILED `LdapAuthResult`
("directory temporarily unavailable, please try again") on lookup failure while still
treating a genuine empty-groups result as a successful login. Regression tests
`BuildAuthResultFromGroupLookup_LookupFailed_FailsTheLogin`,
`BuildAuthResultFromGroupLookup_LookupSucceededNoGroups_IsAuthenticated`,
`BuildAuthResultFromGroupLookup_LookupSucceededWithGroups_CarriesGroups`.
### Security-013 — `ExtractFirstRdnValue` mis-parses group DNs containing escaped commas
| | |
|--|--|
| Severity | Low |
| Category | Correctness & logic bugs |
| Status | Resolved |
| Location | `src/ScadaLink.Security/LdapAuthService.cs:258-269` |
**Description**
`ExtractFirstRdnValue` extracts a group name from a `memberOf` DN by taking the substring
between the first `=` and the first `,`. RFC 4514 permits a `,` inside an RDN value when
it is backslash-escaped — e.g. `cn=Acme\, Inc Operators,ou=groups,dc=example,dc=com`. The
method splits on the *escaped* comma and returns `Acme\` as the group name. `RoleMapper`
then matches that mangled name verbatim against configured `LdapGroupMapping.LdapGroupName`
values, so a group whose CN legitimately contains a comma silently fails to map to its
role — the user loses that role with no error. The same naive parsing also does not strip
the trailing backslash escape.
**Recommendation**
Parse the first RDN with an escape-aware scan: ignore a `,` preceded by an unescaped
`\`, and unescape RFC 4514 sequences in the extracted value. Alternatively use the LDAP
library's DN-parsing API (`LdapDN` / RDN accessors) rather than hand-rolled `IndexOf`.
**Resolution**
Resolved 2026-05-17 (commit `pending`). Confirmed: `ExtractFirstRdnValue` took the
substring between the first `=` and the first `,`, splitting on an RFC 4514
backslash-escaped comma so a CN like `Acme\, Inc Operators` became `Acme\`. Replaced the
naive `IndexOf` scan with an escape-aware character scan that ignores a backslash-escaped
`,`, unescapes single-character and `\XX` hex escape sequences, and only terminates the
value on an unescaped comma — so a comma-containing group CN is returned intact and
matches its configured `LdapGroupName`. Method was also made `public` for direct
testing. Regression tests `ExtractFirstRdnValue_EscapedComma_KeepsWholeGroupName`,
`ExtractFirstRdnValue_PlainDn_ReturnsFirstRdnValue`,
`ExtractFirstRdnValue_SingleRdn_ReturnsValue`,
`ExtractFirstRdnValue_EscapedSpecials_AreUnescaped`.
### Security-014 — `RefreshToken` re-issues a token without checking the idle timeout
| | |
|--|--|
| Severity | Medium |
| Category | Security |
| Status | Resolved |
| Location | `src/ScadaLink.Security/JwtTokenService.cs:156-169` |
**Description**
`RefreshToken` carries the existing `LastActivity` claim forward (correct, per the
Security-007 fix) and unconditionally issues a fresh token with a new 15-minute expiry.
It never calls `IsIdleTimedOut`. The documented 30-minute idle policy therefore depends
entirely on the CentralUI request pipeline calling `IsIdleTimedOut` *before* calling
`RefreshToken` — two independent, order-dependent checks with no enforcement that they
are used together. If a caller refreshes without first checking idle state (an easy
mistake, and there is no compiler or API signal preventing it), an idle-expired session
is silently renewed: each refresh resets the JWT expiry while `LastActivity` keeps
ageing, so the session can be kept alive indefinitely by background refreshes even
though the user has been idle for hours. `RefreshToken` is the natural place to enforce
the invariant but provides no safety net.
**Recommendation**
Have `RefreshToken` itself reject a principal that is already past the idle timeout —
return `null` (the same "cannot refresh" signal it already uses for missing claims) when
`IsIdleTimedOut(currentPrincipal)` is true — so the idle policy holds regardless of
caller discipline. Document that an idle-expired principal cannot be refreshed and add a
regression test covering refresh of a 31-minute-idle token.
**Resolution**
Resolved 2026-05-17 (commit `pending`). Confirmed: `RefreshToken` carried `LastActivity`
forward and minted a fresh 15-minute token without ever calling `IsIdleTimedOut`, so the
idle policy depended entirely on caller discipline and a background refresh could keep an
idle-expired session alive indefinitely. `RefreshToken` now calls
`IsIdleTimedOut(currentPrincipal)` after the missing-claims check and returns `null`
(its existing "cannot refresh" signal) when the session is past the idle window, so the
30-minute idle policy holds regardless of caller order; XML doc updated to state the
invariant. Regression tests `RefreshToken_IdleExpiredPrincipal_ReturnsNull`,
`RefreshToken_ActiveSessionWithinIdleWindow_StillRefreshes`,
`RefreshToken_MissingLastActivityClaim_ReturnsNull`.
### Security-015 — Username is not trimmed before use in the LDAP filter, fallback DN, and JWT claims
| | |
|--|--|
| Severity | Low |
| Category | Correctness & logic bugs |
| Status | Resolved |
| Location | `src/ScadaLink.Security/LdapAuthService.cs:20-21`, `:80`, `:122`, `:169`, `:193` |
**Description**
`AuthenticateAsync` rejects null/whitespace-only usernames via `IsNullOrWhiteSpace`, but
an otherwise-valid username with leading or trailing whitespace (`" alice"`, copy-paste
artefacts, mobile keyboards) is passed through verbatim. It flows into the LDAP search
filter `({attr}={EscapeLdapFilter(username)})`, into the fallback bind DN via
`BuildFallbackUserDn`, and — most consequentially — into the returned `LdapAuthResult`
and from there into the JWT `Username` claim. Most LDAP directories ignore surrounding
whitespace on a search term, so `" alice"` and `"alice"` may both authenticate but yield
JWT `Username` claims that differ by whitespace. Any downstream identity comparison
(audit-log `who`, site-scope lookups keyed by username, session de-duplication) then
sees two distinct identities for the same person.
**Recommendation**
Trim the username once at the top of `AuthenticateAsync` (after the `IsNullOrWhiteSpace`
guard) and use the trimmed value consistently for the filter, the DN, and the
`LdapAuthResult`. This normalises the identity that ends up in the JWT claim and audit
trail.
**Resolution**
Resolved 2026-05-17 (commit `pending`). Confirmed: an otherwise-valid username with
leading/trailing whitespace passed the `IsNullOrWhiteSpace` guard and flowed verbatim
into the LDAP filter, the fallback DN, and the returned `LdapAuthResult` (and thus the
JWT `Username` claim). Added a `NormalizeUsername` helper (trims whitespace) called once
at the top of `AuthenticateAsync` immediately after the guard; the local `username`
variable is reassigned to the trimmed value so the filter, fallback DN, and result all
use the single canonical identity. Regression tests
`NormalizeUsername_TrimsLeadingAndTrailingWhitespace`,
`BuildFallbackUserDn_TrimmedUsername_NoLeadingTrailingSpace`,
`AuthenticateAsync_UsernameWithSurroundingWhitespace_StillRejectedForInsecure`.