Solution + 23 src projects + 26 test projects renamed; folders, csproj, namespaces, and ScadaLinkDbContext/ScadaBridgeDbContext class updated. ActorSystem "scadalink" → "scadabridge", Akka seed-node URLs migrated. SQL roles/logins, LDAP domains, CLI command name, and CLI config dir (~/.scadalink → ~/.scadabridge) also renamed. Build green; 5 Host.Tests fail awaiting SQL login rename in next commit. Pre-existing StaleTagMonitor timing flakes unchanged. Rename script committed at tools/rename-to-scadabridge.sh.
58 KiB
Code Review — Security
| Field | Value |
|---|---|
| Module | src/ZB.MOM.WW.ScadaBridge.Security |
| Design doc | docs/requirements/Component-Security.md |
| Status | Reviewed |
| Last reviewed | 2026-05-28 |
| Reviewer | claude-agent |
| Commit reviewed | 1eb6e97 |
| Open findings | 0 (1 deferred — Security-008) |
Summary
The Security module is small and reasonably structured: a stateless LdapAuthService
for search-then-bind authentication, a JwtTokenService for HMAC-signed cookie tokens,
a RoleMapper that resolves LDAP groups to roles, and ASP.NET Core authorization
policies plus a site-scope handler. Unit-test coverage of the happy paths is decent.
However, the review surfaced several real security weaknesses, the most serious being
that StartTLS is dead code (the design's "LDAPS or StartTLS" requirement is only
half met), that the authentication cookie is not marked Secure despite the design
mandating it, and that the JWT signing key is never length-validated so a weak or
empty key is silently accepted. There is also a genuine DN-injection gap in the
no-service-account fallback path, a filter/DN attribute mismatch (uid= vs cn=) that
makes that fallback path internally inconsistent, and an N+1 query in RoleMapper.
JWT validation also disables issuer/audience checks and the idle-timeout claim is reset
on every refresh, weakening the documented 30-minute idle policy. None of these are
crash/data-loss bugs, but the TLS, cookie, and key-validation items are security
defects that should be fixed before any production deployment.
Re-review 2026-05-17 (commit 39d737e)
Re-reviewed the module after the batch-resolution of Security-001..007, 009, 010, 011.
Those fixes were verified in place and remain Resolved: the LdapTransport enum makes
StartTLS reachable, the auth cookie is SecurePolicy.Always with a sliding window, the
JWT signing key is length-validated at construction, issuer/audience are bound and
checked, the DN-injection fallback is RFC 4514-escaped, and the idle-timeout anchor is
carried across refreshes. Security-008 (N+1 scope-rule query) remains correctly
Deferred — ISecurityRepository still exposes no per-set scope-rule query, so the
N+1 cannot be removed from within RoleMapper.cs alone; the deferral condition is
unchanged. This pass surfaced 4 new findings (Security-012..015): two Medium and
two Low. The most notable is that a partial LDAP outage during login (bind succeeds,
group search fails) silently returns an authenticated session with zero roles
instead of failing the login as the design's LDAP-failure rule requires
(Security-012); and that RefreshToken re-issues a token without ever checking
IsIdleTimedOut, so an idle-expired session can be renewed indefinitely if the caller
omits the separate idle check (Security-014). The two Low findings concern fragile
DN parsing of group names containing escaped commas and an un-trimmed username flowing
into the LDAP filter, fallback DN, and JWT claims.
Re-review 2026-05-28 (commit 1eb6e97)
Re-reviewed the module on a fresh baseline. All Security-001..007, 009..015 fixes remain
in place; the only Open carry-over is Security-008 (still correctly Deferred —
ISecurityRepository still exposes no per-set scope-rule query, so the N+1 in
RoleMapper cannot be removed from within this module). The original
Security-014 fix is now load-bearing: RefreshToken calls IsIdleTimedOut before
re-issuing, and the new cookie sliding-expiry tests in SecurityReviewRegressionTests
pin CentralUI-005's Security-side contract. This pass surfaced 6 new findings
(Security-016..021): one Medium correctness/security defect, one Medium design-adherence
defect, and four Low. The most consequential is Security-016 — when a user is
mapped to both a system-wide Deployment LDAP group (e.g. SCADA-Deploy-All) and a
site-scoped Deployment LDAP group (e.g. SCADA-Deploy-SiteA), RoleMapper silently
treats the union as site-scoped (the system-wide grant is dropped); the design's
"multiple groups grant multiple independent roles" intent is not honoured for this
mix-and-match case. Security-017 is the cross-module partner of CentralUI-028:
SiteScopeRequirement / SiteScopeAuthorizationHandler are declared and registered
but no production caller ever instantiates them — [Authorize(Policy = RequireDeployment)]
does not enforce the documented site scoping, callers must remember to inject
SiteScopeService and re-check IsSiteAllowedAsync themselves (which the two new
report pages flagged by CentralUI-028 forgot to do). The remaining Lows are: role names
are magic strings duplicated across RoleMapper, SiteScopeAuthorizationHandler, and
AuthorizationPolicies (Security-018); a service-account-rebind failure is reported
to the user as "Invalid username or password" — masking a misconfiguration as a
user-credential error (Security-019); required SecurityOptions fields
(LdapServer, LdapSearchBase) have no IValidateOptions startup check, so empty
values silently surface only on first login (Security-020); and the
RequireHttpsCookie=false dev opt-out emits no warning, so an HTTP production
deployment silently transmits the JWT bearer credential in cleartext (Security-021).
Checklist coverage
| # | Category | Examined | Notes |
|---|---|---|---|
| 1 | Correctness & logic bugs | ☑ | uid=/cn= attribute mismatch between search filter and fallback DN construction (Security-004); StartTLS branch is unreachable (Security-001). |
| 2 | Akka.NET conventions | ☑ | No actors in this module — AddSecurityActors is an empty placeholder. Nothing to assess. |
| 3 | Concurrency & thread safety | ☑ | Services are stateless and DI-scoped; LDAP sync calls wrapped in Task.Run. No shared mutable state. No issues found. |
| 4 | Error handling & resilience | ☑ | LDAP failure paths return structured LdapAuthResult; group-lookup failure is tolerated per design. ct not honored inside Task.Run bodies (Security-009). |
| 5 | Security | ☑ | StartTLS dead code (Security-001), cookie not Secure (Security-002), JWT key unvalidated (Security-003), DN injection (Security-005), no issuer/audience validation (Security-006), idle-timeout reset on refresh (Security-007). |
| 6 | Performance & resource management | ☑ | N+1 scope-rule query in RoleMapper (Security-008). LdapConnection correctly disposed via using. |
| 7 | Design-document adherence | ☑ | StartTLS unsupported and Secure cookie missing both contradict the design doc; design also says "Windows Integrated Authentication" in Responsibilities, contradicting its own Authentication section (Security-010). |
| 8 | Code organization & conventions | ☑ | SecurityOptions correctly owned by the component; repository interface in Commons. No issues found. |
| 9 | Testing coverage | ☑ | No tests for RoleMapper N+1 behavior, DN-injection inputs, StartTLS path, or idle-timeout-after-refresh. Insecure-config combinations under-tested (Security-011). |
| 10 | Documentation & comments | ☑ | SecurityOptions XML docs say direct bind uses cn={username} while the search filter uses uid= — comment is misleading (covered under Security-004). |
Re-review (2026-05-28, 1eb6e97):
| # | Category | Examined | Notes |
|---|---|---|---|
| 1 | Correctness & logic bugs | ☑ | RoleMapper drops a system-wide Deployment grant when the user is also in any site-scoped Deployment group (Security-016); hard-coded role-name string "Deployment" in two separate places allows a refactor to silently break site scoping (Security-018). |
| 2 | Akka.NET conventions | ☑ | No actors. AddSecurityActors is still a registration placeholder. No issues. |
| 3 | Concurrency & thread safety | ☑ | Services stateless; LDAP sync calls wrapped in Task.Run with the now-bounded timeout (Security-009 resolution holds). No issues found. |
| 4 | Error handling & resilience | ☑ | A service-account-rebind failure inside AuthenticateAsync is reported as "Invalid username or password", masking a misconfiguration as a user-credential error (Security-019). LDAP-failure rule + partial-outage path remain correctly enforced post-Security-012. |
| 5 | Security | ☑ | SiteScopeRequirement / SiteScopeAuthorizationHandler are dead code — no policy is registered that uses them and no production caller instantiates them, so declarative [Authorize] does not enforce site scoping (Security-017, cross-module partner of CentralUI-028). RequireHttpsCookie=false dev opt-out has no warning path — a production misconfiguration silently transmits the JWT bearer credential over HTTP (Security-021). |
| 6 | Performance & resource management | ☑ | Security-008 N+1 remains correctly Deferred (still gated on ISecurityRepository). No new perf issues. |
| 7 | Design-document adherence | ☑ | RoleMapper's drop-system-wide-on-any-scoped behaviour (Security-016) contradicts the design's "A user can hold multiple roles simultaneously … roles are independent — there is no implied hierarchy" rule for the union case; SiteScopeRequirement advertises a site-scope authorization pattern the implementation does not actually wire up (Security-017). |
| 8 | Code organization & conventions | ☑ | Role-name strings are duplicated as magic literals across RoleMapper.cs, SiteScopeAuthorizationHandler.cs, and AuthorizationPolicies.cs — only the audit roles have a single source of truth via OperationalAuditRoles / AuditExportRoles (Security-018). SecurityOptions defaults pass through to runtime with no IValidateOptions for required fields like LdapServer / LdapSearchBase (Security-020). |
| 9 | Testing coverage | ☑ | No test covers a user mapped to both a system-wide AND a site-scoped Deployment LDAP group (the Security-016 case). No test covers the SiteScopeRequirement cross-page integration — tests evaluate the handler in isolation, not the absence of a policy that uses it (Security-017). |
| 10 | Documentation & comments | ☑ | SiteScopeAuthorizationHandler XML doc describes a permission model no caller actually invokes (Security-017). Otherwise stable. |
Findings
Security-001 — StartTLS upgrade path is unreachable dead code
| Severity | High |
| Category | Security |
| Status | Resolved |
| Location | src/ZB.MOM.WW.ScadaBridge.Security/LdapAuthService.cs:37-47 |
Description
When LdapUseTls is true the code sets connection.SecureSocketLayer = true (LDAPS).
The subsequent StartTLS block is guarded by if (_options.LdapUseTls && !connection.SecureSocketLayer).
Because SecureSocketLayer was just set to true, the second condition !connection.SecureSocketLayer
is always false, so connection.StartTls() is never called. The design doc explicitly
states LDAP connections must use "LDAPS (port 636) or StartTLS" — StartTLS is in
practice unsupported. A deployment that intends to use StartTLS on port 389 would get a
plaintext LDAPS-mode connection attempt that fails, or worse, an operator may disable
TLS entirely to make it work, sending credentials in cleartext.
Recommendation
Introduce an explicit transport mode (e.g. LdapTransport { Ldaps, StartTls, None })
or a separate LdapUseStartTls flag. For StartTLS, leave SecureSocketLayer false,
call connection.Connect, then call connection.StartTls() and verify the negotiated
session is encrypted before binding. Remove the unreachable conditional.
Resolution
Resolved 2026-05-16 (commit <pending>). Added an explicit LdapTransport enum
(Ldaps/StartTls/None); SecureSocketLayer is set only for LDAPS, and the
StartTLS branch now connects in plaintext, calls StartTls(), and verifies
connection.Tls before binding. LdapUseTls is retained as a compatibility shim
mapping onto the enum. Regression tests AuthenticateAsync_StartTlsTransport_AttemptsConnection
and AuthenticateAsync_NoTlsTransport_RejectedWithoutAllowInsecure.
Security-002 — Authentication cookie is not marked Secure
| Severity | High |
| Category | Security |
| Status | Resolved |
| Location | src/ZB.MOM.WW.ScadaBridge.Security/ServiceCollectionExtensions.cs:16-23 |
Description
AddCookie sets HttpOnly = true and SameSite = Strict but never sets
options.Cookie.SecurePolicy. The ASP.NET Core default is CookieSecurePolicy.SameAsRequest,
which permits the cookie (carrying the embedded JWT — a bearer credential) to be sent
over plain HTTP. The design doc states the cookie is "HttpOnly and Secure (requires
HTTPS)". As written, the module does not enforce that requirement; a misconfigured or
HTTP-fronted deployment would transmit the session token in cleartext.
Recommendation
Set options.Cookie.SecurePolicy = CookieSecurePolicy.Always in AddCookie. Consider
also setting ExpireTimeSpan and SlidingExpiration to align the cookie lifetime with
the documented 15-minute JWT / 30-minute idle policy.
Resolution
Resolved 2026-05-16 (commit <pending>). Confirmed the cookie is configured in this
module (ServiceCollectionExtensions.AddSecurity), not CentralUI — no misattribution.
Added options.Cookie.SecurePolicy = CookieSecurePolicy.Always so the JWT-bearing
cookie is never sent over plain HTTP. Regression test
AddSecurity_AuthCookie_IsMarkedSecureAlways. (ExpireTimeSpan/SlidingExpiration
tuning left as a separate, lower-priority improvement.)
Security-003 — JWT signing key length is never validated
| Severity | High |
| Category | Security |
| Status | Resolved |
| Location | src/ZB.MOM.WW.ScadaBridge.Security/JwtTokenService.cs:33, src/ZB.MOM.WW.ScadaBridge.Security/SecurityOptions.cs:42 |
Description
SecurityOptions.JwtSigningKey defaults to string.Empty and is fed directly into
new SymmetricSecurityKey(Encoding.UTF8.GetBytes(_options.JwtSigningKey)) with no
validation. HMAC-SHA256 requires a key of at least 256 bits (32 bytes); a short or empty
key produces a trivially forgeable token. The SecurityHardeningTests comment claims a
minimum length is "enforced", but no code in this module enforces it — the test only
asserts that a 32+ char key works. A deployment with a missing or short JwtSigningKey
would start successfully and issue weakly-signed tokens.
Recommendation
Validate JwtSigningKey at startup — fail fast if it is empty or shorter than 32 bytes.
Use an IValidateOptions<SecurityOptions> validator or guard in the JwtTokenService
constructor so a weak key is rejected before any token is issued.
Resolution
Resolved 2026-05-16 (commit <pending>). The JwtTokenService constructor now
fails fast with an InvalidOperationException when JwtSigningKey is empty or
shorter than 32 bytes (SecurityOptions.MinJwtSigningKeyBytes), so a weak key is
rejected before any token is issued. The misleading SecurityOptions XML doc was
corrected to state the requirement. Regression tests
JwtTokenService_EmptySigningKey_ThrowsAtConstruction,
JwtTokenService_ShortSigningKey_ThrowsAtConstruction, and
JwtTokenService_AdequateSigningKey_ConstructsSuccessfully.
Security-004 — Search filter uses uid= while fallback DN construction uses cn=
| Severity | Medium |
| Category | Correctness & logic bugs |
| Status | Resolved |
| Location | src/ZB.MOM.WW.ScadaBridge.Security/LdapAuthService.cs:66, :138, :157-159 |
Description
AuthenticateAsync and ResolveUserDnAsync build the search filter as
(uid={username}), but the no-service-account fallback in ResolveUserDnAsync
constructs the bind DN as cn={username},{LdapSearchBase}. The SecurityOptions.LdapServiceAccountDn
XML comment also documents the fallback as cn={username},{LdapSearchBase}. A directory
keyed on uid will succeed via search-then-bind but fail via the direct-bind fallback
(and vice versa). The attribute used for lookup is hard-coded and inconsistent across
the two code paths, so the two configuration modes are not interchangeable.
Recommendation
Introduce a single configurable LdapUserIdAttribute (default uid) and use it
consistently in both the search filter and the fallback DN. Update the XML doc to match.
Resolution
Resolved 2026-05-16 (commit pending). Confirmed: the search filter was hard-coded
(uid={username}) (both in AuthenticateAsync and ResolveUserDnAsync) while the
fallback DN used cn={username} — the two auth modes were not interchangeable. Added
a configurable SecurityOptions.LdapUserIdAttribute (default uid) used for both the
search filter and the fallback DN via the new BuildFallbackUserDn helper, and
corrected the LdapServiceAccountDn XML doc to reference {LdapUserIdAttribute}.
Regression tests BuildFallbackUserDn_UsesConfiguredUserIdAttribute,
BuildFallbackUserDn_HonoursNonDefaultUserIdAttribute,
SecurityOptions_LdapUserIdAttribute_DefaultsToUid.
Security-005 — DN injection in the no-service-account bind fallback
| Severity | Medium |
| Category | Security |
| Status | Resolved |
| Location | src/ZB.MOM.WW.ScadaBridge.Security/LdapAuthService.cs:157-159 |
Description
When no service account is configured, the user-supplied username is interpolated
directly into a distinguished name: $"cn={username},{LdapSearchBase}". EscapeLdapFilter
escapes search-filter metacharacters, but DN construction requires a different
escaping scheme (RFC 4514 — ,, +, ", \, <, >, ;, leading/trailing spaces).
No DN escaping is applied here. A username such as victim,ou=admins alters the DN
structure, allowing a caller to attempt a bind as a different DN than intended. Combined
with the username.Contains('=') shortcut at line 129 — which lets a caller supply a
full arbitrary DN — the fallback path gives the client undue control over the bind
identity.
Recommendation
Apply RFC 4514 DN-component escaping to username before interpolation, or use the
LDAP library's DN-builder API. Reconsider the Contains('=') shortcut — accepting a
raw DN from untrusted input is risky; restrict it or remove it.
Resolution
Resolved 2026-05-16 (commit pending). Confirmed: the fallback path interpolated
the raw username straight into cn={username},{LdapSearchBase} with no DN escaping,
and the username.Contains('=') shortcut let a caller supply an arbitrary bind DN.
Added an RFC 4514 EscapeLdapDn helper (escapes , + " \ < > ;, leading/trailing
space, leading #, NUL) applied in BuildFallbackUserDn, so a username such as
victim,ou=admins can no longer alter the DN structure. The Contains('=') raw-DN
shortcut was removed entirely — untrusted input no longer controls the bind identity.
Regression tests BuildFallbackUserDn_EscapesDnMetacharacters,
EscapeLdapDn_EscapesAllRfc4514Specials, EscapeLdapDn_EscapesLeadingAndTrailingSpaces.
Security-006 — JWT validation disables issuer and audience checks
| Severity | Medium |
| Category | Security |
| Status | Resolved |
| Location | src/ZB.MOM.WW.ScadaBridge.Security/JwtTokenService.cs:67-75, :56-59 |
Description
ValidateToken sets ValidateIssuer = false and ValidateAudience = false, and
GenerateToken never sets an iss or aud. With a shared symmetric HMAC key, any
other system or component that signs JWTs with the same key would produce tokens this
service accepts. While the design states the key is shared only between the two central
nodes, omitting issuer/audience binding removes a cheap defense-in-depth control and
makes accidental key reuse (e.g. the same secret used for another internal token)
silently exploitable.
Recommendation
Set a fixed Issuer and Audience (e.g. "scadabridge-central") when generating tokens
and enable ValidateIssuer/ValidateAudience with the matching expected values during
validation.
Resolution
Resolved 2026-05-16 (commit pending). Confirmed: GenerateToken set neither iss
nor aud and ValidateToken had ValidateIssuer = false/ValidateAudience = false.
GenerateToken now binds JwtTokenService.TokenIssuer/TokenAudience
(both "scadabridge-central") into every token, and ValidateToken enables
ValidateIssuer/ValidateAudience against those fixed values — a token signed with
the shared key but a foreign issuer is now rejected. Regression tests
GenerateToken_SetsIssuerAndAudience, ValidateToken_RejectsTokenWithWrongIssuer,
ValidateToken_AcceptsOwnIssuerAndAudience.
Security-007 — Idle-timeout claim is reset on every token refresh
| Severity | Medium |
| Category | Correctness & logic bugs |
| Status | Resolved |
| Location | src/ZB.MOM.WW.ScadaBridge.Security/JwtTokenService.cs:40, :111-123 |
Description
The design states the 30-minute idle timeout is tracked via a "last-activity timestamp
in the token", and IsIdleTimedOut reads the LastActivity claim. But RefreshToken
calls GenerateToken, which unconditionally writes LastActivity = DateTimeOffset.UtcNow.
Token refresh fires whenever a request arrives within ~5 minutes of expiry. The result
is that LastActivity reflects token issuance time, not genuine user activity — and
since refresh itself is a request, the timestamp keeps moving forward. A more subtle
consequence: the idle window is effectively measured from the last refresh, not the
last real interaction, so the documented "no requests within the idle window" semantics
are not faithfully implemented. The claim name LastActivity is also misleading.
Recommendation
Decide explicitly how activity is tracked. Either (a) carry the original LastActivity
forward across refreshes and update it only on real request handling in the middleware,
or (b) rename the claim to IssuedAt/TokenCreated and document that the idle window
is measured from issuance. Whichever is chosen, ensure IsIdleTimedOut and the refresh
path agree on the semantics.
Resolution
Resolved 2026-05-16 (commit pending). Confirmed: RefreshToken → GenerateToken
unconditionally wrote LastActivity = UtcNow, so the idle clock reset on every
refresh and the documented 30-minute idle timeout could never fire for a client that
polls in the background. Implemented option (a) — the Security-side half of the
documented "15-min sliding + 30-min idle" policy (the cross-module partner of
CentralUI-005): GenerateToken now takes an optional lastActivity anchor;
RefreshToken carries the existing LastActivity claim forward unchanged, and a
new explicit RecordActivity method advances the anchor to now — to be called by the
CentralUI request pipeline on genuine user interaction (not on a background refresh).
IsIdleTimedOut is unchanged and now agrees with the refresh path. The remaining
CentralUI-side wiring (calling RecordActivity from the middleware, setting
SlidingExpiration/ExpireTimeSpan) stays tracked under CentralUI-005; this finding's
Security-side defect — the reset-on-refresh bug — is fully fixed here. Regression tests
RefreshToken_PreservesOriginalLastActivityClaim,
RefreshToken_DoesNotResetIdleTimeoutWhenUserIsActuallyIdle,
RecordActivity_UpdatesLastActivityToNow.
Security-008 — N+1 query loading site-scope rules in RoleMapper
| Severity | Low |
| Category | Performance & resource management |
| Status | Deferred |
| Location | src/ZB.MOM.WW.ScadaBridge.Security/RoleMapper.cs:25-48 |
Description
MapGroupsToRolesAsync first calls GetAllMappingsAsync, then inside the per-mapping
loop calls GetScopeRulesForMappingAsync(mapping.Id, ct) once for every matched
Deployment mapping. This is an N+1 query pattern executed on the login hot path and on
every 15-minute token refresh. With multiple site-scoped Deployment groups it issues a
round-trip per group.
Recommendation
Add a repository method that loads scope rules for a set of mapping IDs in one query (or eager-loads them with the mappings), and resolve all scope rules with a single call.
Resolution
Deferred 2026-05-16 (commit pending commit). Finding confirmed accurate: RoleMapper.MapGroupsToRolesAsync
issues one GetScopeRulesForMappingAsync round-trip per matched Deployment mapping — a
genuine N+1 on the login / 15-minute-refresh path. However, the only correct fix
(a batch GetScopeRulesForMappingsAsync(IEnumerable<int>) repository method, or an
eager-load navigation property) requires changes to ISecurityRepository
(src/ZB.MOM.WW.ScadaBridge.Commons) and SecurityRepository (src/ZB.MOM.WW.ScadaBridge.ConfigurationDatabase).
Both are outside the Security module's permitted edit scope for this review pass, and the
existing ISecurityRepository surface offers no per-set scope-rule query, so the N+1
cannot be removed from within RoleMapper.cs alone. Severity is Low (bounded by the
number of site-scoped Deployment groups, typically small). Deferred to a cross-module
change that adds the batch repository method; RoleMapper should then resolve all scope
rules in a single call.
Security-009 — CancellationToken not honored inside Task.Run LDAP calls
| Severity | Low |
| Category | Error handling & resilience |
| Status | Resolved |
| Location | src/ZB.MOM.WW.ScadaBridge.Security/LdapAuthService.cs:42, :46, :51, :56-57, :67-73, :135, :139-145 |
Description
The synchronous Novell LDAP calls are wrapped in Task.Run(() => ..., ct). The ct
argument only prevents the work item from starting if cancellation is already
signaled; once a connection.Connect/Bind/Search call is in progress it cannot be
cancelled. A cancelled or timed-out login request will continue to occupy a thread-pool
thread and an LDAP connection until the blocking call returns on its own. There is also
no explicit network/operation timeout configured on the LdapConnection.
Recommendation
Configure LdapConnection.ConnectionTimeout and search/operation time limits so a
hung LDAP server cannot pin a thread indefinitely. Document that ct only guards
work-item scheduling, or implement a timeout-with-disconnect fallback.
Resolution
Resolved 2026-05-16 (commit pending commit). Confirmed: the synchronous Novell LDAP
calls were wrapped in Task.Run(..., ct) where ct only prevents the work item from
starting and cannot interrupt an in-progress blocking Connect/Bind/Search, and no
network/operation timeout was configured on the LdapConnection. Added a configurable
SecurityOptions.LdapConnectionTimeoutMs (default 10s) and a new ApplyConnectionTimeout
helper that sets both LdapConnection.ConnectionTimeout (socket connect) and
LdapConstraints.TimeLimit (per-operation limit) before connecting, so a hung LDAP
server can no longer pin a thread-pool thread indefinitely. The ct-only-guards-scheduling
limitation is now documented in the option's XML doc and an inline comment. Regression
tests SecurityOptions_LdapConnectionTimeout_HasSaneDefault and
AuthenticateAsync_UnreachableHost_FailsWithinConfiguredTimeout.
Security-010 — Design doc contradicts itself on Windows Integrated Authentication
| Severity | Low |
| Category | Design-document adherence |
| Status | Resolved |
| Location | docs/requirements/Component-Security.md:13 (vs. :23) |
Description
The Responsibilities section states the component authenticates "using Windows Integrated Authentication", but the Authentication section (line 23) and CLAUDE.md explicitly state "No Windows Integrated Authentication ... authenticates directly against LDAP/AD, not via Kerberos/NTLM" — which is what the code actually does (direct LDAP bind). The Responsibilities line is stale and contradicts both the rest of the doc and the implementation.
Recommendation
Fix Component-Security.md:13 to say "using a direct LDAP/Active Directory bind"
to match the implemented behavior and the rest of the document.
Resolution
Resolved 2026-05-16 (commit pending commit). Confirmed: the Responsibilities bullet at
line 13 said "using Windows Integrated Authentication", directly contradicting the
Authentication section's "No Windows Integrated Authentication ... authenticates directly
against LDAP/AD, not via Kerberos/NTLM", CLAUDE.md, and the implementation (LdapAuthService
performs a direct username/password bind). Documentation-only finding — no regression test
is meaningful. Reworded the Responsibilities bullet to "Authenticate users against
LDAP/Active Directory using a direct LDAP/AD bind (username/password)", matching the rest
of the document and the code.
Security-011 — Missing tests for security-critical paths
| Severity | Low |
| Category | Testing coverage |
| Status | Resolved |
| Location | tests/ZB.MOM.WW.ScadaBridge.Security.Tests/UnitTest1.cs |
Description
The test suite covers happy paths well but omits several security-relevant cases:
no test exercises the StartTLS path (Security-001), the DN-injection / Contains('=')
fallback inputs (Security-005), JWT validation with a too-short or empty signing key
(Security-003), IsIdleTimedOut returning true after a token has been refreshed
(Security-007), or the uid/cn mismatch in the no-service-account path (Security-004).
The integration SecurityHardeningTests only asserts default option values, not
enforcement. The test file is still named UnitTest1.cs.
Recommendation
Add negative/edge-case tests for the items above, particularly key-length rejection,
DN-escaping of hostile usernames, and idle-timeout behavior across a refresh. Rename
UnitTest1.cs to a descriptive name.
Resolution
Resolved 2026-05-16 (commit pending commit). Re-triage: most of the listed gaps were
already closed by the Security-001..007 fixes — the regression classes
SecurityReviewRegressionTests (StartTLS path, JWT empty/short-key rejection) and
SecurityReviewRegressionTests2 (DN-injection / hostile-username escaping, uid/cn
fallback consistency, idle-timeout preserved across refresh) cover them. The finding's
reference to a SecurityHardeningTests class is stale — no such class exists in the
suite. Remaining gaps closed here: renamed the test file UnitTest1.cs →
SecurityTests.cs, and added a new SecurityReviewRegressionTests3 class with the
LDAP-timeout coverage (Security-009) plus extra no-service-account / DN-path edge cases
(BuildFallbackUserDn_NoSearchBase_ReturnsBareRdn, EscapeLdapDn_LeadingHash_IsEscaped,
EscapeLdapDn_NullOrEmpty_ReturnedUnchanged). Full module suite: 54 tests, all green.
Security-012 — Partial LDAP failure during login yields a roleless authenticated session
| Severity | Medium |
| Category | Error handling & resilience |
| Status | Resolved |
| Location | src/ZB.MOM.WW.ScadaBridge.Security/LdapAuthService.cs:78-118 |
Description
In AuthenticateAsync, after the user bind succeeds, the group/attribute Search is
wrapped in a try/catch (LdapException) that logs a warning and continues — the
comment states "Auth succeeded even if attribute lookup failed". The method then returns
new LdapAuthResult(true, displayName, username, groups, null) with an empty
groups list. Because RoleMapper.MapGroupsToRolesAsync derives all roles from that
group list, a login during a partial LDAP outage (bind reachable, search/group server
unavailable, or a transient LdapException mid-search) produces a fully authenticated
session with zero roles — the user is logged in but silently stripped of all
permissions. The design's LDAP Connection Failure rule states new logins must fail
when LDAP is unavailable; this path instead admits the user. The caller cannot
distinguish "user genuinely belongs to no mapped groups" from "the group query failed",
so it cannot make the documented fail-the-login decision. The while loop's inner
catch (LdapException) { break; } (line 107-111) compounds this: a real error mid-enumeration
is indistinguishable from end-of-results and silently truncates the group list.
Recommendation
Treat a failed group/attribute lookup on the initial login as an authentication
failure: return Success = false with a "directory unavailable, try again" message, or
add an explicit flag on LdapAuthResult (e.g. GroupLookupSucceeded) so the caller can
apply the design's fail-the-login rule. The inner while-loop catch should not treat
an arbitrary LdapException as a benign end-of-results sentinel — only the specific
"no more results" condition should break the loop; any other exception should propagate
or be surfaced.
Resolution
Resolved 2026-05-17 (commit pending). Confirmed: the group/attribute Search was
wrapped in a catch (LdapException) that logged a warning and returned
new LdapAuthResult(true, …, groups: []) — a partial LDAP outage produced an
authenticated, zero-role session — and the inner while-loop catch { break; } masked
real mid-enumeration errors as end-of-results. AuthenticateAsync now tracks a
groupLookupSucceeded flag (false when any LdapException is thrown by the search or
Next()); the inner swallowing catch was removed so such errors propagate; the new
BuildAuthResultFromGroupLookup helper returns a FAILED LdapAuthResult
("directory temporarily unavailable, please try again") on lookup failure while still
treating a genuine empty-groups result as a successful login. Regression tests
BuildAuthResultFromGroupLookup_LookupFailed_FailsTheLogin,
BuildAuthResultFromGroupLookup_LookupSucceededNoGroups_IsAuthenticated,
BuildAuthResultFromGroupLookup_LookupSucceededWithGroups_CarriesGroups.
Security-013 — ExtractFirstRdnValue mis-parses group DNs containing escaped commas
| Severity | Low |
| Category | Correctness & logic bugs |
| Status | Resolved |
| Location | src/ZB.MOM.WW.ScadaBridge.Security/LdapAuthService.cs:258-269 |
Description
ExtractFirstRdnValue extracts a group name from a memberOf DN by taking the substring
between the first = and the first ,. RFC 4514 permits a , inside an RDN value when
it is backslash-escaped — e.g. cn=Acme\, Inc Operators,ou=groups,dc=example,dc=com. The
method splits on the escaped comma and returns Acme\ as the group name. RoleMapper
then matches that mangled name verbatim against configured LdapGroupMapping.LdapGroupName
values, so a group whose CN legitimately contains a comma silently fails to map to its
role — the user loses that role with no error. The same naive parsing also does not strip
the trailing backslash escape.
Recommendation
Parse the first RDN with an escape-aware scan: ignore a , preceded by an unescaped
\, and unescape RFC 4514 sequences in the extracted value. Alternatively use the LDAP
library's DN-parsing API (LdapDN / RDN accessors) rather than hand-rolled IndexOf.
Resolution
Resolved 2026-05-17 (commit pending). Confirmed: ExtractFirstRdnValue took the
substring between the first = and the first ,, splitting on an RFC 4514
backslash-escaped comma so a CN like Acme\, Inc Operators became Acme\. Replaced the
naive IndexOf scan with an escape-aware character scan that ignores a backslash-escaped
,, unescapes single-character and \XX hex escape sequences, and only terminates the
value on an unescaped comma — so a comma-containing group CN is returned intact and
matches its configured LdapGroupName. Method was also made public for direct
testing. Regression tests ExtractFirstRdnValue_EscapedComma_KeepsWholeGroupName,
ExtractFirstRdnValue_PlainDn_ReturnsFirstRdnValue,
ExtractFirstRdnValue_SingleRdn_ReturnsValue,
ExtractFirstRdnValue_EscapedSpecials_AreUnescaped.
Security-014 — RefreshToken re-issues a token without checking the idle timeout
| Severity | Medium |
| Category | Security |
| Status | Resolved |
| Location | src/ZB.MOM.WW.ScadaBridge.Security/JwtTokenService.cs:156-169 |
Description
RefreshToken carries the existing LastActivity claim forward (correct, per the
Security-007 fix) and unconditionally issues a fresh token with a new 15-minute expiry.
It never calls IsIdleTimedOut. The documented 30-minute idle policy therefore depends
entirely on the CentralUI request pipeline calling IsIdleTimedOut before calling
RefreshToken — two independent, order-dependent checks with no enforcement that they
are used together. If a caller refreshes without first checking idle state (an easy
mistake, and there is no compiler or API signal preventing it), an idle-expired session
is silently renewed: each refresh resets the JWT expiry while LastActivity keeps
ageing, so the session can be kept alive indefinitely by background refreshes even
though the user has been idle for hours. RefreshToken is the natural place to enforce
the invariant but provides no safety net.
Recommendation
Have RefreshToken itself reject a principal that is already past the idle timeout —
return null (the same "cannot refresh" signal it already uses for missing claims) when
IsIdleTimedOut(currentPrincipal) is true — so the idle policy holds regardless of
caller discipline. Document that an idle-expired principal cannot be refreshed and add a
regression test covering refresh of a 31-minute-idle token.
Resolution
Resolved 2026-05-17 (commit pending). Confirmed: RefreshToken carried LastActivity
forward and minted a fresh 15-minute token without ever calling IsIdleTimedOut, so the
idle policy depended entirely on caller discipline and a background refresh could keep an
idle-expired session alive indefinitely. RefreshToken now calls
IsIdleTimedOut(currentPrincipal) after the missing-claims check and returns null
(its existing "cannot refresh" signal) when the session is past the idle window, so the
30-minute idle policy holds regardless of caller order; XML doc updated to state the
invariant. Regression tests RefreshToken_IdleExpiredPrincipal_ReturnsNull,
RefreshToken_ActiveSessionWithinIdleWindow_StillRefreshes,
RefreshToken_MissingLastActivityClaim_ReturnsNull.
Security-015 — Username is not trimmed before use in the LDAP filter, fallback DN, and JWT claims
| Severity | Low |
| Category | Correctness & logic bugs |
| Status | Resolved |
| Location | src/ZB.MOM.WW.ScadaBridge.Security/LdapAuthService.cs:20-21, :80, :122, :169, :193 |
Description
AuthenticateAsync rejects null/whitespace-only usernames via IsNullOrWhiteSpace, but
an otherwise-valid username with leading or trailing whitespace (" alice", copy-paste
artefacts, mobile keyboards) is passed through verbatim. It flows into the LDAP search
filter ({attr}={EscapeLdapFilter(username)}), into the fallback bind DN via
BuildFallbackUserDn, and — most consequentially — into the returned LdapAuthResult
and from there into the JWT Username claim. Most LDAP directories ignore surrounding
whitespace on a search term, so " alice" and "alice" may both authenticate but yield
JWT Username claims that differ by whitespace. Any downstream identity comparison
(audit-log who, site-scope lookups keyed by username, session de-duplication) then
sees two distinct identities for the same person.
Recommendation
Trim the username once at the top of AuthenticateAsync (after the IsNullOrWhiteSpace
guard) and use the trimmed value consistently for the filter, the DN, and the
LdapAuthResult. This normalises the identity that ends up in the JWT claim and audit
trail.
Resolution
Resolved 2026-05-17 (commit pending). Confirmed: an otherwise-valid username with
leading/trailing whitespace passed the IsNullOrWhiteSpace guard and flowed verbatim
into the LDAP filter, the fallback DN, and the returned LdapAuthResult (and thus the
JWT Username claim). Added a NormalizeUsername helper (trims whitespace) called once
at the top of AuthenticateAsync immediately after the guard; the local username
variable is reassigned to the trimmed value so the filter, fallback DN, and result all
use the single canonical identity. Regression tests
NormalizeUsername_TrimsLeadingAndTrailingWhitespace,
BuildFallbackUserDn_TrimmedUsername_NoLeadingTrailingSpace,
AuthenticateAsync_UsernameWithSurroundingWhitespace_StillRejectedForInsecure.
Security-016 — RoleMapper silently drops the system-wide Deployment grant when a user is also in any site-scoped Deployment group
| Severity | Medium |
| Category | Correctness & logic bugs |
| Status | Resolved |
| Location | src/ZB.MOM.WW.ScadaBridge.Security/RoleMapper.cs:30-31, :41-55, :59 |
Description
MapGroupsToRolesAsync resolves the Deployment role's site scope as a single
isSystemWide = hasDeploymentRole && !hasDeploymentWithScopeRules flag computed across
ALL matched Deployment mappings. If a user is a member of both a system-wide Deployment
group (e.g. SCADA-Deploy-All, no scope rules) AND a site-scoped Deployment group
(e.g. SCADA-Deploy-SiteA, one scope rule for Site A), the second mapping sets
hasDeploymentWithScopeRules = true, so the final isSystemWide becomes false and
the returned PermittedSiteIds is just [SiteA]. The system-wide grant from
SCADA-Deploy-All is silently dropped — the user loses access to every other site, even
though one of their LDAP groups was intended to grant them system-wide reach. This
contradicts the design's "A user can hold multiple roles simultaneously … roles are
independent — there is no implied hierarchy" intent: the union of grants should be the
broadest grant in the set, not the narrowest. The mistake is also non-obvious to an
operator: from the Admin → LDAP Mappings page nothing flags that adding a site-scoped
Deployment mapping for a user already in SCADA-Deploy-All removes sites from their
effective grant. The downstream SiteScopeService.IsSystemWideAsync / FilterSitesAsync
faithfully reproduce this narrowing, so the user can no longer see or act on sites
outside [SiteA].
Recommendation
Track the union semantics explicitly: if any matched Deployment mapping has no scope
rules, the user is system-wide regardless of what other mappings have. The simplest
change is to set hasDeploymentWithScopeRules only when the mapping has scope rules
AND another flag hasUnscopedDeploymentMapping is false; then compute
isSystemWide = hasUnscopedDeploymentMapping || (hasDeploymentRole && !hasDeploymentWithScopeRules).
Equivalently: collect per-mapping (hasRules, scopedSiteIds) first, then
isSystemWide = any mapping has hasRules==false, and permittedSiteIds = union of all scopedSiteIds (left empty for system-wide users). Add a regression test
MapGroupsToRoles_UserInBothSystemWideAndScopedDeploymentGroup_IsSystemWide covering
the design's example pair SCADA-Deploy-All + SCADA-Deploy-SiteA.
Resolution
Resolved 2026-05-28 (commit pending). RoleMapper.MapGroupsToRolesAsync now tracks two
independent flags per matched Deployment mapping: hasUnscopedDeploymentMapping (any
matched mapping with no scope rules) and hasScopedDeploymentMapping (any matched
mapping with scope rules). isSystemWide is hasUnscopedDeploymentMapping || (hasDeploymentRole && !hasScopedDeploymentMapping) — so a user in both
SCADA-Deploy-All and SCADA-Deploy-SiteA is now correctly system-wide, with the
accumulated scope ids cleared. Magic string "Deployment" was replaced with the new
Roles.Deployment constant (Security-018). Regression test
MapGroupsToRoles_UserInBothSystemWideAndScopedDeploymentGroup_IsSystemWide
(seeds Site A, attaches a scope rule to the seeded SCADA-Deploy-SiteA mapping, and
asserts a user mapped via both SCADA-Deploy-All and SCADA-Deploy-SiteA resolves
to system-wide with empty PermittedSiteIds) fails on the pre-fix code and passes
after.
Security-017 — SiteScopeRequirement / SiteScopeAuthorizationHandler are dead code from production callers — [Authorize(Policy = RequireDeployment)] does NOT enforce site scoping
| Severity | Medium |
| Category | Design-document adherence |
| Status | Resolved |
| Location | src/ZB.MOM.WW.ScadaBridge.Security/SiteScopeAuthorizationHandler.cs:8-58; src/ZB.MOM.WW.ScadaBridge.Security/AuthorizationPolicies.cs:113-143 |
Description
The module declares SiteScopeRequirement (an IAuthorizationRequirement carrying a
TargetSiteId) and the matching SiteScopeAuthorizationHandler that combines the
Deployment role claim with the SiteId claims to enforce the design's site-scoping
rule. The handler is registered in AddScadaBridgeAuthorization
(services.AddSingleton<IAuthorizationHandler, SiteScopeAuthorizationHandler>()). But
no AddPolicy call ever wires the requirement to a named policy, and a grep across
src/ZB.MOM.WW.ScadaBridge.CentralUI and src/ZB.MOM.WW.ScadaBridge.ManagementService confirms that no
production code ever instantiates new SiteScopeRequirement(...) or calls
AuthorizeAsync(...) with one — the only callers are the unit tests in
SecurityTests.cs:1146,1166,1185,1203. The design + CLAUDE.md state that "Deployment
and Monitoring pages must filter every site/instance list through FilterSitesAsync
and re-check IsSiteAllowedAsync before any cross-site command", and the
CentralUI-028 finding (High, Open) confirms this is exactly the contract two new
report pages forgot — because there is no declarative [Authorize(Policy = ...)]
shortcut, callers must remember to inject SiteScopeService and write the check by
hand, and any new page that forgets is a silent regression with no compile-time or
test-time signal. The module's published surface advertises an authorization-handler
pattern that is, in practice, unwired plumbing.
Recommendation
Either (a) delete SiteScopeRequirement and SiteScopeAuthorizationHandler (and
the dead IAuthorizationHandler registration) and document SiteScopeService as the
sole site-scoping mechanism — this is the smaller change and matches what the codebase
actually does today; or, preferably, (b) finish the wiring: add a RequireSiteScope
policy that uses SiteScopeRequirement and provide a small helper / source generator
or analyzer that flags Deployment-policy-attributed pages without a site-scope check.
Either way, address the cross-module gap: CentralUI-028 stays open until production
pages reliably enforce the rule. If (b) is chosen, a route-parameter-aware
IAuthorizationPolicyProvider is needed so the policy can read the target site id from
the request — that is a meaningful design extension and would need to be planned
alongside the Central UI's existing SiteScopeService usage rather than replacing it
piecemeal.
Resolution
Resolved 2026-05-28 (commit pending) per recommendation option (a): deleted
SiteScopeRequirement and SiteScopeAuthorizationHandler outright, along with the
unwired services.AddSingleton<IAuthorizationHandler, SiteScopeAuthorizationHandler>()
registration in AuthorizationPolicies.AddScadaBridgeAuthorization and the four
isolation tests in SecurityTests.cs. SiteScopeService (CentralUI-002) is now
documented as the sole site-scoping mechanism — the stale "mirrors
SiteScopeAuthorizationHandler" comment in SiteScopeService.ResolveAsync was rewritten
to say so. The cross-module partner CentralUI-028 is fixed in the same batch (the two
new report pages now consume SiteScopeService).
Security-018 — Role names are hard-coded magic strings duplicated across RoleMapper, SiteScopeAuthorizationHandler, and AuthorizationPolicies
| Severity | Low |
| Category | Code organization & conventions |
| Status | Resolved |
| Location | src/ZB.MOM.WW.ScadaBridge.Security/RoleMapper.cs:41; src/ZB.MOM.WW.ScadaBridge.Security/SiteScopeAuthorizationHandler.cs:36; src/ZB.MOM.WW.ScadaBridge.Security/AuthorizationPolicies.cs:118,121,124,95,107 |
Description
The role-name literals "Admin", "Design", "Deployment", "Audit", and
"AuditReadOnly" are duplicated as magic strings across three separate files:
RoleMapper.cs:41 hard-codes "Deployment" to detect the site-scope branch;
SiteScopeAuthorizationHandler.cs:36 independently hard-codes "Deployment" to gate
the handler; and AuthorizationPolicies.cs:118,121,124 hard-code the four role names
as the policy RequireClaim values. Only the audit roles have a single source of truth
(via the OperationalAuditRoles / AuditExportRoles arrays on
AuthorizationPolicies). A future rename or addition of a role that misses any one of
these call sites silently breaks the system: e.g. renaming "Deployment" → "Deployer"
in RoleMapper alone would leave the policy still requiring "Deployment" (logins
get the new role name but the policy never matches), while changing it in the policy
alone would leave RoleMapper failing to populate scope rules for the renamed role.
The bug class is "string drift" — exactly the kind the OperationalAuditRoles constant
was introduced to prevent.
Recommendation
Introduce a public static class Roles { public const string Admin = "Admin"; public const string Design = "Design"; public const string Deployment = "Deployment"; public const string Audit = "Audit"; public const string AuditReadOnly = "AuditReadOnly"; } in the Security
project and replace every magic-string occurrence — including the elements of
OperationalAuditRoles and AuditExportRoles — with the constants. A single rename
will then either succeed everywhere or fail to compile.
Resolution
Resolved 2026-05-28 (commit pending). Added src/ZB.MOM.WW.ScadaBridge.Security/Roles.cs holding
Admin/Design/Deployment/Audit/AuditReadOnly as public const string
fields. Replaced every magic-string occurrence in this module:
RoleMapper.MapGroupsToRolesAsync now compares against Roles.Deployment;
AuthorizationPolicies.AddScadaBridgeAuthorization binds RequireClaim(...) to
Roles.{Admin,Design,Deployment}; OperationalAuditRoles /
AuditExportRoles are now built from Roles.Admin, Roles.Audit, Roles.AuditReadOnly.
SiteScopeAuthorizationHandler.cs was deleted under Security-017, so its
"Deployment" literal is gone with it. A future rename now propagates by a
single edit or fails to compile.
Security-019 — Service-account rebind failure is reported as "Invalid username or password" — masks misconfiguration as a user-credential error
| Severity | Low |
| Category | Error handling & resilience |
| Status | Resolved |
| Location | src/ZB.MOM.WW.ScadaBridge.Security/LdapAuthService.cs:85-89, :147-151 |
Description
After the user's credentials bind successfully, AuthenticateAsync re-binds as the
configured service account to perform the group/attribute search
(connection.Bind(_options.LdapServiceAccountDn, _options.LdapServiceAccountPassword)).
A failure of this second bind — wrong service-account password, deleted/disabled
service-account, locked-out service-account — throws LdapException which is caught by
the broad outer catch (LdapException) and returned as
new LdapAuthResult(false, null, username, null, "Invalid username or password.").
The user sees an "invalid credentials" message for their credentials even though
their bind succeeded and the failure was in the system's own service-account
configuration. Worse, every user attempting to log in sees the same incorrect message
during a service-account outage, which routes operators down the wrong incident path
(reset the user's password) instead of the right one (check the service-account
credentials). The successful user bind itself is also not auditable as a discrete
event because the result is "Invalid username or password" — indistinguishable from a
genuine bad-password attempt.
Recommendation
Wrap the service-account rebind in its own try/catch (LdapException) and surface a
distinct error: log _logger.LogError(ex, "Service-account rebind failed; check LdapServiceAccountDn / LdapServiceAccountPassword configuration") and return
new LdapAuthResult(false, null, username, null, "Authentication service is misconfigured. Contact an administrator.").
Add a regression test that exercises the service-account-bind failure path (a mocked
or seamed LdapConnection.Bind that throws on the second call) and asserts the
distinct error message.
Resolution
Resolved 2026-05-28 (commit pending). Added a new ServiceAccountBindException
(src/ZB.MOM.WW.ScadaBridge.Security/ServiceAccountBindException.cs) — deliberately NOT an
LdapException subtype so it short-circuits the generic LDAP catch — and a private
BindServiceAccountAsync helper on LdapAuthService that wraps both service-account
rebind sites (the post-user-bind group-lookup rebind AND the ResolveUserDnAsync
DN-search rebind). On LdapException, the helper logs Error
("Service-account rebind failed; check LdapServiceAccountDn /
LdapServiceAccountPassword configuration") and rethrows as ServiceAccountBindException,
which the outer AuthenticateAsync catch chain maps to the distinct user-facing message
"Authentication service is misconfigured. Contact an administrator." Service-account
faults no longer surface as "Invalid username or password". Regression test
ServiceAccountBindException_DoesNotInheritLdapException_SoCatchOrderIsCorrect pins the
load-bearing inheritance contract; a full end-to-end auth test that exercises the
service-account-bind failure path is not feasible without an LDAP seam in
LdapAuthService (the LdapConnection is constructed in-method), so the structural test
is the closest meaningful unit-level coverage.
Security-020 — SecurityOptions has no startup validation for required fields (LdapServer, LdapSearchBase)
| Severity | Low |
| Category | Error handling & resilience |
| Status | Resolved |
| Location | src/ZB.MOM.WW.ScadaBridge.Security/SecurityOptions.cs:6-7, :36-37; src/ZB.MOM.WW.ScadaBridge.Security/ServiceCollectionExtensions.cs:13-30 |
Resolution (2026-05-28): added SecurityOptionsValidator
(IValidateOptions<SecurityOptions>) that rejects empty/whitespace
LdapServer and LdapSearchBase with messages naming the full
Security:Field key the operator would edit. AddSecurity registers it via
services.AddOptions<SecurityOptions>().ValidateOnStart() +
TryAddEnumerable(... SecurityOptionsValidator) so a misconfigured
Security section fails fast at boot rather than minutes later on the first
login. JwtSigningKey is deliberately left to JwtTokenService's existing
length-aware constructor guard (Security-003). Regression tests in
SecurityOptionsValidatorTests: valid-options succeed; empty/whitespace
LdapServer and LdapSearchBase each fail with the key-naming message
(theory); both-empty reports both keys; AddSecurity_RegistersSecurityOptionsValidator
pins the DI wiring.
Description
SecurityOptions.JwtSigningKey correctly fails fast at JwtTokenService construction
(Security-003 fix), but the LDAP-side required fields — LdapServer (default
string.Empty) and LdapSearchBase (default string.Empty) — have no equivalent
guard. AddSecurity does not register an IValidateOptions<SecurityOptions>. A
deployment that fails to set LdapServer (a typo in the appsettings.json section name,
a missing environment-variable substitution, a misconfigured Docker compose file)
starts cleanly — the Central UI comes up, the login page loads, and only the first
authentication attempt fails with LdapConnection.Connect("") throwing a low-level
exception that bubbles up as the generic "An unexpected error occurred during
authentication." message. The misconfiguration surfaces minutes or hours into the
deploy, on the first real user login, rather than at startup where it is cheap to
diagnose.
Recommendation
Add an IValidateOptions<SecurityOptions> registered via
services.AddOptions<SecurityOptions>().ValidateOnStart() that fails when
LdapServer is null/whitespace, LdapSearchBase is null/whitespace, or
LdapPort <= 0. Combine with the existing JwtTokenService constructor check so
every required SecurityOptions field is enforced at startup, not at first use.
Security-021 — RequireHttpsCookie=false dev opt-out has no warning path — an HTTP production deployment silently transmits the JWT bearer credential in cleartext
| Severity | Low |
| Category | Security |
| Status | Resolved |
| Location | src/ZB.MOM.WW.ScadaBridge.Security/SecurityOptions.cs:100-108; src/ZB.MOM.WW.ScadaBridge.Security/ServiceCollectionExtensions.cs:54-59 |
Resolution (2026-05-28): Added ILoggerFactory to the cookie-options
Configure callback in AddSecurity so an explicit RequireHttpsCookie=false
opt-out now emits a startup LogWarning ("auth cookie SecurePolicy is
SameAsRequest. The cookie-embedded JWT will be transmitted in cleartext
over plain HTTP. This setting is intended for local dev only — set
SecurityOptions:RequireHttpsCookie=true in production."). Default is still
true, so production deployments unchanged. The "fail startup when
RequireHttpsCookie=false AND ASPNETCORE_ENVIRONMENT=Production" hard-stop
option was not implemented (the dev Docker cluster intentionally runs with
the flag false, and the env-var name varies across deploy mechanisms);
the warning is the right ergonomic floor.
Description
The Security-002 fix added RequireHttpsCookie (default true) so the auth cookie's
SecurePolicy is Always in production. The current Docker dev cluster sets
RequireHttpsCookie=false in both central nodes' appsettings.Central.json, downgrading
to SameAsRequest so the local HTTP cluster works. The downgrade is documented in the
XML doc but is silent at runtime: no log line warns that the cookie carrying the JWT
bearer credential is being sent over an HTTP-only path. A production deployment that
inherits a dev-derived appsettings — or that copy-pastes the docker config and forgets
to flip the flag — transmits the session token in cleartext with no diagnostic signal.
The default is correct; the gap is that the unsafe override has no operational guard.
Recommendation
In the PostConfigure block in AddSecurity, when RequireHttpsCookie == false, log
a single startup warning along the lines of _logger.LogWarning("RequireHttpsCookie is DISABLED — auth cookie SecurePolicy is SameAsRequest. The cookie-embedded JWT will be transmitted over plain HTTP. This setting is intended for local dev only — set SecurityOptions:RequireHttpsCookie=true in production."). Optionally, also fail
startup when RequireHttpsCookie=false AND ASPNETCORE_ENVIRONMENT=Production. Add a
regression test that asserts the warning is emitted when the flag is disabled and not
when it is enabled.