Refine Security & Auth: LDAP bind, JWT sessions, idle timeout, failure handling
Replace Windows Integrated Auth with direct LDAP bind (username/password login form). Add JWT-based sessions with HMAC-SHA256 shared key for load balancer compatibility. 15-minute token refresh re-queries LDAP for current group memberships. 30-minute configurable idle timeout. LDAP failure: new logins fail, active sessions continue with current roles until LDAP recovers.
51
docs/plans/2026-03-16-security-auth-refinement-design.md
Normal file
@@ -0,0 +1,51 @@
# Security & Auth Refinement — Design
**Date**: 2026-03-16
**Component**: Security & Auth (`Component-Security.md`)
**Status**: Approved
## Problem
The Security & Auth doc defined roles and LDAP mapping but lacked specification for the authentication mechanism (previously stated Kerberos/NTLM, changed to direct LDAP bind), session management, token format, idle timeout, LDAP failure handling, and load balancer compatibility.
## Decisions
### Authentication Mechanism
- **Direct LDAP bind** with username/password. No Windows Integrated Authentication (Kerberos/NTLM).
- User provides credentials in a login form. App validates against LDAP/AD and retrieves group memberships.
- No local credential store or caching.
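
The login flow above can be sketched as follows. This is a minimal illustration, not the project's implementation: `Directory`, `FakeDirectory`, `LoginResult`, and `login` are hypothetical names, and a real deployment would back `Directory` with an actual LDAP client library. The empty-password guard is an added assumption, included because many LDAP servers treat a simple bind with an empty password as an unauthenticated bind that succeeds.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Protocol, Tuple


class Directory(Protocol):
    """Hypothetical wrapper around an LDAP client; not a real library API."""

    def bind(self, username: str, password: str) -> bool: ...

    def groups_of(self, username: str) -> List[str]: ...


@dataclass(frozen=True)
class LoginResult:
    username: str
    groups: Tuple[str, ...]


def login(directory: Directory, username: str, password: str) -> Optional[LoginResult]:
    """Validate credentials by binding as the user, then read group memberships.

    Returns None when the credentials are rejected. The password is only
    passed through to the directory; it is never stored or cached.
    """
    # Reject empty credentials before contacting the directory (assumption:
    # the server might accept an empty-password bind as unauthenticated).
    if not username or not password:
        return None
    if not directory.bind(username, password):
        return None
    return LoginResult(username, tuple(directory.groups_of(username)))


class FakeDirectory:
    """In-memory stand-in used only to exercise the flow."""

    def __init__(self, users: Dict[str, Tuple[str, List[str]]]) -> None:
        self._users = users  # username -> (password, groups)

    def bind(self, username: str, password: str) -> bool:
        entry = self._users.get(username)
        return entry is not None and entry[0] == password

    def groups_of(self, username: str) -> List[str]:
        return list(self._users[username][1])
```

Keeping the directory behind a small interface also makes the "no local credential store" rule easy to audit: the password appears only as an argument to `bind`.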
### Session Management — JWT
- **JWT with shared symmetric signing key** (HMAC-SHA256). Both central nodes use the same key from configuration.
- **Claims**: user display name, username, roles, permitted site IDs (for site-scoped Deployment). All authorization from token claims — no per-request database lookup.
- **Load balancer compatible** — no server-side session state, no sticky sessions needed.
### Token Lifecycle
- **15-minute JWT expiry with sliding refresh**. On refresh, app re-queries LDAP for current group memberships and reissues token with updated claims. While LDAP is reachable, roles are never more than 15 minutes stale.
- **30-minute idle timeout** (configurable). If no requests within the idle window, user must re-login.
- Active users stay logged in indefinitely via sliding refresh.
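
The interaction between the two timers can be sketched as a single decision function. This is one possible reading of the rules above, not a specified mechanism: the names `Action` and `next_action` are hypothetical, and the sketch deliberately takes `last_activity` as an input without prescribing how the time of the last request is tracked, since the design does not say.

```python
from enum import Enum

TOKEN_LIFETIME_S = 15 * 60        # JWT expiry from the design
DEFAULT_IDLE_TIMEOUT_S = 30 * 60  # configurable idle timeout from the design


class Action(Enum):
    ACCEPT = "accept"    # token still valid, serve the request
    REFRESH = "refresh"  # re-query LDAP and reissue the token
    RELOGIN = "relogin"  # idle window exceeded, force a new login


def next_action(
    now: float,
    token_exp: float,
    last_activity: float,
    idle_timeout_s: float = DEFAULT_IDLE_TIMEOUT_S,
) -> Action:
    """Decide how to treat a request arriving at `now` (all times in seconds)."""
    # The idle check comes first: a long-idle user must log in again even
    # if a refresh would otherwise have been possible.
    if now - last_activity >= idle_timeout_s:
        return Action.RELOGIN
    if now >= token_exp:
        return Action.REFRESH
    return Action.ACCEPT
```

For example, with a token expiring at `t = 900`, a request at `t = 1000` from a user last seen at `t = 800` yields `Action.REFRESH`, while the same request from a user last seen at `t = 0` would still refresh (1000 s idle) and only a gap of 1800 s or more yields `Action.RELOGIN`.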
### LDAP Failure Handling
- **Fail closed for new logins** — can't authenticate without LDAP.
- **Grace period for active sessions** — valid JWTs continue to work with current roles. Token refresh skipped until LDAP recovers. Avoids disrupting active work during brief outages.
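
The asymmetry between new logins and active sessions could look roughly like this. All names here (`DirectoryUnavailable`, `start_session`, `refresh_claims`, and the callback parameters) are hypothetical, and the sketch treats LDAP groups as roles directly for brevity, whereas the real group-to-role mapping lives in `Component-Security.md`. It also leaves `exp` untouched when a refresh is skipped; how an expired token is treated during a longer outage is not stated in the design, so the sketch does not decide it.

```python
from typing import Callable, List, Optional


class DirectoryUnavailable(Exception):
    """Raised by the (hypothetical) directory callbacks when LDAP is unreachable."""


def start_session(
    username: str,
    password: str,
    bind_and_fetch_groups: Callable[[str, str], Optional[List[str]]],
    now: float,
    lifetime_s: float = 15 * 60,
) -> Optional[dict]:
    """New login: fail closed when LDAP is down or the credentials are rejected."""
    try:
        groups = bind_and_fetch_groups(username, password)
    except DirectoryUnavailable:
        return None  # identity cannot be verified, so no session is created
    if groups is None:
        return None  # bind rejected the credentials
    return {"sub": username, "roles": sorted(groups), "exp": now + lifetime_s}


def refresh_claims(
    claims: dict,
    fetch_groups: Callable[[str], List[str]],
    now: float,
    lifetime_s: float = 15 * 60,
) -> dict:
    """Active session: update roles from LDAP, or skip the refresh if LDAP is down."""
    try:
        groups = fetch_groups(claims["sub"])
    except DirectoryUnavailable:
        # Grace period: keep the current claims (and roles) unchanged.
        return claims
    return {**claims, "roles": sorted(groups), "exp": now + lifetime_s}
```

The only difference between the two paths is what happens in the `except` branch: a login returns nothing, while a refresh returns the claims it was given.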
### Signing Key
- **Shared symmetric key** (HMAC-SHA256) in configuration. Both nodes are trusted issuers. Asymmetric keys rejected as unnecessary complexity for a two-node trusted cluster.
## Affected Documents
| Document | Change |
|----------|--------|
| `Component-Security.md` | Replaced Windows Integrated Auth with direct LDAP bind. Added Session Management, Token Lifecycle, Load Balancer Compatibility, and LDAP Connection Failure sections. |
| `HighLevelReqs.md` | Updated authentication description (Section 9.1) to reflect username/password with JWT. |
## Alternatives Considered
- **Windows Integrated Authentication (Kerberos/NTLM)**: Rejected by user — app authenticates directly against LDAP/AD.
- **Server-side sessions with cookie**: Rejected — doesn't work with load balancer without sticky sessions or shared session store.
- **Asymmetric JWT signing (RSA/ECDSA)**: Rejected — both nodes are trusted issuers, no third-party validation needed.
- **Two-token pattern (access + refresh)**: Rejected — sliding single JWT with short expiry is simpler and achieves the same goal.
- **No idle timeout (rely on JWT expiry)**: Rejected — user wanted explicit idle timeout separate from token refresh cycle.
- **Fail closed for active sessions on LDAP outage**: Rejected — would disrupt engineers mid-deployment during brief LDAP outages.
- **Credentials cached for LDAP outage resilience**: Rejected — adds local credential store complexity; correct behavior is to deny new logins when identity can't be verified.
- **Per-request role lookup from database**: Rejected — unnecessary DB query on every request when roles refresh every 15 minutes via LDAP.