Security-sensitive batch, handled main-thread for careful judgment on
secret-leak and pepper-bypass paths.
Secret leak / pepper bypass:
- CD-016 (pepper bypass): InboundApiRepository's GetApiKeyByValueAsync no
longer hashes the candidate with the unpeppered ApiKeyHasher.Default —
ctor takes a lazy Func<IApiKeyHasher> accessor (lazy so test composition
roots without a pepper still bring up the repository), and the DI
registration wires sp.GetService<IApiKeyHasher>() so the production
peppered hasher matches the stored KeyHash. Regression test asserts
positive (peppered roundtrip) AND negative (Default hasher misses the
same key — proving the lookup uses the injected hasher).
- MgmtSvc-020 (SMTP credential leak): UpdateSmtpConfig/ListSmtpConfigs
now project through SmtpConfigPublicShape so the response payload and
audit-row afterState never carry the Credentials field — only a
HasCredentials bool. The SMTP password / OAuth2 client secret no
longer leaves the Admin-only UpdateSmtpConfig boundary the caller
already supplied it to.
Redaction:
- AuditLog-008 (test-fixture under-redact): new
SafeDefaultAuditPayloadFilter (stateless singleton) does HTTP header
redaction for the always-sensitive defaults (Authorization, X-Api-Key,
Cookie, Set-Cookie). FallbackAuditWriter, CentralAuditWriter, and
AuditLogIngestActor (both ingest paths) default to it instead of null
— composition roots that bypass AddAuditLog can no longer write
unredacted auth headers to the audit store.
- NotifService-025 (over-mask): CredentialRedactor.Scrub now only masks
the last colon-separated component (password / clientSecret) AND only
if it's >= 12 chars (typical password heuristic). Short user names
like "root" no longer become global redaction tokens that eat unrelated
diagnostic text. The full packed string is always masked regardless of
length. 3 new negative tests pin the no-over-mask contract.
Audit-row correctness / fail-loud:
- InboundAPI-025: Program.cs UseWhen predicate now excludes /api/audit,
/api/management, /api/centralui, /api/script-analysis AND requires POST
— the AuditWriteMiddleware no longer emits spurious ApiInbound rows
for audit-log query/export endpoints (write-on-read recursion broken).
- ESG-021: ApplyAuth now logs Warning (not silent) on empty
AuthConfiguration for apikey/basic, unknown AuthType, and malformed
Basic config. AuthConfiguration value NEVER logged. AuthType=none
remains silent (documented unauthenticated sentinel).
- Security-021: AddSecurity now logs a startup Warning when
RequireHttpsCookie=false — an HTTP-only deployment that previously
transmitted the cookie-embedded JWT silently in cleartext is now
audible in the log.
Defensive:
- CD-021: SwitchOutPartitionAsync's monthBoundary format string now
yyyy-MM-dd HH:mm:ss.fffffff (datetime2(7) precision) so a future
sub-second / non-midnight boundary doesn't silently round to the
wrong partition.
Plus reconciled stale per-module Open-findings counters that had drifted
from earlier sessions (AuditLog, CD, ESG, IAPI, MgmtSvc, NotifService,
Security).
Build clean; all affected test projects green (Host 208, ConfigDB 242,
ESG 69, IAPI 151, MgmtSvc 100, NotifService 55, Security 85, AuditLog
247/248 — 1 pre-existing date-sensitive integration test flake on
PartitionPurgeTests, unrelated). README regenerated: 46 open (was 54).
Comm-016: delete dead HandleConnectionStateChanged + _debugSubscriptions /
_inProgressDeployments tracking + ConnectionStateChanged message record.
Disconnect detection is owned by the transport layers (gRPC keepalive PING
~25s; Ask-timeout at CommunicationService). Updates the
Component-Communication.md design doc to make that explicit.
SnF-018: NotificationForwarder.DeliverAsync now discards a corrupt buffered
payload (Warning log + return true) instead of returning false and parking
the row — honoring the design's "notifications do not park" invariant.
DM-018: reconciliation no longer force-sets Enabled, preserving an
intentional Disabled state after central failover.
ESG-018: DeliverBufferedAsync (both ExternalSystemClient + DatabaseGateway)
catches JsonException and returns false, turning a corrupt buffered row
into a parked operation instead of a retry-forever poison message.
InboundAPI-022: register ActiveNodeGate as IActiveNodeGate in the Central
DI branch so standby-node gating is actually wired up in production.
NS-019: remove orphaned NotificationDeliveryService /
INotificationDeliveryService / NotificationResult; central notification
delivery now lives entirely in NotificationOutbox.
SEL-016: normalise From/To filters to UTC before ISO-string compare so
non-UTC DateTimeOffset clients no longer get spuriously excluded events.
TE-017: include Description on attributes/alarms and a HashableConnections
projection (protocol, endpoint JSON, failover count) in the revision hash
and DiffService; staleness detection now catches description-only and
connection-endpoint edits.
Transport-001 and Transport-002 (also High) remain Open — they're being
handled in a follow-up batch because both touch BundleImporter.cs and
must serialise.
NS-021/NO-001: thread FromAddress into XOAUTH2 so M365 stops rejecting
sends with 535 5.7.3. Added an additive oauth2UserName parameter on
ISmtpClientWrapper.AuthenticateAsync; both NotificationService and
NotificationOutbox now pass config.FromAddress.
NO-002: clamp non-positive SmtpConfiguration.MaxRetries/RetryDelay to the
1-min / 10-attempt fallback with a Warning so a misconfigured row no
longer parks transient failures on the first attempt or burn-loops.
NO-003: route a lifecycle-scoped CancellationToken from the
NotificationOutboxActor through the dispatch sweep into the adapter so
in-flight SMTP sends abort on PostStop instead of blocking
CoordinatedShutdown for the full SMTP timeout per row.
NO-004: await the central audit writer inside the existing try/catch
instead of fire-and-forget so the audit task can't outlive the per-sweep
DI scope and writer faults reach the operator log instead of being
silently dropped.
Two AuditLog integration tests seeded RetryDelay = TimeSpan.Zero to force
immediate re-claim on the second tick; updated them to 1 ms so they keep
the same intent without tripping the NO-002 clamp.
FU1 of the Notification Outbox follow-ups. EmailNotificationDeliveryAdapter
carried verbatim private copies of credential redaction, SMTP error
classification, and address validation because the NotificationService
helpers were internal. This eliminates the divergence risk by promoting the
helpers to public and deleting the adapter's copies.
- CredentialRedactor: internal -> public.
- Extract SmtpErrorClassifier + SmtpErrorClass enum into a new public static
class; NotificationDeliveryService now routes classification through it
(behavior unchanged). Adds focused SmtpErrorClassifierTests.
- NotificationDeliveryService.ValidateAddresses: internal -> public; the
adapter calls it directly.
- EmailNotificationDeliveryAdapter: deleted ScrubCredentials, ClassifySmtpError,
SmtpErrorClass, IsTransientSmtpError and ValidateAddresses copies.
No InternalsVisibleTo hack — specific helpers promoted to public. Both test
suites green; full solution builds clean.
Resolves StoreAndForward-001, ExternalSystemGateway-001, NotificationService-001
— one systemic gap where buffered messages were persisted but never delivered,
and the active node never replicated its buffer to the standby.
Delivery handlers (ExternalSystemGateway-001 / NotificationService-001):
- AkkaHostedService registers delivery handlers for the ExternalSystem,
CachedDbWrite and Notification categories after StoreAndForwardService starts;
each resolves its scoped consumer in a fresh DI scope.
- ExternalSystemClient, DatabaseGateway and NotificationDeliveryService each
gain a DeliverBufferedAsync method: re-resolve the target and re-attempt
delivery, returning true/false/throwing per the transient-vs-permanent contract.
- EnqueueAsync gains an attemptImmediateDelivery flag; CachedCallAsync and
NotificationDeliveryService.SendAsync pass false (they already attempted
delivery themselves) so registering a handler does not dispatch twice.
Replication (StoreAndForward-001):
- ReplicationService is injected into StoreAndForwardService; a new BufferAsync
helper replicates every enqueue, and successful-retry removes and parks are
replicated too. Fire-and-forget, no-op when replication is disabled.
Tests: StoreAndForwardReplicationTests (Add/Remove/Park observed),
attemptImmediateDelivery behaviour, and DeliverBufferedAsync paths for each
consumer. Full solution builds; StoreAndForward/ExternalSystemGateway/
NotificationService suites green.
Move all package versions into Directory.Packages.props so every project
resolves a single consistent version. Consolidates the Roslyn packages
(Microsoft.CodeAnalysis.CSharp.Scripting/Workspaces) onto 5.0.0, which
resolves the pre-existing NU1608 version-skew error in the test projects.
- Add JoeAppEngine folder to OPC UA nodes.json (BTCS, AlarmCntsBySeverity, Scheduler/ScanTime)
- Fix DataConnectionActor: capture Self in PreStart for use from non-actor threads,
preventing Self.Tell failure in Disconnected event handler
- Implement InstanceActor.HandleConnectionQualityChanged to mark attributes Bad on disconnect
- Fix LmxFakeProxy TagMapper to serialize arrays as JSON instead of "System.Int32[]"
- Allow DataType and DataSourceReference updates in TemplateService.UpdateAttributeAsync
- Update test_infra_opcua.md with JoeAppEngine documentation
17 source projects (Commons + Host + 15 components) and 17 xUnit test projects.
SLNX format, net10.0, nullable enabled, warnings as errors. All components
reference Commons; Host references all components. Builds and tests clean.