Commit Graph

22 Commits

Author SHA1 Message Date
Joseph Doherty 6ae0fea558 fix(error-handling): close Theme 4 — 18 cancellation / fire-and-forget findings
Async cancellation hygiene, fire-and-forget observability, retry/shutdown
semantics, and audit-row coverage across 9 modules. Highlights:

Cancellation & lifecycle:
- AuditLog-006: SqliteAuditWriter.Dispose hops to thread pool, escaping the
  captured SyncContext that risked sync-over-async deadlock.
- AuditLog-010: SiteAuditTelemetryActor owns a private lifecycle CTS,
  threaded through drain paths instead of CancellationToken.None.
- Comm-019: CentralCommunicationActor adds lifecycle CTS for repo calls.
- Host-019: Migration StartupRetry forwards ApplicationStopping so SIGTERM
  during the bounded-retry window aborts cleanly.

Cursor / retry / counter correctness:
- AuditLog-004: SiteAuditReconciliationActor's cursor now holds at `since`
  when any row's idempotent insert is still being retried (per-EventId
  retry counter, MaxPermanentInsertAttempts=5 escape valve with LogCritical
  abandon). No more silent abandonment of permanently-failing rows.
- ConfigDB-019: Dropped the catch-and-continue on EnsureLookaheadAsync's
  SPLIT loop — by class-doc construction the catch could only mask real
  failures and let the next iteration create permanent partition holes.
- HM-017/018: HealthReportSender + CentralHealthReportLoop snapshot
  per-interval counters before sending, restore via new
  ISiteHealthCollector.AddIntervalCounters on transport failure so counts
  aren't silently lost.

Fire-and-forget / shutdown waits:
- InboundAPI-018: AuditWriteMiddleware observes faulted audit-write tasks
  via OnlyOnFaulted continuation (Warning log; response unchanged).
- SnF-024: StoreAndForwardService.StopAsync awaits in-flight retry sweep
  with a bounded SweepShutdownWaitTimeout (10s).

Leak / refactor:
- Comm-021: SiteStreamGrpcServer.SubscribeInstance wraps Subscribe in its
  own try/catch so a throw doesn't leak the relay actor or _activeStreams
  entry.
- Comm-022: VERIFIED already-closed by Comm-016's dead-code purge.
- CLI-017: BundleCommands' three subcommands delegate to ExecuteCommandAsync
  (auth-failure exit-code contract unified).

Defensive / validation:
- CLI-021: CliConfig.Load wraps file-read/JSON parse so malformed config
  prints a warning and returns defaults instead of crashing the CLI.
- Host-022: ParseLevel emits stderr one-shot warning for unrecognised
  MinimumLevel instead of silently coercing to Information.
- ESG-019: ExternalSystemClient sets HttpClient.Timeout=Infinite so the
  per-call CTS is the sole timeout source (was clipped to 100s by .NET).
- Security-020: New SecurityOptionsValidator (IValidateOptions) rejects
  empty LdapServer/LdapSearchBase with ValidateOnStart.
- DM-019: Lifecycle command timeouts now emit DisableTimedOut/EnableTimedOut/
  DeleteTimedOut audit entries (mirrors DeployFailed pattern).

Plus reconciled stale per-module Open-findings counters that had drifted
from prior sessions.

20+ new regression tests across 11 test projects; build clean; affected
suites all green. README regenerated: 75 open (was 93).
2026-05-28 07:13:28 -04:00
Joseph Doherty e536178323 fix(security): close auth & site-scoping gaps across 8 findings
Resolves the auth-theme batch from the 2026-05-28 baseline review (8 findings
across Security/CentralUI/ManagementService/CLI). The most consequential gaps:
NotificationReport + SiteCallsReport now route through SiteScopeService so a
site-scoped Deployment user cannot see or act on other sites' rows (CUI-028);
QueryAuditLogCommand is no longer "any authenticated user" — gated Admin-only
to match /api/audit/query's strictness (MS-018); RoleMapper preserves the
broader grant when a user is in both an unscoped and scoped Deployment LDAP
group, instead of silently narrowing to the scoped set (Sec-016); and the
dead SiteScopeRequirement/Handler are deleted so SiteScopeService is
unambiguously the sole site-scoping mechanism (Sec-017). Pending findings:
172 → 164.
2026-05-28 03:35:29 -04:00
Joseph Doherty 592cbd028e feat(audit): ParentExecutionId filter in the CLI and ManagementService 2026-05-21 18:59:06 -04:00
Joseph Doherty 24cdfe373c feat(audit): ExecutionId filter in the CLI and ManagementService 2026-05-21 16:00:09 -04:00
Joseph Doherty f64a7aed02 refactor(audit): consolidate query-param parsers; widen CLI export to multi-value 2026-05-21 05:37:06 -04:00
Joseph Doherty 2a76be1f94 feat(audit): multi-value filters across ManagementService, CLI and Central UI 2026-05-21 05:27:17 -04:00
Joseph Doherty 399b4aac92 feat(cli): notification smtp update --tls-mode / --credentials options
Expose the two previously-unreachable SmtpConfiguration fields on the
CLI. Both flags are optional — omitting them sends null so the server
preserves the existing value. --tls-mode is constrained to the canonical
{None, StartTLS, SSL} set via AcceptOnlyFromAmong for fast-fail.
2026-05-21 02:11:51 -04:00
Joseph Doherty ff004e2e48 fix(cli): correct audit query channel/kind/status enum names + drop dead --instance flag (#23 M8) 2026-05-20 22:13:26 -04:00
Joseph Doherty ba8ddcc032 refactor(cli): rename audit-log to audit-config with deprecation alias (#23 M8) 2026-05-20 22:02:19 -04:00
Joseph Doherty d40ee85e14 feat(cli): table output formatter for audit events (#23 M8) 2026-05-20 22:00:57 -04:00
Joseph Doherty 4b3a692170 feat(cli): scadalink audit verify-chain subcommand v1 no-op (#23 M8) 2026-05-20 21:57:16 -04:00
Joseph Doherty 91682cd862 feat(cli): scadalink audit export subcommand (#23 M8) 2026-05-20 21:56:20 -04:00
Joseph Doherty 2fa46ed400 feat(cli): scadalink audit query subcommand (#23 M8) 2026-05-20 21:55:38 -04:00
Joseph Doherty 3263b39477 feat(cli): scaffold scadalink audit command group (#23 M8) 2026-05-20 21:52:37 -04:00
Joseph Doherty f82bcbed7c fix(cli): resolve CLI-014..016 — re-triage update-command contract, doc-surface drift, table-column union 2026-05-17 03:18:16 -04:00
Joseph Doherty b1f4251d75 fix(commons): resolve Commons-008 — replace ValueTuple in SetConnectionBindingsCommand with named ConnectionBinding record (CLI, ManagementService, TemplateEngine, CentralUI) 2026-05-16 23:54:31 -04:00
Joseph Doherty 404216b4ee fix(cli): resolve CLI-008..013 — format validation, exit-code semantics, debug-stream cancellation/disposal, test coverage 2026-05-16 22:04:21 -04:00
Joseph Doherty 738e67acc5 fix(cli): resolve CLI-002..007 — robust response rendering, URL/JSON arg validation, credential env-vars, doc refresh 2026-05-16 20:58:03 -04:00
Joseph Doherty 5a08b04535 fix(cli): resolve CLI-001 — honor SCADALINK_FORMAT and config-file format precedence 2026-05-16 19:33:09 -04:00
Joseph Doherty 9c60592632 build: adopt NuGet Central Package Management
Move all package versions into Directory.Packages.props so every project
resolves a single consistent version. Consolidates the Roslyn packages
(Microsoft.CodeAnalysis.CSharp.Scripting/Workspaces) onto 5.0.0, which
resolves the pre-existing NU1608 version-skew error in the test projects.
2026-05-16 15:56:30 -04:00
Joseph Doherty 1a540f4f0a feat: add HTTP Management API, migrate CLI from Akka ClusterClient to HTTP
Replace the CLI's Akka.NET ClusterClient transport with a simple HTTP client
targeting a new POST /management endpoint on the Central Host. The endpoint
handles Basic Auth, LDAP authentication, role resolution, and ManagementActor
dispatch in a single round-trip — eliminating the CLI's Akka, LDAP, and
Security dependencies.

Also fixes DCL ReSubscribeAll losing subscriptions on repeated reconnect by
deriving the tag list from _subscriptionsByInstance instead of _subscriptionIds.
2026-03-20 23:55:31 -04:00
Joseph Doherty 7740a3bcf9 feat: add JoeAppEngine OPC UA nodes, fix DCL auto-reconnect and quality push
- Add JoeAppEngine folder to OPC UA nodes.json (BTCS, AlarmCntsBySeverity, Scheduler/ScanTime)
- Fix DataConnectionActor: capture Self in PreStart for use from non-actor threads,
  preventing Self.Tell failure in Disconnected event handler
- Implement InstanceActor.HandleConnectionQualityChanged to mark attributes Bad on disconnect
- Fix LmxFakeProxy TagMapper to serialize arrays as JSON instead of "System.Int32[]"
- Allow DataType and DataSourceReference updates in TemplateService.UpdateAttributeAsync
- Update test_infra_opcua.md with JoeAppEngine documentation
2026-03-19 13:27:54 -04:00