38 Commits

Author SHA1 Message Date
Joseph Doherty
faef2d0de6 Phase 2 WP-1–13+23: Template Engine CRUD, composition, overrides, locking, collision detection, acyclicity
- WP-23: ITemplateEngineRepository full EF Core implementation
- WP-1: Template CRUD with deletion constraints (instances, children, compositions)
- WP-2–4: Attribute, alarm, script definitions with lock flags and override granularity
- WP-5: Shared script CRUD with syntax validation
- WP-6–7: Composition with recursive nesting and canonical naming
- WP-8–11: Override granularity, locking rules, inheritance/composition scope
- WP-12: Naming collision detection on canonical names (recursive)
- WP-13: Graph acyclicity (inheritance + composition cycles)
Core services: TemplateService, SharedScriptService, TemplateResolver,
LockEnforcer, CollisionDetector, CycleDetector. 358 tests pass.
2026-03-16 20:10:34 -04:00
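
Note (illustrative, not taken from the repository): the WP-13 acyclicity check above could be approached with a depth-first search over one combined graph, where an edge means "inherits from" or "is composed of". The sketch below is in Python for brevity, although the project itself targets .NET. The function name `find_cycle` and the adjacency-dict shape are assumptions made for this example, not the actual CycleDetector API.

```python
from typing import Dict, List, Optional


def find_cycle(edges: Dict[str, List[str]]) -> Optional[List[str]]:
    """Return one cycle as a list of node names, or None if the graph is acyclic.

    `edges` maps a template name to the templates it inherits from or
    composes (hypothetical shape, chosen for this sketch).
    """
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    colour: Dict[str, int] = {}
    path: List[str] = []

    def visit(node: str) -> Optional[List[str]]:
        colour[node] = GREY
        path.append(node)
        for nxt in edges.get(node, []):
            state = colour.get(nxt, WHITE)
            if state == GREY:
                # Back edge: nxt is already on the current path, so close the loop.
                return path[path.index(nxt):] + [nxt]
            if state == WHITE:
                found = visit(nxt)
                if found:
                    return found
        path.pop()
        colour[node] = BLACK
        return None

    for start in list(edges):
        if colour.get(start, WHITE) == WHITE:
            found = visit(start)
            if found:
                return found
    return None
```

Returning the offending path rather than a bare boolean makes it possible to tell the user which templates form the loop. The recursion is fine for small template graphs; a very deep graph would call for an explicit stack.
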
Joseph Doherty
84ad6bb77d Fix LDAP integration test: use GLAuth test credentials and runtime availability check
- Password "admin" → "password" (matches GLAuth config.toml)
- Replace hard Skip attribute with TCP connectivity check (test runs when GLAuth available)
- Add LdapSearchBase + AllowInsecureLdap to appsettings.Central.json for dev
2026-03-16 19:56:05 -04:00
Joseph Doherty
dab8b061b5 Phase 1 complete: execution checklist with all 22 WPs and 20 gate criteria passing
2026-03-16 19:51:49 -04:00
Joseph Doherty
d38356efdb Phase 1 WP-11–22: Host infrastructure, Blazor Server UI, and integration tests
Host infrastructure (WP-11–17):
- StartupValidator with 19 validation rules
- /health/ready endpoint with DB + Akka health checks
- Akka.NET bootstrap via AkkaHostedService (HOCON config, cluster, remoting, SBR)
- Serilog with SiteId/NodeHostname/NodeRole enrichment
- DeadLetterMonitorActor with count tracking
- CoordinatedShutdown wiring (no Environment.Exit)
- Windows Service support (UseWindowsService)

Central UI (WP-18–21):
- Blazor Server shell with Bootstrap 5, role-aware NavMenu
- Login/logout flow (LDAP auth → JWT → HTTP-only cookie)
- CookieAuthenticationStateProvider with idle timeout
- LDAP group mapping CRUD page (Admin role)
- Route guards with Authorize attributes per role
- SignalR reconnection overlay for failover

Integration tests (WP-22):
- Startup validation, auth flow, audit transactions, readiness gating
186 tests pass (1 skipped: LDAP integration), zero warnings.
2026-03-16 19:50:59 -04:00
Joseph Doherty
cafb7d2006 Phase 1 WP-2–10: Repositories, audit service, security & auth (LDAP, JWT, roles, policies, data protection)
- WP-2: SecurityRepository + CentralUiRepository with audit log queries
- WP-3: AuditService with transactional guarantee (same SaveChangesAsync)
- WP-4: Optimistic concurrency tests (deployment records vs template last-write-wins)
- WP-5: Seed data (SCADA-Admins → Admin role mapping)
- WP-6: LdapAuthService (direct bind, TLS enforcement, group query)
- WP-7: JwtTokenService (HMAC-SHA256, 15-min refresh, 30-min idle timeout)
- WP-8: RoleMapper (LDAP groups → roles with site-scoped deployment)
- WP-9: Authorization policies (Admin/Design/Deployment + site scope handler)
- WP-10: Shared Data Protection keys via EF Core
141 tests pass, zero warnings.
2026-03-16 19:32:43 -04:00
Joseph Doherty
1996b21961 Phase 1 WP-1: EF Core DbContext with Fluent API mappings for all 26 entities
ScadaLinkDbContext with 10 configuration classes (Fluent API only), initial
migration creating 25 tables, environment-aware migration helper (auto-apply
dev, validate-only prod), DesignTimeDbContextFactory, optimistic concurrency
on DeploymentRecord. 20 tests verify schema, CRUD, relationships, cascades.
2026-03-16 19:15:50 -04:00
Joseph Doherty
9bc5a5163f Phase 0 complete: update execution checklist with all gates passing
12/12 WPs complete, 57/57 tests passing, 100/100 requirements verified,
13/13 design constraints verified. All verification gate criteria pass.
2026-03-16 19:00:15 -04:00
Joseph Doherty
8c2091dc0a Phase 0 WP-0.10–0.12: Host skeleton, options classes, sample configs, and execution framework
- WP-0.10: Role-based Host startup (Central=WebApplication, Site=generic Host),
  15 component AddXxx() extension methods, MapCentralUI/MapInboundAPI stubs
- WP-0.11: 12 per-component options classes with config binding
- WP-0.12: Sample appsettings for central and site topologies
- Add execution procedure and checklist template to generate_plans.md
- Add phase-0-checklist.md for execution tracking
- Resolve all 21 open questions from plan generation
- Update IDataConnection with batch ops and IAsyncDisposable
57 tests pass, zero warnings.
2026-03-16 18:59:07 -04:00
Joseph Doherty
22e1eba58a Phase 0 WP-0.2–0.9: Implement Commons (types, entities, interfaces, messages, protocol, tests)
- WP-0.2: Namespace/folder skeleton (26 directories)
- WP-0.3: Shared data types (6 enums, RetryPolicy, Result<T>)
- WP-0.4: 24 domain entity POCOs across 10 domain areas
- WP-0.5: 7 repository interfaces with full CRUD signatures
- WP-0.6: IAuditService cross-cutting interface
- WP-0.7: 26 message contract records across 8 concern areas
- WP-0.8: IDataConnection protocol abstraction with batch ops
- WP-0.9: 8 architectural constraint enforcement tests
All 40 tests pass, zero warnings.
2026-03-16 18:48:24 -04:00
Joseph Doherty
fed5f5a82c Add .gitignore and remove tracked build artifacts (bin/obj)
2026-03-16 18:38:00 -04:00
Joseph Doherty
34190e1347 Phase 0 WP-0.1: Create .NET 10 solution structure with all 17 component projects
17 source projects (Commons + Host + 15 components) and 17 xUnit test projects.
SLNX format, net10.0, nullable enabled, warnings as errors. All components
reference Commons; Host references all components. Builds and tests clean.
2026-03-16 18:37:36 -04:00
Joseph Doherty
021817930b Generate all 11 phase implementation plans with bullet-level requirement traceability
All phases (0-8) now have detailed implementation plans with:
- Bullet-level requirement extraction from HighLevelReqs sections
- Design constraint traceability (KDD + Component Design)
- Work packages with acceptance criteria mapped to every requirement
- Split-section ownership verified across phases
- Orphan checks (forward, reverse, negative) all passing
- Codex MCP (gpt-5.4) external verification completed per phase

Total: 7,549 lines across 11 plan documents, ~160 work packages,
~400 requirements traced, ~25 open questions logged for follow-up.
2026-03-16 15:34:54 -04:00
Joseph Doherty
a9fa74d5ac Document LmxProxy protocol in DCL, strengthen plan generation traceability guards, and add UI constraints
- Replace "custom protocol" placeholder with full LmxProxy details (gRPC transport, SDK API mapping, session management, keep-alive, TLS, batch ops)
- Add bullet-level requirement traceability, design constraint traceability (52 KDD + 6 CD), split-section tracking, and post-generation orphan check to plan framework
- Resolve Q9 (LmxProxy), Q11 (REST test server), Q13 (solo dev), Q14 (self-test), Q15 (Machine Data DB out of scope)
- Set Central UI constraints: Blazor Server + Bootstrap only, no heavy frameworks, custom components, clean corporate design
2026-03-16 15:08:57 -04:00
Joseph Doherty
e3a418d603 Add Machine Data seed (tables, stored procedures, sample data) and fix SA password for shell compatibility
2026-03-16 14:41:28 -04:00
Joseph Doherty
0513a104a9 Add Flask REST API test server for External System Gateway and Inbound API testing
2026-03-16 14:28:03 -04:00
Joseph Doherty
40610271d6 Add Mailpit SMTP test server for Notification Service email testing
Adds a fourth Docker service (Mailpit) to capture outgoing emails without
delivery, with CLI tool for sending test emails, listing/reading captured
messages, and clearing the inbox. Supports BCC pattern matching ScadaLink's
notification delivery model.
2026-03-16 14:10:44 -04:00
Joseph Doherty
652378b470 Add test infrastructure with Docker services, CLI tools, and resolve Phase 0 questions
Stand up local dev infrastructure (OPC UA, LDAP, MS SQL) with Docker Compose,
Python CLI tools for service interaction, and teardown script. Fix GLAuth config
mount, OPC PLC node format, and document actual DN/namespace behavior discovered
during testing. Resolve Q1-Q8,Q10: .NET 10, Akka.NET 1.5.x, monorepo with slnx,
appsettings JWT, Windows Server 2022 site target.
2026-03-16 14:03:12 -04:00
Joseph Doherty
7a0bd0f701 Create implementation plan generation framework
generate_plans.md: Master plan defining 11 phases (0, 1, 2, 3A, 3B, 3C, 4-8)
with component assignments, sub-tasks, testable outcomes, and HighLevelReqs
coverage. Phase 3 split into 3A (runtime foundation + failover), 3B (site I/O
+ observability), 3C (deployment pipeline + S&F) per Codex review. Failover
testing embedded in runtime phases, not deferred to hardening.

requirements-traceability.md: Full matrix mapping all 54 HighLevelReqs sections
and 22 REQ-* identifiers to implementation phases. Zero unmapped requirements.

questions.md: 15 open questions requiring follow-up before/during implementation
(tooling, environments, team, integration targets).
2026-03-16 09:59:23 -04:00
Joseph Doherty
760eb38eac Update CLAUDE.md and README.md with all design decisions from refinement
CLAUDE.md: reorganize Key Design Decisions into categorized sections covering
architecture, data, integrations, templates, S&F, security, cluster, UI,
monitoring, code organization, and Akka.NET conventions. Add docs/plans and
AkkaDotNet to project structure.

README.md: add technology stack table and scale summary. Update all 17
component descriptions to reflect refined designs. Update architecture
diagram with load balancer, 2-node annotations, protocol connections, and
component details. Add links to AkkaDotNet reference docs and design plans.
2026-03-16 09:38:15 -04:00
Joseph Doherty
f5b3b2b59e Define per-component configuration binding convention in Host
Expand REQ-HOST-3 with per-component appsettings.json sections, each mapped
to a dedicated options class owned by the component. Convention: components
define their own options class, Host binds via Options pattern, components
read via IOptions<T>. Options classes live in component projects, not Commons.
2026-03-16 09:33:05 -04:00
Joseph Doherty
4ec5d50425 Add namespace and folder convention for Commons shared interfaces and types
Define REQ-COM-5b with full folder hierarchy, namespace rules, and naming
conventions for types, interfaces, entities, and message contracts. Organizes
by category (Types, Interfaces, Entities, Messages) and domain area within
each category.
2026-03-16 09:31:09 -04:00
Joseph Doherty
6d33e93610 Establish UTC as the system-wide timestamp convention
All timestamps must use UTC for storage, transmission, and processing.
Local time conversion is a Central UI display concern only. Documented
in Commons (REQ-COM-1) and HighLevelReqs (Section 13.1).
2026-03-16 09:30:08 -04:00
Joseph Doherty
3a833f5dea Persist static attribute writes to local SQLite at site clusters
Static attribute SetAttribute calls now persist the override to local SQLite,
surviving restart and failover. On Instance Actor startup, persisted overrides
are loaded on top of the deployed configuration. Redeployment resets all
persisted overrides to the new deployed values.
2026-03-16 09:16:29 -04:00
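
Note (illustrative, not taken from the repository): the override behaviour described above amounts to three operations: persist on write, layer persisted values over the deployed configuration at startup, and clear on redeployment. A minimal Python sketch using the standard `sqlite3` module follows. The table name, column names and the `OverrideStore` class are hypothetical, and values are stored as text for simplicity.

```python
import sqlite3
from typing import Dict


class OverrideStore:
    """Persists static attribute overrides per instance (hypothetical schema)."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self.conn = conn
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS attribute_override ("
            " instance TEXT NOT NULL,"
            " attribute TEXT NOT NULL,"
            " value TEXT NOT NULL,"
            " PRIMARY KEY (instance, attribute))"
        )

    def set_attribute(self, instance: str, attribute: str, value: str) -> None:
        # Persist the write so it survives restart and failover.
        self.conn.execute(
            "INSERT OR REPLACE INTO attribute_override VALUES (?, ?, ?)",
            (instance, attribute, value),
        )
        self.conn.commit()

    def load_effective(self, instance: str, deployed: Dict[str, str]) -> Dict[str, str]:
        # Start from the deployed configuration, then apply persisted overrides on top.
        effective = dict(deployed)
        rows = self.conn.execute(
            "SELECT attribute, value FROM attribute_override WHERE instance = ?",
            (instance,),
        )
        for attribute, value in rows:
            effective[attribute] = value
        return effective

    def reset_on_redeploy(self, instance: str) -> None:
        # Redeployment discards every persisted override for the instance.
        self.conn.execute(
            "DELETE FROM attribute_override WHERE instance = ?", (instance,)
        )
        self.conn.commit()
```

For example, with `store = OverrideStore(sqlite3.connect(":memory:"))`, a call to `store.set_attribute("Pump1", "Setpoint", "42")` makes `load_effective` return the overridden setpoint until `reset_on_redeploy("Pump1")` is called.
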
Joseph Doherty
409cc62309 Verify component designs against Akka.NET best practices documentation
Cluster Infrastructure: add min-nr-of-members=1 requirement for single-node
operation after failover. Add graceful shutdown / CoordinatedShutdown section
for fast singleton handover during planned maintenance.

Site Runtime: add explicit supervision strategies per actor type (Resume for
coordinators, Stop for short-lived execution actors). Stagger Instance Actor
startup to prevent reconnection storms. Add Tell-vs-Ask usage guidance per
Akka.NET best practices (Tell for hot path, Ask for system boundaries only).

Data Connection Layer: add Connection Actor Model section documenting the
Become/Stash pattern for connection lifecycle state machine.

Health Monitoring: add dead letter count as a monitored metric.

Host: add REQ-HOST-8a for dead letter monitoring (subscribe to EventStream,
log at Warning level, report as health metric).
2026-03-16 09:12:36 -04:00
Joseph Doherty
de636b908b Add Akka.NET reference documentation
Notes and documentation covering actors, remoting, clustering, persistence,
streams, serialization, hosting, testing, and best practices for the Akka.NET
framework used throughout the ScadaLink system.
2026-03-16 09:08:17 -04:00
Joseph Doherty
34694adba2 Apply Codex review findings across all 17 components
Template Engine: add composed member addressing (path-qualified canonical names),
override granularity per entity type, semantic validation (call targets, arg types),
graph acyclicity enforcement, revision hashes for flattened configs.

Deployment Manager: add deployment ID + idempotency, per-instance operation lock
covering all mutating commands, state transition matrix, site-side apply atomicity
(all-or-nothing), artifact version compatibility policy.

Site Runtime: add script trust model (forbidden APIs, execution timeout, constrained
compilation), concurrency/serialization rules (Instance Actor serializes mutations),
site-wide stream backpressure (per-subscriber buffering, fire-and-forget publish).

Communication: add application-level correlation IDs for protocol safety beyond
Akka.NET transport guarantees.

External System Gateway: add 408/429 as transient errors, CachedCall idempotency
note, dedicated dispatcher for blocking I/O isolation.

Health Monitoring: add monotonic sequence numbers to prevent stale report overwrites.

Security: require LDAPS/StartTLS for LDAP connections.

Central UI: add failover behavior (SignalR reconnect, JWT survives, shared Data
Protection keys, load balancer readiness).

Cluster Infrastructure: add down-if-alone=on for safe singleton ownership.

Site Event Logging: clarify active-node-only logging (no replication), add 1GB
storage cap with oldest-first purge.

Host: add readiness gating (health check endpoint, no traffic until operational).

Commons: add message contract versioning policy (additive-only evolution).

Configuration Database: add optimistic concurrency on deployment status records.
2026-03-16 09:06:12 -04:00
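
Note (illustrative, not taken from the repository): the Health Monitoring item above uses monotonic sequence numbers so that a delayed report cannot overwrite a newer one. One way to picture this, in Python, is a store that only accepts a report whose sequence number is strictly greater than the last accepted one for that site. The class and method names are assumptions for this sketch, and how sequence numbers behave across a site restart is left open here.

```python
from typing import Any, Dict, Optional, Tuple


class LatestReportStore:
    """Keeps the newest health report per site, rejecting stale arrivals."""

    def __init__(self) -> None:
        self._latest: Dict[str, Tuple[int, Any]] = {}

    def accept(self, site_id: str, sequence: int, report: Any) -> bool:
        """Store the report and return True, or return False if it is stale."""
        current = self._latest.get(site_id)
        if current is not None and sequence <= current[0]:
            return False  # older or duplicate report: keep what we already have
        self._latest[site_id] = (sequence, report)
        return True

    def latest(self, site_id: str) -> Optional[Any]:
        entry = self._latest.get(site_id)
        return None if entry is None else entry[1]
```
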
Joseph Doherty
70e5ae33d5 Refine remaining components: Deployment Manager, Central UI, Site Event Logging, S&F
Deployment Manager: add deployment concurrency rules (block same-instance, allow
parallel different-instance), per-site artifact deployment status, current-only
status persistence.

Central UI: specify Blazor Server framework, real-time push updates via SignalR
for debug view, health dashboard, and deployment status.

Site Event Logging: daily retention purge, paginated queries with 500-event default,
keyword search on message/source fields.

Store-and-Forward: clarify async best-effort replication to standby with acceptable
trade-offs on failover.
2026-03-16 08:48:33 -04:00
Joseph Doherty
a540912782 Refine Notification Service: SMTP config, OAuth2, delivery behavior, error handling
Expand SMTP configuration with OAuth2 Client Credentials support for Microsoft 365,
connection timeout, and max concurrent connections. Single email per send with all
recipients in BCC. Plain text only. Classify SMTP errors: transient (4xx/connection)
to S&F, permanent (5xx) returned to script. No app-level rate limiting.
2026-03-16 08:38:38 -04:00
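
Note (illustrative, not taken from the repository): the delivery rules above (one plain-text email per send, all recipients in BCC, 4xx or connection failures treated as transient, 5xx as permanent) can be sketched with Python's standard `email` package. The function names and the string labels returned by `classify_smtp_reply` are assumptions for this example.

```python
from email.message import EmailMessage
from typing import List, Optional


def build_notification(sender: str, recipients: List[str],
                       subject: str, body: str) -> EmailMessage:
    """Build a single plain-text message with every recipient in Bcc."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["Subject"] = subject
    msg["Bcc"] = ", ".join(recipients)
    msg.set_content(body)  # a str body yields a text/plain part
    return msg


def classify_smtp_reply(code: Optional[int]) -> str:
    """Map an SMTP reply code to a handling decision.

    None stands for "no reply at all" (connection failure), which is an
    assumption about how the caller reports that case.
    """
    if code is None or 400 <= code < 500:
        return "transient"   # hand over to store-and-forward
    if 500 <= code < 600:
        return "permanent"   # return the error to the script
    return "delivered"
```

If the message is later sent with `smtplib.SMTP.send_message`, the Bcc addresses are used for the envelope but the Bcc header itself is not transmitted, which matches the intent of hiding recipients from each other.
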
Joseph Doherty
cd03b77913 Refine Inbound API: HTTP contract, extended types, logging, rate limiting
Define POST /api/{methodName} URL structure with X-API-Key header. Flat JSON
request/response with no envelope wrapper. Add extended type system (Object, List)
for complex API parameters and return values, applied to both Inbound API and
External System Gateway method definitions. Only failures logged; no rate limiting
in this controlled industrial environment.
2026-03-16 08:26:04 -04:00
Joseph Doherty
cbc78465e0 Refine Security & Auth: LDAP bind, JWT sessions, idle timeout, failure handling
Replace Windows Integrated Auth with direct LDAP bind (username/password login form).
Add JWT-based sessions with HMAC-SHA256 shared key for load balancer compatibility.
15-minute token refresh re-queries LDAP for current group memberships. 30-minute
configurable idle timeout. LDAP failure: new logins fail, active sessions continue
with current roles until LDAP recovers.
2026-03-16 08:16:29 -04:00
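
Note (illustrative, not taken from the repository): the session model above combines an HMAC-SHA256 signed token with a 15-minute refresh and a 30-minute idle timeout. The Python sketch below shows the general shape using only the standard library: HS256 signing and verification in the JWT compact format, plus two timing checks. All function names and claim names are assumptions for this example, it omits expiry claims, and a production system would normally rely on a maintained JWT library instead.

```python
import base64
import hashlib
import hmac
import json
from typing import Optional

REFRESH_AFTER_S = 15 * 60   # re-query LDAP groups after 15 minutes
IDLE_TIMEOUT_S = 30 * 60    # configurable idle timeout, 30 minutes by default


def _b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def sign(claims: dict, key: bytes) -> str:
    """Produce a compact HS256 token: header.payload.signature."""
    header = _b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode("utf-8"))
    payload = _b64url(json.dumps(claims).encode("utf-8"))
    signing_input = f"{header}.{payload}"
    mac = hmac.new(key, signing_input.encode("ascii"), hashlib.sha256).digest()
    return f"{signing_input}.{_b64url(mac)}"


def verify(token: str, key: bytes) -> Optional[dict]:
    """Return the claims if the signature matches, otherwise None."""
    try:
        header, payload, signature = token.split(".")
        expected = hmac.new(key, f"{header}.{payload}".encode("ascii"),
                            hashlib.sha256).digest()
        if not hmac.compare_digest(expected, _b64url_decode(signature)):
            return None
        return json.loads(_b64url_decode(payload))
    except ValueError:
        return None  # malformed token, bad base64, or invalid JSON


def needs_refresh(issued_at: float, now: float) -> bool:
    return now - issued_at >= REFRESH_AFTER_S


def is_idle_expired(last_activity: float, now: float) -> bool:
    return now - last_activity >= IDLE_TIMEOUT_S
```

Because every node holding the same key can verify the token without server-side session state, this style of token works behind a load balancer, which is the compatibility point the commit message makes.
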
Joseph Doherty
57eae0c1db Refine Health Monitoring: timing defaults, offline detection, error rate calculation
Set 30-second report interval with 60-second absolute timeout for offline detection.
Define error rates as raw counts per interval (reset after each report). Script errors
include all failure types. Automatic online recovery on first received report. Flat
snapshot report structure.
2026-03-16 08:10:16 -04:00
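
Note (illustrative, not taken from the repository): the timing rules above can be sketched as a small tracker that marks a site offline once no report has arrived within the 60-second timeout, brings it back online on the first report received, and treats error rates as raw counts that reset after each report. The Python class below is a sketch under those assumptions. Its names are hypothetical, and time is passed in explicitly so the logic stays testable.

```python
from typing import Optional

REPORT_INTERVAL_S = 30.0   # how often a site is expected to report
OFFLINE_TIMEOUT_S = 60.0   # absolute timeout for offline detection


class SiteHealthTracker:
    """Central-side view of one site's liveness (hypothetical)."""

    def __init__(self, offline_timeout_s: float = OFFLINE_TIMEOUT_S) -> None:
        self.offline_timeout_s = offline_timeout_s
        self.last_report_at: Optional[float] = None

    def record_report(self, now: float) -> None:
        # Any received report counts, so recovery is automatic.
        self.last_report_at = now

    def is_online(self, now: float) -> bool:
        if self.last_report_at is None:
            return False
        return now - self.last_report_at <= self.offline_timeout_s


class IntervalErrorCounter:
    """Site-side raw error count for the current report interval."""

    def __init__(self) -> None:
        self._count = 0

    def record_error(self) -> None:
        self._count += 1

    def take_snapshot(self) -> int:
        # Return the count for this interval and reset it for the next one.
        count, self._count = self._count, 0
        return count
```
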
Joseph Doherty
3dd62adf42 Refine Cluster Infrastructure: split-brain, seed nodes, failure detection, dual recovery
Add keep-oldest split-brain resolver with 15s stable-after duration. Configure both
nodes as seed nodes for symmetric startup. Set moderate failure detection defaults
(2s heartbeat, 10s threshold, ~25s total failover). Document automatic dual-node
recovery from persistent storage with no manual intervention.
2026-03-16 08:07:28 -04:00
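
Note (illustrative, not taken from the repository): one plausible reading of the "~25s total failover" figure above is that it is the failure detection threshold plus the split-brain resolver's stable-after duration. The commit message does not spell out the breakdown, so the Python arithmetic below is an assumption about how the number is composed.

```python
HEARTBEAT_INTERVAL_S = 2    # heartbeat interval from the commit message
FAILURE_THRESHOLD_S = 10    # time before a silent node is considered unreachable
STABLE_AFTER_S = 15         # keep-oldest resolver waits this long before acting


def estimated_failover_s(failure_threshold_s: int = FAILURE_THRESHOLD_S,
                         stable_after_s: int = STABLE_AFTER_S) -> int:
    """Rough lower bound: detect the failure, then wait for the resolver."""
    return failure_threshold_s + stable_after_s


def missed_heartbeats(failure_threshold_s: int = FAILURE_THRESHOLD_S,
                      heartbeat_interval_s: int = HEARTBEAT_INTERVAL_S) -> int:
    """How many heartbeats fit inside the detection threshold."""
    return failure_threshold_s // heartbeat_interval_s
```

Under these assumptions the estimate is 10 + 15 = 25 seconds, with roughly five heartbeats missed before the node is treated as unreachable. Singleton handover and actor startup would add to this in practice.
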
Joseph Doherty
bd735de8c4 Refine Communication Layer: timeouts, transport config, ordering, failure behavior
Add per-pattern message timeouts with sensible defaults (120s for deployments, 30s
for queries/commands). Configure Akka.NET transport heartbeat explicitly rather than
relying on framework defaults. Document per-site message ordering guarantee. Specify
that in-flight messages on disconnect result in timeout error (no central buffering)
and debug streams die on any disconnect.
2026-03-16 08:04:06 -04:00
Joseph Doherty
1ef316f32c Add dual call modes for external systems: synchronous Call() and cached CachedCall()
Scripts now choose per invocation whether an external system call is synchronous
(all failures return to script) or cached (transient failures go to store-and-forward).
Mirrors the existing Database.Connection/CachedWrite pattern. Updated ESG, Site
Runtime script API, high-level requirements, and design doc.
2026-03-16 08:00:20 -04:00
Joseph Doherty
5fff1712a8 Refine External System Gateway: protocol, auth, timeouts, error classification
Specify HTTP/REST with JSON as the invocation protocol. Add API key and Basic Auth
as outbound authentication modes. Add per-system call timeouts. Classify errors by
HTTP status for store-and-forward decisions (5xx/transient → retry, 4xx → permanent
error to script). Document ADO.NET connection pooling for database connections.
Update Store-and-Forward to clarify transient-only buffering.
2026-03-16 07:57:00 -04:00
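
Note (illustrative, not taken from the repository): the error classification above decides whether a failed external call is eligible for store-and-forward. A minimal Python sketch follows, treating 5xx and "no response" as transient and 4xx as permanent. The `extra_transient` parameter is included because a later commit in this log adds 408 and 429 as transient errors. The enum, the function name, and the handling of 3xx as permanent are assumptions for this example.

```python
from enum import Enum
from typing import FrozenSet, Optional


class Outcome(Enum):
    SUCCESS = "success"
    TRANSIENT = "transient"   # eligible for store-and-forward retry
    PERMANENT = "permanent"   # returned to the calling script as an error


def classify_http_result(status: Optional[int],
                         extra_transient: FrozenSet[int] = frozenset()) -> Outcome:
    """Classify an outbound HTTP call by status code.

    `status` is None when no response was received at all, for example on a
    connection failure or timeout (an assumption about the caller).
    """
    if status is None:
        return Outcome.TRANSIENT
    if 200 <= status < 300:
        return Outcome.SUCCESS
    if status >= 500 or status in extra_transient:
        return Outcome.TRANSIENT
    return Outcome.PERMANENT
```

For example, `classify_http_result(429)` is permanent under the rules of this commit, while `classify_http_result(429, frozenset({408, 429}))` is transient once the later refinement is applied.
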
Joseph Doherty
19c7e6880f Refine Data Connection Layer: error handling, reconnection, write failures, health reporting
Add connection lifecycle (fixed-interval auto-reconnect, immediate bad quality on
disconnect, transparent re-subscribe), synchronous write failure errors to scripts,
periodic tag path resolution retry, and enhanced health reporting with tag resolution
counts. Update cross-references in Health Monitoring and Site Runtime.
2026-03-16 07:51:37 -04:00
Joseph Doherty
f0108e161b Housekeeping: remove stale AuditLogging component, add Codex MCP tool usage note
Audit logging was absorbed into the Configuration Database component (IAuditService),
making the separate Component-AuditLogging.md redundant. Also added tool usage
guidance to CLAUDE.md for Codex MCP model selection.
2026-03-16 07:45:18 -04:00
Joseph Doherty
1944f94fed Initial design docs from claude.ai refinement sessions
2026-03-16 07:39:26 -04:00