Commit Graph

34 Commits

Author SHA1 Message Date
Joseph Doherty
d3194e3634 feat: separate create/edit form pages, Playwright test infrastructure, /auth/token endpoint
Move all CRUD create/edit forms from inline on list pages to dedicated form pages
with back-button navigation and post-save redirect. Add Playwright Docker container
(browser server on port 3000) with 25 passing E2E tests covering login, navigation,
and site CRUD workflows. Add POST /auth/token endpoint for clean JWT retrieval.
2026-03-21 15:17:24 -04:00
Joseph Doherty
b3f8850711 docs: document script hot-reload mechanisms for all script types 2026-03-21 13:42:06 -04:00
Joseph Doherty
416a03b782 feat: complete gRPC streaming channel — site host, docker config, docs, integration tests
Switch site host to WebApplicationBuilder with Kestrel HTTP/2 gRPC server,
add GrpcPort/keepalive config, wire SiteStreamManager as ISiteStreamSubscriber,
expose gRPC ports in docker-compose, add site seed script, update all 10
requirement docs + CLAUDE.md + README.md for the new dual-transport architecture.
2026-03-21 12:38:33 -04:00
Joseph Doherty
b76ce09221 docs: add gRPC streaming channel implementation plan with task tracking 2026-03-21 11:32:24 -04:00
Joseph Doherty
3efec91386 fix: route debug stream events through ClusterClient site→central path
ClusterClient Sender refs are temporary proxies — valid for immediate reply
but not durable for future Tells. Events now flow as DebugStreamEvent through
SiteCommunicationActor → ClusterClient → CentralCommunicationActor → bridge
actor (same pattern as health reports). Also fix DebugStreamHub to use
IHubContext for long-lived callbacks instead of transient hub instance.
2026-03-21 11:32:17 -04:00
Joseph Doherty
41aff339b2 docs: add gRPC streaming channel design plan for site→central real-time data
Replaces ClusterClient-based event streaming with dedicated gRPC server-streaming
channels. Covers proto definition, server/client patterns, Channel<T> bridging,
keepalive/orphan prevention, failover scenarios, port/address configuration,
extensibility guide for new event types, testing strategy, and implementation guardrails.
2026-03-21 11:26:09 -04:00
Joseph Doherty
fd2e96fea2 feat: replace debug view polling with real-time SignalR streaming
The debug view polled every 2s by re-subscribing for full snapshots. Now a
persistent DebugStreamBridgeActor on central subscribes once and receives
incremental Akka stream events from the site, forwarding them to the Blazor
component via callbacks and to the CLI via a new SignalR hub at
/hubs/debug-stream. Adds `debug stream` CLI command with auto-reconnect.
2026-03-21 01:34:53 -04:00
Joseph Doherty
d91aa83665 refactor(docs): move requirements and test infra docs into docs/ subdirectories
Organize documentation by moving requirements (HighLevelReqs, Component-*,
lmxproxy_protocol) to docs/requirements/ and test infrastructure docs to
docs/test_infra/. Updates all cross-references in README, CLAUDE.md,
infra/README, component docs, and 23 plan files.
2026-03-21 01:11:35 -04:00
Joseph Doherty
3e93a0d8c3 docs: add LmxFakeProxy implementation plan with 10 tasks
Detailed task-by-task plan covering scaffolding, TagMapper, SessionManager,
OpcUaBridge, ScadaServiceImpl, Program.cs, Docker, docs, and integration test.
2026-03-19 11:13:51 -04:00
Joseph Doherty
e19a568b9b docs: add LmxFakeProxy design — OPC UA-backed test proxy for LmxProxy protocol
Defines a gRPC server implementing the scada.ScadaService proto that bridges
to the existing OPC UA test server. Enables end-to-end testing of
RealLmxProxyClient without a Windows LmxProxy deployment.
2026-03-19 11:08:47 -04:00
Joseph Doherty
c36de676f3 Add implementation plan: Management Service + CLI 2026-03-17 14:35:52 -04:00
Joseph Doherty
54c03a3139 Add implementation plan: deploy artifacts, remove config DB dependency 2026-03-17 13:35:54 -04:00
Joseph Doherty
75ccd4b1c0 Add design doc: deploy artifacts to sites, remove config DB dependency 2026-03-17 13:30:23 -04:00
Joseph Doherty
2b2cc0a151 All phases complete: execution checklists for Phases 3C through 8
All 11 phases (0, 1, 2, 3A, 3B, 3C, 4, 5, 6, 7, 8) implemented.
781 tests passing across 20 test projects. Zero build warnings.
2026-03-16 22:19:29 -04:00
Joseph Doherty
b659978764 Phase 8: Production readiness — failover tests, security hardening, sandboxing, deployment docs
- WP-1-3: Central/site failover + dual-node recovery tests (17 tests)
- WP-4: Performance testing framework for target scale (7 tests)
- WP-5: Security hardening (LDAPS, JWT key length, no secrets in logs) (11 tests)
- WP-6: Script sandboxing adversarial tests (28 tests, all forbidden APIs)
- WP-7: Recovery drill test scaffolds (5 tests)
- WP-8: Observability validation (structured logs, correlation IDs, metrics) (6 tests)
- WP-9: Message contract compatibility (forward/backward compat) (18 tests)
- WP-10: Deployment packaging (installation guide, production checklist, topology)
- WP-11: Operational runbooks (failover, troubleshooting, maintenance)
92 new tests, all passing. Zero warnings.
2026-03-16 22:12:31 -04:00
Joseph Doherty
b75bf52fb4 Phase 3B complete: 35 WPs, 11/11 gate criteria, 541 tests passing 2026-03-16 20:57:46 -04:00
Joseph Doherty
a3bf0c43f3 Phase 3A complete: 8 WPs, 13/13 gate criteria, 389 tests passing 2026-03-16 20:35:24 -04:00
Joseph Doherty
4896ac8ae9 Phase 2 complete: 29 WPs implemented, Template Engine fully functional
Template modeling, flattening, validation, diff, and deployment contract all
operational. 173 TemplateEngine tests + 186 prior = 359 total, all passing.
9/9 verification gate criteria pass.
2026-03-16 20:13:04 -04:00
Joseph Doherty
dab8b061b5 Phase 1 complete: execution checklist with all 22 WPs and 20 gate criteria passing 2026-03-16 19:51:49 -04:00
Joseph Doherty
9bc5a5163f Phase 0 complete: update execution checklist with all gates passing
12/12 WPs complete, 57/57 tests passing, 100/100 requirements verified,
13/13 design constraints verified. All verification gate criteria pass.
2026-03-16 19:00:15 -04:00
Joseph Doherty
8c2091dc0a Phase 0 WP-0.10–0.12: Host skeleton, options classes, sample configs, and execution framework
- WP-0.10: Role-based Host startup (Central=WebApplication, Site=generic Host),
  15 component AddXxx() extension methods, MapCentralUI/MapInboundAPI stubs
- WP-0.11: 12 per-component options classes with config binding
- WP-0.12: Sample appsettings for central and site topologies
- Add execution procedure and checklist template to generate_plans.md
- Add phase-0-checklist.md for execution tracking
- Resolve all 21 open questions from plan generation
- Update IDataConnection with batch ops and IAsyncDisposable
57 tests pass, zero warnings.
2026-03-16 18:59:07 -04:00
Joseph Doherty
021817930b Generate all 11 phase implementation plans with bullet-level requirement traceability
All phases (0-8) now have detailed implementation plans with:
- Bullet-level requirement extraction from HighLevelReqs sections
- Design constraint traceability (KDD + Component Design)
- Work packages with acceptance criteria mapped to every requirement
- Split-section ownership verified across phases
- Orphan checks (forward, reverse, negative) all passing
- Codex MCP (gpt-5.4) external verification completed per phase

Total: 7,549 lines across 11 plan documents, ~160 work packages,
~400 requirements traced, ~25 open questions logged for follow-up.
2026-03-16 15:34:54 -04:00
Joseph Doherty
a9fa74d5ac Document LmxProxy protocol in DCL, strengthen plan generation traceability guards, and add UI constraints
- Replace "custom protocol" placeholder with full LmxProxy details (gRPC transport, SDK API mapping, session management, keep-alive, TLS, batch ops)
- Add bullet-level requirement traceability, design constraint traceability (52 KDD + 6 CD), split-section tracking, and post-generation orphan check to plan framework
- Resolve Q9 (LmxProxy), Q11 (REST test server), Q13 (solo dev), Q14 (self-test), Q15 (Machine Data DB out of scope)
- Set Central UI constraints: Blazor Server + Bootstrap only, no heavy frameworks, custom components, clean corporate design
2026-03-16 15:08:57 -04:00
Joseph Doherty
652378b470 Add test infrastructure with Docker services, CLI tools, and resolve Phase 0 questions
Stand up local dev infrastructure (OPC UA, LDAP, MS SQL) with Docker Compose,
Python CLI tools for service interaction, and teardown script. Fix GLAuth config
mount, OPC PLC node format, and document actual DN/namespace behavior discovered
during testing. Resolve Q1-Q8, Q10: .NET 10, Akka.NET 1.5.x, monorepo with slnx,
appsettings JWT, Windows Server 2022 site target.
2026-03-16 14:03:12 -04:00
Joseph Doherty
7a0bd0f701 Create implementation plan generation framework
generate_plans.md: Master plan defining 11 phases (0, 1, 2, 3A, 3B, 3C, 4-8)
with component assignments, sub-tasks, testable outcomes, and HighLevelReqs
coverage. Phase 3 split into 3A (runtime foundation + failover), 3B (site I/O
+ observability), 3C (deployment pipeline + S&F) per Codex review. Failover
testing embedded in runtime phases, not deferred to hardening.

requirements-traceability.md: Full matrix mapping all 54 HighLevelReqs sections
and 22 REQ-* identifiers to implementation phases. Zero unmapped requirements.

questions.md: 15 open questions requiring follow-up before/during implementation
(tooling, environments, team, integration targets).
2026-03-16 09:59:23 -04:00
Joseph Doherty
a540912782 Refine Notification Service: SMTP config, OAuth2, delivery behavior, error handling
Expand SMTP configuration with OAuth2 Client Credentials support for Microsoft 365,
connection timeout, and max concurrent connections. Single email per send with all
recipients in BCC. Plain text only. Classify SMTP errors: transient (4xx/connection)
to S&F, permanent (5xx) returned to script. No app-level rate limiting.
2026-03-16 08:38:38 -04:00
Joseph Doherty
cd03b77913 Refine Inbound API: HTTP contract, extended types, logging, rate limiting
Define POST /api/{methodName} URL structure with X-API-Key header. Flat JSON
request/response with no envelope wrapper. Add extended type system (Object, List)
for complex API parameters and return values, applied to both Inbound API and
External System Gateway method definitions. Only failures logged; no rate limiting
in this controlled industrial environment.
2026-03-16 08:26:04 -04:00
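A minimal sketch of the request shape this commit describes (POST /api/{methodName}, X-API-Key header, flat JSON body with no envelope). The helper name, host, method name, and the Content-Type header are assumptions for illustration; the request is only built, not sent.

```python
import json
import urllib.request


def build_inbound_request(base_url: str, method_name: str, api_key: str, params: dict):
    """Build (but do not send) an Inbound API call.

    Hypothetical helper: only the URL structure, the X-API-Key header, and
    the flat JSON body come from the commit message.
    """
    body = json.dumps(params).encode("utf-8")  # flat JSON, no envelope wrapper
    return urllib.request.Request(
        url=f"{base_url}/api/{method_name}",
        data=body,
        method="POST",
        headers={
            "X-API-Key": api_key,
            "Content-Type": "application/json",  # assumed, not stated in the commit
        },
    )


# Hypothetical host and method name, for illustration only.
req = build_inbound_request(
    "https://central.example", "GetStatus", "secret", {"siteId": "site-a"}
)
```

Passing `req` to `urllib.request.urlopen` would perform the call; the sketch stops short of that so it runs without a server.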
Joseph Doherty
cbc78465e0 Refine Security & Auth: LDAP bind, JWT sessions, idle timeout, failure handling
Replace Windows Integrated Auth with direct LDAP bind (username/password login form).
Add JWT-based sessions with HMAC-SHA256 shared key for load balancer compatibility.
15-minute token refresh re-queries LDAP for current group memberships. 30-minute
configurable idle timeout. LDAP failure: new logins fail, active sessions continue
with current roles until LDAP recovers.
2026-03-16 08:16:29 -04:00
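The timing rules in this commit (15-minute token refresh, 30-minute idle timeout) can be sketched as a small decision function. The `Session` fields and the `next_action` name are hypothetical, and the inclusive boundaries are an assumption; the commit gives only the two durations.

```python
from dataclasses import dataclass

REFRESH_INTERVAL_S = 15 * 60  # refresh re-queries LDAP for current group memberships
IDLE_TIMEOUT_S = 30 * 60      # default idle timeout (configurable per the commit)


@dataclass
class Session:
    issued_at: float      # when the current token was issued (hypothetical field)
    last_activity: float  # time of the last user activity (hypothetical field)


def next_action(session: Session, now: float) -> str:
    """Return 'expire', 'refresh', or 'ok' for the session at time `now`."""
    # Idle timeout is checked first: an idle session is not refreshed.
    if now - session.last_activity >= IDLE_TIMEOUT_S:
        return "expire"
    if now - session.issued_at >= REFRESH_INTERVAL_S:
        return "refresh"
    return "ok"
```

Checking the idle timeout before the refresh interval is a design choice of this sketch, so that an abandoned session is never kept alive by refreshes.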
Joseph Doherty
57eae0c1db Refine Health Monitoring: timing defaults, offline detection, error rate calculation
Set 30-second report interval with 60-second absolute timeout for offline detection.
Define error rates as raw counts per interval (reset after each report). Script errors
include all failure types. Automatic online recovery on first received report. Flat
snapshot report structure.
2026-03-16 08:10:16 -04:00
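The offline detection and automatic recovery described here can be sketched as follows. The class and field names are hypothetical, and treating "timeout" as strictly more than 60 seconds since the last report is an assumption.

```python
from dataclasses import dataclass

REPORT_INTERVAL_S = 30.0  # from the commit: sites report every 30 seconds
OFFLINE_TIMEOUT_S = 60.0  # from the commit: absolute timeout for offline detection


@dataclass
class SiteHealth:
    last_report_at: float  # seconds on a monotonic clock (hypothetical field)
    online: bool = True

    def on_report(self, now: float) -> None:
        """The first received report brings the site back online."""
        self.last_report_at = now
        self.online = True

    def check(self, now: float) -> bool:
        """Mark the site offline once no report arrived within the timeout."""
        if now - self.last_report_at > OFFLINE_TIMEOUT_S:
            self.online = False
        return self.online
```

With these defaults a site can miss one report and stay online; it goes offline only after two full report intervals pass without a report.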
Joseph Doherty
3dd62adf42 Refine Cluster Infrastructure: split-brain, seed nodes, failure detection, dual recovery
Add keep-oldest split-brain resolver with 15s stable-after duration. Configure both
nodes as seed nodes for symmetric startup. Set moderate failure detection defaults
(2s heartbeat, 10s threshold, ~25s total failover). Document automatic dual-node
recovery from persistent storage with no manual intervention.
2026-03-16 08:07:28 -04:00
Joseph Doherty
bd735de8c4 Refine Communication Layer: timeouts, transport config, ordering, failure behavior
Add per-pattern message timeouts with sensible defaults (120s for deployments, 30s
for queries/commands). Configure Akka.NET transport heartbeat explicitly rather than
relying on framework defaults. Document per-site message ordering guarantee. Specify
that in-flight messages on disconnect result in timeout error (no central buffering)
and debug streams die on any disconnect.
2026-03-16 08:04:06 -04:00
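A minimal sketch of per-pattern timeouts, using the defaults from this commit (120s for deployments, 30s for queries and commands). The `ask` helper and the `send` callable are hypothetical; the real system uses Akka.NET, and asyncio stands in here only to show that an unanswered message surfaces as a timeout error rather than being buffered.

```python
import asyncio

# Defaults taken from the commit message; the pattern names are assumptions.
TIMEOUTS_S = {"deployment": 120.0, "query": 30.0, "command": 30.0}


async def ask(pattern: str, send, timeouts=TIMEOUTS_S):
    """Await a reply within the pattern's timeout.

    `send` is a hypothetical zero-argument callable returning an awaitable
    reply. If no reply arrives in time, asyncio.TimeoutError propagates to
    the caller.
    """
    return await asyncio.wait_for(send(), timeout=timeouts[pattern])
```

Callers can pass a different `timeouts` mapping to override the defaults, for example in tests.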
Joseph Doherty
1ef316f32c Add dual call modes for external systems: synchronous Call() and cached CachedCall()
Scripts now choose per invocation whether an external system call is synchronous
(all failures return to script) or cached (transient failures go to store-and-forward).
Mirrors the existing Database.Connection/CachedWrite pattern. Updated ESG, Site
Runtime script API, high-level requirements, and design doc.
2026-03-16 08:00:20 -04:00
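The two call modes can be sketched like this. All names are hypothetical stand-ins for the script API described in the commit, the list stands in for the store-and-forward buffer, and returning `None` for a buffered call is an assumption.

```python
class TransientError(Exception):
    """A failure that may succeed on retry (hypothetical type)."""


class PermanentError(Exception):
    """A failure that will not succeed on retry (hypothetical type)."""


class ExternalSystem:
    def __init__(self, invoke):
        self._invoke = invoke        # callable(method, payload) doing the real call
        self.store_and_forward = []  # stand-in for the store-and-forward buffer

    def call(self, method, payload):
        """Synchronous mode: every failure propagates to the script."""
        return self._invoke(method, payload)

    def cached_call(self, method, payload):
        """Cached mode: transient failures are buffered for later delivery."""
        try:
            return self._invoke(method, payload)
        except TransientError:
            self.store_and_forward.append((method, payload))
            return None  # assumed: the script gets no result for a buffered call
```

Permanent failures still propagate from `cached_call`, matching the commit's statement that only transient failures go to store-and-forward.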
Joseph Doherty
5fff1712a8 Refine External System Gateway: protocol, auth, timeouts, error classification
Specify HTTP/REST with JSON as the invocation protocol. Add API key and Basic Auth
as outbound authentication modes. Add per-system call timeouts. Classify errors by
HTTP status for store-and-forward decisions (5xx/transient → retry, 4xx → permanent
error to script). Document ADO.NET connection pooling for database connections.
Update Store-and-Forward to clarify transient-only buffering.
2026-03-16 07:57:00 -04:00
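The status-based classification in this commit can be sketched as a single function. The names are hypothetical. Two points are assumptions the commit does not spell out: a missing response (timeout or connection failure) counts as transient, and any status below 400 counts as success.

```python
from enum import Enum
from typing import Optional


class Outcome(Enum):
    SUCCESS = "success"
    RETRY = "retry"          # transient: eligible for store-and-forward
    PERMANENT = "permanent"  # returned to the calling script as an error


def classify_http_result(status: Optional[int]) -> Outcome:
    """Classify an outbound call by HTTP status code.

    `status` is None when no response arrived at all, which this sketch
    treats as a transient failure.
    """
    if status is None or status >= 500:
        return Outcome.RETRY
    if status >= 400:
        return Outcome.PERMANENT
    return Outcome.SUCCESS
```

For example, `classify_http_result(503)` yields `Outcome.RETRY`, while `classify_http_result(404)` yields `Outcome.PERMANENT`.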
Joseph Doherty
19c7e6880f Refine Data Connection Layer: error handling, reconnection, write failures, health reporting
Add connection lifecycle (fixed-interval auto-reconnect, immediate bad quality on
disconnect, transparent re-subscribe), synchronous write failure errors to scripts,
periodic tag path resolution retry, and enhanced health reporting with tag resolution
counts. Update cross-references in Health Monitoring and Site Runtime.
2026-03-16 07:51:37 -04:00