From 81339633d9c8cad873e501a72ef9da4e6e97aa03 Mon Sep 17 00:00:00 2001 From: Joseph Doherty Date: Sun, 26 Apr 2026 15:19:17 -0400 Subject: [PATCH] Add gateway implementation planning docs --- .gitignore | 148 ++++ AGENTS.md | 157 +++- StyleGuide.md | 282 +++++++ docs/client-libraries-design.md | 389 ++++++++++ docs/clients-dotnet-csharp-design.md | 193 +++++ docs/clients-golang-design.md | 172 +++++ docs/clients-java-design.md | 191 +++++ docs/clients-python-design.md | 191 +++++ docs/clients-rust-design.md | 183 +++++ docs/design-decisions.md | 309 ++++++++ docs/gateway-dashboard-design.md | 364 +++++++++ docs/gateway-process-design.md | 774 ++++++++++++++++++++ docs/implementation-plan-clients.md | 387 ++++++++++ docs/implementation-plan-gateway.md | 511 +++++++++++++ docs/implementation-plan-index.md | 100 +++ docs/implementation-plan-mxaccess-worker.md | 450 ++++++++++++ docs/mxaccess-worker-instance-design.md | 636 ++++++++++++++++ docs/style-guides/CSharpStyleGuide.md | 76 ++ docs/style-guides/GoStyleGuide.md | 68 ++ docs/style-guides/JavaStyleGuide.md | 65 ++ docs/style-guides/ProtobufStyleGuide.md | 64 ++ docs/style-guides/PythonStyleGuide.md | 68 ++ docs/style-guides/RustStyleGuide.md | 65 ++ docs/toolchain-links.md | 172 +++++ gateway.md | 65 +- 25 files changed, 6069 insertions(+), 11 deletions(-) create mode 100644 .gitignore create mode 100644 StyleGuide.md create mode 100644 docs/client-libraries-design.md create mode 100644 docs/clients-dotnet-csharp-design.md create mode 100644 docs/clients-golang-design.md create mode 100644 docs/clients-java-design.md create mode 100644 docs/clients-python-design.md create mode 100644 docs/clients-rust-design.md create mode 100644 docs/design-decisions.md create mode 100644 docs/gateway-dashboard-design.md create mode 100644 docs/gateway-process-design.md create mode 100644 docs/implementation-plan-clients.md create mode 100644 docs/implementation-plan-gateway.md create mode 100644 
docs/implementation-plan-index.md create mode 100644 docs/implementation-plan-mxaccess-worker.md create mode 100644 docs/mxaccess-worker-instance-design.md create mode 100644 docs/style-guides/CSharpStyleGuide.md create mode 100644 docs/style-guides/GoStyleGuide.md create mode 100644 docs/style-guides/JavaStyleGuide.md create mode 100644 docs/style-guides/ProtobufStyleGuide.md create mode 100644 docs/style-guides/PythonStyleGuide.md create mode 100644 docs/style-guides/RustStyleGuide.md create mode 100644 docs/toolchain-links.md diff --git a/.gitignore b/.gitignore new file mode 100644 index 0000000..c6c3a4a --- /dev/null +++ b/.gitignore @@ -0,0 +1,148 @@ +# OS files +.DS_Store +.DS_Store? +._* +Thumbs.db +ehthumbs.db +Desktop.ini +$RECYCLE.BIN/ + +# Editor and IDE state +.vs/ +.vscode/* +!.vscode/extensions.json +!.vscode/launch.json +!.vscode/settings.json +!.vscode/tasks.json +.idea/ +*.suo +*.user +*.userosscache +*.sln.docstates +*.rsuser +*.DotSettings.user + +# Local environment and secrets +.env +.env.* +!.env.example +*.local +*.secret +*.secrets +secrets.json + +# Logs and crash dumps +*.log +logs/ +*.dmp +*.dump +*.mdmp + +# Build artifacts +artifacts/ +dist/ +build/ +out/ +tmp/ +temp/ + +# .NET +**/bin/ +**/obj/ +TestResults/ +*.trx +*.coverage +*.coveragexml +coverage/ +packages/ +*.nupkg +*.snupkg +project.lock.json +project.assets.json +*.nuget.props +*.nuget.targets + +# Go +clients/go/bin/ +*.test +coverage.out +coverage.html +go.work.sum + +# Rust +target/ +**/target/ +*.profraw + +# Python +__pycache__/ +*.py[cod] +*$py.class +.Python +.venv/ +venv/ +env/ +ENV/ +.pytest_cache/ +.ruff_cache/ +.mypy_cache/ +.pyre/ +.tox/ +.nox/ +.coverage +.coverage.* +htmlcov/ +pip-wheel-metadata/ +*.egg-info/ +*.egg + +# Java, Maven, and Gradle +.gradle/ +**/target/ +**/build/ +*.class +*.jar +!**/gradle/wrapper/gradle-wrapper.jar +*.war +*.ear +*.nar +hs_err_pid* +replay_pid* + +# Node tooling, used by frontend or documentation tools if added +node_modules/ 
+npm-debug.log* +yarn-debug.log* +yarn-error.log* +pnpm-debug.log* +.parcel-cache/ +.next/ +.nuxt/ +.svelte-kit/ + +# Protobuf and generated build scratch +*.protobin +*.protodesc +*.pb.tmp +generated-scratch/ + +# Local database and service state +*.db +*.db-shm +*.db-wal +*.sqlite +*.sqlite3 +*.bak +*.ldf +*.mdf + +# Archives and packages produced locally +*.zip +*.7z +*.tar +*.tar.gz +*.tgz +*.rar + +# Keep empty directories with .gitkeep files when needed +!.gitkeep diff --git a/AGENTS.md b/AGENTS.md index 447b709..ed7d5f1 100644 --- a/AGENTS.md +++ b/AGENTS.md @@ -7,6 +7,14 @@ without requiring those clients to load MXAccess COM, run as x86, or own an STA message pump. Treat the installed MXAccess COM component as the compatibility baseline. +Toolchain paths, versions, and external analysis locations are recorded in +`docs/toolchain-links.md`. Use that file before searching for compilers, +runtimes, protobuf tools, MXAccess notes, or Galaxy Repository SQL notes. + +Implementation planning is recorded in `docs/implementation-plan-index.md`. +Follow the order there unless the user explicitly reprioritizes: gateway first, +MXAccess worker instance second, clients third. + ## Core Contract Preserve MXAccess behavior first: @@ -54,6 +62,25 @@ default design. - Worker process model: one external client session maps to one worker by default. 
+## Style Guides + +Follow the project documentation guide and the language guide for every changed +area: + +| Area | Style guide | +|------|-------------| +| Documentation | `StyleGuide.md` | +| Gateway, worker, .NET client, and C# tests | `docs/style-guides/CSharpStyleGuide.md` | +| Public gRPC and worker IPC contracts | `docs/style-guides/ProtobufStyleGuide.md` | +| Go client | `docs/style-guides/GoStyleGuide.md` | +| Rust client | `docs/style-guides/RustStyleGuide.md` | +| Python client | `docs/style-guides/PythonStyleGuide.md` | +| Java client | `docs/style-guides/JavaStyleGuide.md` | + +When a change crosses languages, apply every affected style guide. Generated +code follows its generator output; do not hand-edit it to match handwritten +style. + ## Expected Layout Prefer this structure unless there is a strong reason to adjust it: @@ -70,6 +97,7 @@ src/MxGateway.Server/ Sessions/ Workers/ Grpc/ + Dashboard/ Metrics/ src/MxGateway.Worker/ @@ -90,6 +118,21 @@ src/MxGateway.Worker.Tests/ src/MxGateway.IntegrationTests/ optional live MXAccess tests + +clients/dotnet/ + .NET 10 C# client library, test CLI, and tests + +clients/go/ + Go client module, test CLI, and tests + +clients/rust/ + Rust client crate, test CLI, and tests + +clients/python/ + Python client package, test CLI, and tests + +clients/java/ + Java client library, test CLI, and tests ``` The contracts project may multi-target, or the `.proto` files may be shared as @@ -159,6 +202,79 @@ Command replies should include protocol status, COM HRESULT if available, MXAccess return values, method-specific out parameters, and status arrays where the MXAccess method emits them. +## Galaxy Repository SQL Discovery + +Galaxy tags, hierarchy, and attribute details can be queried from the AVEVA / +Wonderware System Platform Galaxy Repository SQL Server database. 
Use this as a +discovery and metadata path only; runtime MXAccess parity still belongs to the +MXAccess-backed worker unless an explicit non-parity backend is being designed. + +Full notes, schema details, screenshots, and query examples are in: + +```text +C:\Users\dohertj2\Desktop\lmxopcua\gr +``` + +Important files in that notes directory: + +- `connectioninfo.md` - SQL Server connection details and `sqlcmd` usage. +- `layout.md` - hierarchy vs `tag_name` relationship. +- `build_layout_plan.md` - extraction plan for hierarchy and attributes. +- `schema.md` and `ddl/` - Galaxy Repository schema reference. +- `queries/hierarchy.sql` - deployed object hierarchy. +- `queries/attributes.sql` - user-defined dynamic attributes. +- `queries/attributes_extended.sql` - system plus user-defined attributes. +- `queries/change_detection.sql` - deployment-change polling via + `galaxy.time_of_last_deploy`. + +Current documented connection is SQL Server `localhost`, database `ZB`, Windows +Auth. Example: + +```powershell +sqlcmd -S localhost -d ZB -E -Q "SELECT time_of_last_deploy FROM galaxy;" +``` + +Key tables from the notes are `gobject`, `template_definition`, +`dynamic_attribute`, `attribute_definition`, `primitive_instance`, and +`galaxy`. The hierarchy uses contained names for human-readable browsing, while +runtime tag references use globally unique `tag_name` values such as +`.`. + +## MXAccess Analysis Source + +Use the local MXAccess analysis project when answering questions about installed +MXAccess classes, interfaces, fields, events, HRESULT/status behavior, value +projection, captures, and parity gaps: + +```text +C:\Users\dohertj2\Desktop\mxaccess +``` + +Primary files: + +- `README.md` - overview of available analysis and capture artifacts. +- `docs/MXAccess-Public-API.md` - COM class, ProgID, CLSID, method list, + event signatures, `MxDataType`, `MxStatus`, and `MXSTATUS_PROXY`. 
+- `docs/MXAccess-Reverse-Engineering.md` - installed runtime path and x86 COM + constraints. +- `docs/Current-Sprint-State.md` and `docs/DotNet10-Native-Library-Plan.md` - + current parity gaps and managed native-client research status. +- `src/MxTraceHarness/` - x86 MXAccess harness examples using the real COM + interop assembly. +- `captures/` and `analysis/` - observed native behavior and generated + reverse-engineering artifacts. + +Concrete MXAccess COM target from the analysis: + +- class: `ArchestrA.MxAccess.LMXProxyServerClass` +- CLSID: `{C30B52F5-2CB5-4760-AF0A-3A344A7EB5DC}` +- ProgID: `LMXProxy.LMXProxyServer.1` +- version-independent ProgID: `LMXProxy.LMXProxyServer` +- registered server: `C:\Program Files (x86)\ArchestrA\Framework\Bin\LmxProxy.dll` +- interop assembly: + `C:\Program Files (x86)\ArchestrA\Framework\Bin\ArchestrA.MXAccess.dll` +- threading model: `Apartment` + ## Worker Rules Each worker owns: @@ -210,6 +326,7 @@ identity and launched worker identity. Prefer a per-session nonce handshake. The gateway is responsible for: - public TCP/gRPC API, +- Blazor Server dashboard using Bootstrap CSS/JS only, - authn/authz when needed, - session creation and teardown, - worker launch and lifecycle management, @@ -223,6 +340,11 @@ The gRPC layer should stay thin: validate request, find session, call the session worker client, map worker replies to public replies, and stream events. Keep MXAccess-specific translation logic testable outside the gRPC handlers. +Dashboard code should also stay thin and read-only for v1. Use a snapshot +service over session/worker/metrics state; do not let Razor components mutate +gateway sessions or workers directly. Do not use MudBlazor or other Blazor UI +component libraries. + Gateway restart should not try to reattach old workers in the first version. Terminate orphaned workers on startup if that behavior is implemented. @@ -305,6 +427,40 @@ Known important parity areas: behavior. 
- STA message pumping is required for event delivery. +## Source Update Workflow + +When source code changes, build the affected component before handing work +back. If the change crosses component boundaries, build each affected component +instead of relying on a single top-level build. + +Use the native build and test command for each changed area: + +| Changed area | Required verification | +|--------------|-----------------------| +| Contracts or `.proto` files | regenerate generated code, then build gateway, worker, and every generated client touched by the contract | +| Gateway server, sessions, workers, gRPC, dashboard, or metrics | build the .NET 10 gateway project and run affected gateway or fake-worker tests | +| Worker IPC, STA, MXAccess, or conversion code | build the .NET Framework 4.8 x86 worker project and run affected worker tests | +| Shared test infrastructure | run every test suite that consumes the changed helpers | +| .NET client | build the .NET client library, CLI, and tests | +| Go client | run Go formatting, build, and tests for the Go module | +| Rust client | run Rust formatting, build or check, and tests for the Rust crate | +| Python client | run Python formatting or linting if configured, package/build checks, and tests | +| Java client | build the Java client library, CLI, and tests | +| Integration tests | run them only when the required MXAccess COM component, provider state, and external services are available; otherwise document why they were skipped | + +Update affected documentation in the same change as the source update. This +includes `gateway.md`, component design docs under `docs/`, client docs, API +contract notes, test instructions, and operational guidance. Documentation must +follow `StyleGuide.md`: write technical present-tense prose, explain the reason +for non-obvious choices, use exact code names, specify languages on code +blocks, use relative links for internal docs, and avoid stale temporary notes. 
+Source code and contract changes must also follow the relevant language guide +from the Style Guides section. + +Do not leave documentation describing old behavior after changing public APIs, +contracts, configuration, build steps, security behavior, event shapes, value +conversion, status mapping, lifecycle rules, or client semantics. + ## Implementation Priority Build the smallest end-to-end slice first: @@ -323,4 +479,3 @@ Build the smallest end-to-end slice first: That slice proves the high-risk requirements: process isolation, STA ownership, message pumping, command routing, and event streaming. - diff --git a/StyleGuide.md b/StyleGuide.md new file mode 100644 index 0000000..ad60857 --- /dev/null +++ b/StyleGuide.md @@ -0,0 +1,282 @@ +# Documentation Style Guide + +This guide defines writing conventions and formatting rules for all ScadaBridge documentation. + +## Tone and Voice + +### Be Technical and Direct + +Write for developers who are familiar with .NET. Don't explain basic concepts like dependency injection or async/await unless they're used in an unusual way. + +**Good:** +> The `ScadaGatewayActor` routes messages to the appropriate `ScadaClientActor` based on the client ID in the message. + +**Avoid:** +> The ScadaGatewayActor is a really powerful component that helps manage all your SCADA connections efficiently! + +### Explain "Why" Not Just "What" + +Document the reasoning behind patterns and decisions, not just the mechanics. + +**Good:** +> Health checks use a 5-second timeout because actors under heavy load may take several seconds to respond, but longer delays indicate a real problem. + +**Avoid:** +> Health checks use a 5-second timeout. + +### Use Present Tense + +Describe what the code does, not what it will do. + +**Good:** +> The actor validates the message before processing. + +**Avoid:** +> The actor will validate the message before processing. + +### No Marketing Language + +This is internal technical documentation. 
Avoid superlatives and promotional language. + +**Avoid:** "powerful", "robust", "cutting-edge", "seamless", "blazing fast" + +## Formatting Rules + +### File Names + +Use `PascalCase.md` for all documentation files: +- `Overview.md` +- `HealthChecks.md` +- `StateMachines.md` +- `SignalR.md` + +### Headings + +- **H1 (`#`):** Document title only, Title Case +- **H2 (`##`):** Major sections, Title Case +- **H3 (`###`):** Subsections, Sentence case +- **H4+ (`####`):** Rarely needed, Sentence case + +```markdown +# Actor Health Checks + +## Configuration Options + +### Setting the timeout + +#### Default values +``` + +### Code Blocks + +Always specify the language: + +````markdown +```csharp +public class MyActor : ReceiveActor { } +``` + +```json +{ + "Setting": "value" +} +``` + +```bash +dotnet build +``` +```` + +Supported languages: `csharp`, `json`, `bash`, `xml`, `sql`, `yaml`, `html`, `css`, `javascript` + +### Code Snippets + +**Length:** 5-25 lines is typical. Shorter for simple concepts, longer for complete examples. + +**Context:** Include enough to understand where the code lives: + +```csharp +// Good - shows class context +public class TemplateInstanceActor : ReceiveActor +{ + public TemplateInstanceActor(TemplateInstanceConfig config) + { + Receive(Handle); + } +} + +// Avoid - orphaned snippet +Receive(Handle); +``` + +**Accuracy:** Only use code that exists in the codebase. Never invent examples. + +### Lists + +Use bullet points for unordered items: +```markdown +- First item +- Second item +- Third item +``` + +Use numbers for sequential steps: +```markdown +1. Do this first +2. Then do this +3. 
Finally do this +``` + +### Tables + +Use tables for structured reference information: + +```markdown +| Option | Default | Description | +|--------|---------|-------------| +| `Timeout` | `5000` | Milliseconds to wait | +| `RetryCount` | `3` | Number of retry attempts | +``` + +### Inline Code + +Use backticks for: +- Class names: `ScadaGatewayActor` +- Method names: `HandleMessage()` +- File names: `appsettings.json` +- Configuration keys: `ScadaBridge:Timeout` +- Command-line commands: `dotnet build` + +### Links + +Use relative paths for internal documentation: +```markdown +[See the Actors guide](../Akka/Actors.md) +[Configuration options](./Configuration.md) +``` + +Use descriptive link text: +```markdown + +See the [Actor Health Checks](../Akka/HealthChecks.md) documentation. + + +See [here](../Akka/HealthChecks.md) for more. +``` + +## Structure Conventions + +### Document Opening + +Every document starts with: +1. H1 title +2. 1-2 sentence description of purpose + +```markdown +# Actor Health Checks + +Health checks monitor actor responsiveness and report status to the ASP.NET Core health check system. +``` + +### Section Organization + +Organize content from general to specific: +1. Overview/introduction +2. Key concepts (if needed) +3. Basic usage +4. Advanced usage +5. Configuration +6. Troubleshooting +7. Related documentation + +### Code Example Placement + +Place code examples immediately after the concept they illustrate: + +```markdown +## Message Handling + +Actors process messages using `Receive` handlers: + +```csharp +Receive(msg => HandleMyMessage(msg)); +``` + +Each handler processes one message type... 
+``` + +### Related Documentation Section + +End each document with links to related topics: + +```markdown +## Related Documentation + +- [Actor Patterns](./Patterns.md) +- [Health Checks](../Operations/HealthChecks.md) +- [Configuration](../Configuration/Akka.md) +``` + +## Naming Conventions + +### Match Code Exactly + +Use the exact names from source code: +- `TemplateInstanceActor` not "Template Instance Actor" +- `ScadaGatewayActor` not "SCADA Gateway Actor" +- `IRequiredActor` not "required actor interface" + +### Acronyms + +Spell out on first use, then use acronym: +> OPC Unified Architecture (OPC UA) provides industrial communication standards. OPC UA servers expose... + +Common acronyms that don't need expansion: +- API +- JSON +- SQL +- HTTP/HTTPS +- REST +- JWT +- UI + +### File Paths + +Use forward slashes and backticks: +- `src/Infrastructure/Akka/Actors/` +- `appsettings.json` +- `Documentation/Akka/Overview.md` + +## What to Avoid + +### Don't Document the Obvious + +```markdown + +## Constructor + +The constructor creates a new instance of the class. + + +## Constructor + +The constructor accepts an `IActorRef` for the gateway actor, which must be resolved before actor creation. +``` + +### Don't Duplicate Source Code Comments + +If code has good comments, reference the file rather than copying: +> See `ScadaGatewayActor.cs` lines 45-60 for the message routing logic. + +### Don't Include Temporary Information + +Avoid dates, version numbers, or "coming soon" notes that will become stale. 
+ +### Don't Over-Explain .NET Basics + +Assume readers know: +- Dependency injection +- async/await +- LINQ +- Entity Framework basics +- ASP.NET Core middleware pipeline diff --git a/docs/client-libraries-design.md b/docs/client-libraries-design.md new file mode 100644 index 0000000..e5e7dad --- /dev/null +++ b/docs/client-libraries-design.md @@ -0,0 +1,389 @@ +# Client Libraries Detailed Design + +## Purpose + +This document defines the shared design for official MXAccess Gateway gRPC +clients. Each supported language should provide: + +- a reusable client library, +- a test CLI built on that library, +- unit tests that run without a live gateway, +- optional integration tests against a live gateway. + +Target client languages: + +- .NET 10 C# +- Go +- Rust +- Python +- Java + +Language-specific plans: + +- `docs/clients-dotnet-csharp-design.md` +- `docs/clients-golang-design.md` +- `docs/clients-rust-design.md` +- `docs/clients-python-design.md` +- `docs/clients-java-design.md` + +Language style guides: + +| Client | Style guide | +|--------|-------------| +| .NET C# | [C# Style Guide](./style-guides/CSharpStyleGuide.md) | +| Go | [Go Style Guide](./style-guides/GoStyleGuide.md) | +| Rust | [Rust Style Guide](./style-guides/RustStyleGuide.md) | +| Python | [Python Style Guide](./style-guides/PythonStyleGuide.md) | +| Java | [Java Style Guide](./style-guides/JavaStyleGuide.md) | +| Generated protobuf/gRPC code | [Protobuf Style Guide](./style-guides/ProtobufStyleGuide.md) | + +## Goals + +Client libraries should make the gateway pleasant to consume without hiding +MXAccess behavior. 
+ +Goals: + +- expose sessions as first-class objects, +- support unary `OpenSession`, `CloseSession`, and `Invoke`, +- support server-streaming `StreamEvents`, +- attach API key auth metadata to every call, +- preserve gateway, worker, COM, HRESULT, and MXAccess status detail, +- provide method-specific command helpers, +- provide raw command escape hatches for parity work, +- provide deterministic test CLIs for smoke and integration testing, +- keep generated protobuf/gRPC code separate from handwritten wrappers. + +Non-goals for v1: + +- client-side reconnectable sessions, +- client-side event replay, +- client-side command batching, +- synthetic MXAccess events, +- hiding MXAccess handles behind opaque client-only handles. + +## Public Client Concepts + +All languages should expose the same core concepts, using idiomatic naming: + +- gateway client, +- session, +- command request, +- command reply, +- event stream, +- MXAccess event, +- MX value, +- MX status proxy, +- gateway error, +- client options. + +The gateway session id and MXAccess handles must remain visible. The library may +offer helper methods, but it must not invent alternate handle semantics. + +## Shared API Shape + +Each language should support this conceptual API: + +```text +client = GatewayClient.connect(endpoint, apiKey, options) +session = client.openSession(options) + +serverHandle = session.register(clientName) +itemHandle = session.addItem(serverHandle, itemReference) +session.advise(serverHandle, itemHandle) + +events = session.streamEvents() +session.write(serverHandle, itemHandle, value, userId) + +session.close() +client.close() +``` + +Each library should also expose lower-level calls: + +```text +client.openSession(rawRequest) +client.closeSession(rawRequest) +client.invoke(rawCommandRequest) +client.streamEvents(rawStreamRequest) +``` + +## Authentication + +The gateway uses API key auth for v1. 
Clients should support: + +```text +authorization: Bearer mxgw__ +``` + +Rules: + +- Do not log API keys. +- Redact keys in CLI error output. +- Allow API key from command line, environment variable, or config object. +- Recommended environment variable: `MXGATEWAY_API_KEY`. +- Attach auth metadata to every unary and streaming call. +- Treat `Unauthenticated` and `PermissionDenied` distinctly. + +## TLS + +Clients should support: + +- plaintext for local development, +- TLS with system roots, +- TLS with custom CA file, +- optional server name override for test environments. + +Default should be secure for packaged production examples, but the test CLI may +default to plaintext when endpoint is `localhost` or `127.0.0.1`. + +## Timeouts And Cancellation + +Each client library should support: + +- connect timeout, +- unary call timeout, +- command timeout passed to gateway when the public API supports it, +- stream cancellation, +- graceful session close timeout. + +Language wrappers should map cancellation to the native ecosystem: + +- .NET: `CancellationToken` +- Go: `context.Context` +- Rust: `tokio` cancellation / dropped future plus explicit timeout +- Python: `asyncio` task cancellation and deadlines +- Java: `Deadline`, `CompletableFuture`, and stream cancellation + +Canceling a client call does not imply the worker COM call was aborted. Client +docs and errors must make that clear. + +## Error Model + +Each client should distinguish: + +- transport errors, +- authentication/authorization errors, +- gateway session errors, +- worker process/protocol errors, +- MXAccess command failures, +- COM HRESULT/status failures. + +Generated gRPC errors should not be the only error surface. The wrapper should +return rich command replies when the gateway reached MXAccess and MXAccess +returned HRESULT/status information. 
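The "rich command reply" idea above can be illustrated with a minimal Python sketch. It assumes a hypothetical `CommandReply` wrapper with `protocol_ok`, `protocol_message`, and `hresult` fields; none of these names come from the gateway contract. The only external fact it relies on is that a COM HRESULT signals failure when its high bit is set.

```python
from dataclasses import dataclass
from typing import Optional


class GatewayError(Exception):
    """Base class for the illustrative wrapper errors (hypothetical)."""


class ProtocolError(GatewayError):
    """The gateway or worker rejected the command before MXAccess ran."""


class MxAccessError(GatewayError):
    """MXAccess ran the command and reported a failing HRESULT."""

    def __init__(self, hresult: int) -> None:
        # Render as unsigned 32-bit hex, the usual way HRESULTs are written.
        super().__init__(f"MXAccess HRESULT 0x{hresult & 0xFFFFFFFF:08X}")
        self.hresult = hresult


def hresult_failed(hresult: int) -> bool:
    """Mirror the COM FAILED() check: the severity (high) bit is set."""
    return (hresult & 0x80000000) != 0


@dataclass(frozen=True)
class CommandReply:
    """Hypothetical wrapper reply; field names are assumptions."""

    protocol_ok: bool
    protocol_message: str = ""
    hresult: Optional[int] = None

    def ensure_protocol_success(self) -> "CommandReply":
        if not self.protocol_ok:
            raise ProtocolError(self.protocol_message)
        return self

    def ensure_mxaccess_success(self) -> "CommandReply":
        self.ensure_protocol_success()
        if self.hresult is not None and hresult_failed(self.hresult):
            raise MxAccessError(self.hresult)
        return self
```

Returning the reply and letting the caller opt in to raising keeps the HRESULT and status detail available for parity work, instead of collapsing it into a generic gRPC error.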
+ +Recommended high-level error categories: + +```text +TransportError +AuthenticationError +AuthorizationError +SessionError +WorkerError +ProtocolError +CommandError +MxAccessError +TimeoutError +CancelledError +``` + +## Values + +Each language should provide ergonomic conversion helpers for `MxValue`: + +- bool, +- int32, +- int64, +- float, +- double, +- string, +- timestamp, +- typed arrays, +- raw variant fallback. + +The raw protobuf value should always remain accessible. + +Do not lose raw variant metadata when conversion is incomplete. For CLI output, +render both typed projection and raw metadata when present. + +## Events + +Each client should expose event streaming as the idiomatic streaming primitive: + +- .NET: `IAsyncEnumerable` +- Go: receive loop over generated stream +- Rust: `Stream>` +- Python: async iterator +- Java: blocking iterator and async observer variants + +Events must preserve gateway order. Libraries should not reorder, coalesce, or +drop events by default. + +The event surface must include: + +- `OnDataChange` +- `OnWriteComplete` +- `OperationComplete` +- `OnBufferedDataChange` +- terminal session fault when represented as a message + +`OperationComplete` is forwarded only when native MXAccess raises it. +`OnBufferedDataChange` payload conversion may include raw metadata until live +multi-sample buffered payloads are fully validated. + +## Test CLI Contract + +Each language should include a test CLI that exercises the library. The CLI is +not the production gateway server. 
+ +Required commands: + +```text +version +ping +open-session +close-session +register +add-item +advise +stream-events +write +write2 +smoke +``` + +Optional commands: + +```text +add-item2 +add-buffered-item +set-buffered-update-interval +authenticate-user +write-secured +write-secured2 +get-worker-info +metadata-query +``` + +Common CLI flags: + +```text +--endpoint +--api-key +--api-key-env +--plaintext +--tls +--ca-file +--session-id +--client-name +--server-handle +--item-handle +--item +--context +--value +--type +--timeout +--json +--verbose +``` + +The `smoke` command should: + +1. open a session, +2. register a client name, +3. add one item, +4. advise it, +5. optionally write a value, +6. stream events for a bounded duration, +7. close the session. + +CLI output should support JSON for automated tests. + +## Unit Tests + +Unit tests must run without a live gateway. Use fake gRPC services, mock +transports, or generated test servers depending on language. + +Required unit test areas: + +- options parsing, +- auth metadata injection, +- TLS/plaintext channel setup, +- method-specific request construction, +- value conversion, +- status conversion, +- command reply error mapping, +- stream event iteration, +- stream cancellation, +- timeout behavior, +- CLI argument parsing, +- CLI JSON output redaction of secrets. + +## Integration Tests + +Integration tests are optional and should be opt-in. They may require a live +gateway and installed MXAccess on the gateway host. + +Recommended environment variables: + +```text +MXGATEWAY_ENDPOINT +MXGATEWAY_API_KEY +MXGATEWAY_TEST_ITEM +MXGATEWAY_TEST_CONTEXT +MXGATEWAY_TEST_WRITE_VALUE +MXGATEWAY_INTEGRATION=1 +``` + +Integration tests should skip unless `MXGATEWAY_INTEGRATION=1`. 
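A minimal Python sketch of the opt-in gate described above, using only the standard library. The helper name `integration_enabled` and the test class are hypothetical. Treating the endpoint, API key, and test item variables as additionally required is an assumption borrowed from the per-language plans, not a rule stated in this section.

```python
import os
import unittest

# Variables the sketch assumes a live run needs besides the opt-in flag.
REQUIRED_VARS = ("MXGATEWAY_ENDPOINT", "MXGATEWAY_API_KEY", "MXGATEWAY_TEST_ITEM")


def integration_enabled(env=None) -> bool:
    """Return True only when the opt-in flag is "1" and required values are set."""
    env = os.environ if env is None else env
    if env.get("MXGATEWAY_INTEGRATION") != "1":
        return False
    return all(env.get(name) for name in REQUIRED_VARS)


@unittest.skipUnless(
    integration_enabled(),
    "set MXGATEWAY_INTEGRATION=1 and the MXGATEWAY_* variables to run",
)
class SmokeIntegrationTest(unittest.TestCase):
    """Placeholder for a live smoke test; skipped by default."""

    def test_endpoint_is_configured(self) -> None:
        self.assertTrue(os.environ["MXGATEWAY_ENDPOINT"])
```

Passing the environment as a parameter keeps the gate itself unit-testable without a live gateway.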
+ +## Repository Layout + +Recommended top-level layout: + +```text +clients/ + dotnet/ + go/ + rust/ + python/ + java/ +``` + +Each client should contain: + +```text +src or package source +generated protobuf/grpc source +test CLI +unit tests +README.md +examples/ +``` + +Generated code should be reproducible from `src/MxGateway.Contracts/Protos/`. +Do not hand-edit generated code. + +## Versioning + +All clients should expose: + +- client library version, +- supported gateway protocol version, +- generated protobuf version if available. + +Version compatibility should be tested against protocol-version mismatch cases. + +## Documentation + +Each client README should include: + +- install instructions, +- minimal open/register/add/advise example, +- API key configuration, +- TLS configuration, +- CLI examples, +- integration test instructions, +- warning that canceling a client call does not abort an in-flight MXAccess COM + call. diff --git a/docs/clients-dotnet-csharp-design.md b/docs/clients-dotnet-csharp-design.md new file mode 100644 index 0000000..900b63c --- /dev/null +++ b/docs/clients-dotnet-csharp-design.md @@ -0,0 +1,193 @@ +# .NET 10 C# Client Detailed Design + +## Purpose + +Provide an idiomatic .NET 10 C# client library for MXAccess Gateway, plus a test +CLI and unit tests. This client is for modern .NET callers and must not load +MXAccess COM. + +Follow the [C# Style Guide](./style-guides/CSharpStyleGuide.md) for +handwritten code and the [Protobuf Style Guide](./style-guides/ProtobufStyleGuide.md) +for generated contract inputs. 
+ +## Projects + +Recommended layout: + +```text +clients/dotnet/ + MxGateway.Client/ + MxGateway.Client.csproj + GatewayClient.cs + MxGatewaySession.cs + MxGatewayClientOptions.cs + Authentication/ + Conversion/ + Errors/ + Generated/ + MxGateway.Client.Cli/ + MxGateway.Client.Cli.csproj + Program.cs + Commands/ + MxGateway.Client.Tests/ + MxGateway.Client.Tests.csproj + MxGateway.Client.IntegrationTests/ + MxGateway.Client.IntegrationTests.csproj +``` + +Target framework: + +```xml +net10.0 +``` + +Expected packages: + +- `Grpc.Net.Client` +- `Google.Protobuf` +- `Grpc.Tools` for generation +- `Microsoft.Extensions.Logging.Abstractions` +- `System.CommandLine` or similar for CLI +- test framework: xUnit or NUnit + +## Library API + +Suggested public types: + +```csharp +public sealed class MxGatewayClient : IAsyncDisposable +{ + public static MxGatewayClient Create(MxGatewayClientOptions options); + public Task OpenSessionAsync( + OpenSessionOptions? options = null, + CancellationToken cancellationToken = default); + public Task InvokeAsync( + MxCommandRequest request, + CancellationToken cancellationToken = default); +} + +public sealed class MxGatewaySession : IAsyncDisposable +{ + public string SessionId { get; } + + public Task RegisterAsync(string clientName, CancellationToken ct = default); + public Task UnregisterAsync(int serverHandle, CancellationToken ct = default); + public Task AddItemAsync(int serverHandle, string item, CancellationToken ct = default); + public Task AddItem2Async(int serverHandle, string item, string context, CancellationToken ct = default); + public Task AdviseAsync(int serverHandle, int itemHandle, CancellationToken ct = default); + public Task UnAdviseAsync(int serverHandle, int itemHandle, CancellationToken ct = default); + public Task WriteAsync(int serverHandle, int itemHandle, MxValue value, int userId, CancellationToken ct = default); + public IAsyncEnumerable StreamEventsAsync(CancellationToken ct = default); + public Task 
CloseAsync(CancellationToken ct = default); +} +``` + +Generated protobuf types should remain available under a generated namespace. +Handwritten wrappers should not hide raw replies. + +## Options + +```csharp +public sealed class MxGatewayClientOptions +{ + public required Uri Endpoint { get; init; } + public required string ApiKey { get; init; } + public bool UseTls { get; init; } + public string? CaCertificatePath { get; init; } + public string? ServerNameOverride { get; init; } + public TimeSpan ConnectTimeout { get; init; } = TimeSpan.FromSeconds(10); + public TimeSpan DefaultCallTimeout { get; init; } = TimeSpan.FromSeconds(30); + public ILoggerFactory? LoggerFactory { get; init; } +} +``` + +API key may be loaded from `MXGATEWAY_API_KEY` by the CLI, not implicitly by the +library constructor unless a helper explicitly says it does that. + +## Auth Interceptor + +Use a gRPC call credentials/interceptor layer to attach: + +```text +authorization: Bearer +``` + +The interceptor must redact the key in logs and exceptions. + +## Streaming + +Expose `StreamEventsAsync` as `IAsyncEnumerable`. On cancellation, +cancel the gRPC stream and surface `OperationCanceledException` only when the +caller initiated cancellation. + +Do not reorder events. + +## Error Handling + +Recommended exceptions: + +```csharp +MxGatewayException +MxGatewayAuthenticationException +MxGatewayAuthorizationException +MxGatewaySessionException +MxGatewayWorkerException +MxGatewayCommandException +MxAccessException +``` + +For command replies that include MXAccess HRESULT/status, prefer returning the +reply and exposing helper methods: + +```csharp +reply.EnsureProtocolSuccess(); +reply.EnsureMxAccessSuccess(); +``` + +## Test CLI + +Project: `MxGateway.Client.Cli`. 
+ +Command examples: + +```powershell +mxgw-dotnet version +mxgw-dotnet smoke --endpoint http://localhost:5000 --api-key-env MXGATEWAY_API_KEY --item TestChildObject.TestInt +mxgw-dotnet stream-events --session-id --json +mxgw-dotnet write --session-id --server-handle 1 --item-handle 1 --type int32 --value 123 +``` + +The CLI should use `System.CommandLine` or a similarly testable parser. JSON +output should be deterministic and redact API keys. + +## Unit Tests + +Use an in-process fake gRPC service with `Grpc.AspNetCore.Server` test host or +mock the generated client behind an internal interface. + +Required tests: + +- auth metadata is attached, +- API key is redacted, +- options build plaintext and TLS channels correctly, +- `RegisterAsync` builds the right command payload, +- `AddItem2Async` includes context, +- `WriteAsync` converts scalar and array values, +- command reply status helpers preserve MXAccess HRESULT, +- `StreamEventsAsync` yields ordered events, +- stream cancellation disposes the call, +- CLI parsing and JSON output. + +## Integration Tests + +Use xUnit traits or categories. Skip unless: + +```text +MXGATEWAY_INTEGRATION=1 +MXGATEWAY_ENDPOINT= +MXGATEWAY_API_KEY= +MXGATEWAY_TEST_ITEM= +``` + +Integration smoke should open, register, add, advise, stream for bounded time, +and close. diff --git a/docs/clients-golang-design.md b/docs/clients-golang-design.md new file mode 100644 index 0000000..f4d07b6 --- /dev/null +++ b/docs/clients-golang-design.md @@ -0,0 +1,172 @@ +# Go Client Detailed Design + +## Purpose + +Provide an idiomatic Go client module for MXAccess Gateway, plus a test CLI and +unit tests. The Go client should be suitable for services and command-line +automation. + +Follow the [Go Style Guide](./style-guides/GoStyleGuide.md) for handwritten +code and the [Protobuf Style Guide](./style-guides/ProtobufStyleGuide.md) for +generated contract inputs. 
+ +## Module Layout + +Recommended layout: + +```text +clients/go/ + go.mod + mxgateway/ + client.go + session.go + options.go + auth.go + values.go + errors.go + internal/generated/ + mxaccess_gateway.pb.go + mxaccess_gateway_grpc.pb.go + cmd/mxgw-go/ + main.go + tests/ +``` + +Generated code should come from `protoc` plus: + +- `protoc-gen-go` +- `protoc-gen-go-grpc` + +## Library API + +Suggested API: + +```go +type Client struct { + // owns grpc.ClientConn +} + +type Options struct { + Endpoint string + APIKey string + Plaintext bool + CACertFile string + ServerNameOverride string + DialTimeout time.Duration + CallTimeout time.Duration +} + +func Dial(ctx context.Context, opts Options) (*Client, error) +func (c *Client) OpenSession(ctx context.Context, opts OpenSessionOptions) (*Session, error) +func (c *Client) Invoke(ctx context.Context, req *pb.MxCommandRequest) (*pb.MxCommandReply, error) +func (c *Client) Close() error +``` + +Session: + +```go +type Session struct { + ID string +} + +func (s *Session) Register(ctx context.Context, clientName string) (int32, error) +func (s *Session) Unregister(ctx context.Context, serverHandle int32) error +func (s *Session) AddItem(ctx context.Context, serverHandle int32, item string) (int32, error) +func (s *Session) AddItem2(ctx context.Context, serverHandle int32, item, context string) (int32, error) +func (s *Session) Advise(ctx context.Context, serverHandle, itemHandle int32) error +func (s *Session) Write(ctx context.Context, serverHandle, itemHandle int32, value Value, userID int32) error +func (s *Session) Events(ctx context.Context) (<-chan EventResult, error) +func (s *Session) Close(ctx context.Context) error +``` + +## Authentication + +Use a unary and stream interceptor to attach: + +```text +authorization: Bearer +``` + +The interceptor should use `metadata.AppendToOutgoingContext` or call options. +Do not print API keys in errors. 
+ +## TLS + +Support: + +- `credentials/insecure` for local plaintext, +- `credentials.NewClientTLSFromFile`, +- custom `tls.Config` for advanced callers. + +## Streaming + +`Events(ctx)` should return a receive channel of: + +```go +type EventResult struct { + Event *pb.MxEvent + Err error +} +``` + +The receive goroutine exits on stream end, context cancellation, or error. The +channel should be closed exactly once. Do not reorder events. + +## Error Handling + +Expose typed errors: + +```go +type GatewayError struct { ... } +type CommandError struct { ... } +type MxAccessError struct { ... } +``` + +Typed errors should support `errors.Is` / `errors.As`. Preserve raw protobuf +replies on command errors. + +## Test CLI + +Binary: `mxgw-go`. + +Recommended commands: + +```text +mxgw-go version +mxgw-go smoke --endpoint localhost:5000 --api-key-env MXGATEWAY_API_KEY --plaintext --item TestChildObject.TestInt +mxgw-go write --session-id --server-handle 1 --item-handle 1 --type int32 --value 123 +mxgw-go stream-events --session-id --json +``` + +Recommended CLI library: + +- standard `flag` for minimalism, or +- Cobra if subcommand ergonomics matter. + +## Unit Tests + +Use `bufconn` for in-memory gRPC tests. + +Required tests: + +- auth interceptor on unary calls, +- auth interceptor on streaming calls, +- plaintext and TLS dial options, +- command helper request construction, +- value conversion, +- status conversion, +- typed error wrapping, +- stream channel closes on cancellation, +- late stream error propagation, +- CLI JSON redaction. + +## Integration Tests + +Use Go build tags or an environment-variable skip: + +```text +MXGATEWAY_INTEGRATION=1 +``` + +The integration test should run `OpenSession`, `Register`, `AddItem`, `Advise`, +bounded `StreamEvents`, and `CloseSession`.
diff --git a/docs/clients-java-design.md b/docs/clients-java-design.md new file mode 100644 index 0000000..f5872d6 --- /dev/null +++ b/docs/clients-java-design.md @@ -0,0 +1,191 @@ +# Java Client Detailed Design + +## Purpose + +Provide a Java client library for MXAccess Gateway, plus a test CLI and unit +tests. The Java client should work for JVM services and operator tooling. + +Follow the [Java Style Guide](./style-guides/JavaStyleGuide.md) for handwritten +code and the [Protobuf Style Guide](./style-guides/ProtobufStyleGuide.md) for +generated contract inputs. + +## Build Layout + +Recommended Gradle multi-project layout: + +```text +clients/java/ + settings.gradle + build.gradle + mxgateway-client/ + build.gradle + src/main/java/com/dohertylan/mxgateway/client/ + src/test/java/com/dohertylan/mxgateway/client/ + mxgateway-cli/ + build.gradle + src/main/java/com/dohertylan/mxgateway/cli/ +``` + +Alternative Maven layout is acceptable if the repo standardizes on Maven. + +Target Java: + +- Java 21 recommended. 
+ +Expected dependencies: + +- `grpc-netty-shaded` +- `grpc-protobuf` +- `grpc-stub` +- `protobuf-java` +- `picocli` +- `junit-jupiter` +- `mockito` if needed + +## Library API + +Suggested API: + +```java +public final class MxGatewayClient implements AutoCloseable { + public static MxGatewayClient connect(MxGatewayClientOptions options); + public MxGatewaySession openSession(OpenSessionOptions options); + public MxCommandReply invoke(MxCommandRequest request); + public CompletableFuture invokeAsync(MxCommandRequest request); + public void close(); +} + +public final class MxGatewaySession implements AutoCloseable { + public String sessionId(); + public int register(String clientName); + public void unregister(int serverHandle); + public int addItem(int serverHandle, String item); + public int addItem2(int serverHandle, String item, String context); + public void advise(int serverHandle, int itemHandle); + public void write(int serverHandle, int itemHandle, MxValue value, int userId); + public Iterator streamEvents(); + public void streamEventsAsync(StreamObserver observer); + public void close(); +} +``` + +Expose generated protobuf classes for callers that need raw access. + +## Options + +```java +public final class MxGatewayClientOptions { + URI endpoint; + String apiKey; + boolean plaintext; + Path caCertificatePath; + String serverNameOverride; + Duration connectTimeout; + Duration callTimeout; +} +``` + +## Authentication + +Use a gRPC `ClientInterceptor` to attach: + +```text +authorization: Bearer +``` + +Redact API keys in `toString`, logs, and CLI output. + +## TLS + +Support: + +- plaintext for local development, +- TLS with default JVM trust store, +- custom CA certificate file, +- server name override for test environments. + +## Streaming + +Support both: + +- blocking iterator for simple CLIs, +- async `StreamObserver` for services. + +Do not reorder events. Stream cancellation should call `ClientCall.cancel`. 
+ +## Error Handling + +Recommended exceptions: + +```java +MxGatewayException +MxGatewayAuthenticationException +MxGatewayAuthorizationException +MxGatewaySessionException +MxGatewayWorkerException +MxGatewayCommandException +MxAccessException +``` + +`MxGatewayCommandException` should carry the raw command reply when available. + +## Test CLI + +Binary wrapper name: + +```text +mxgw-java +``` + +Use `picocli`. + +Commands: + +```text +mxgw-java version +mxgw-java smoke --endpoint localhost:5000 --api-key-env MXGATEWAY_API_KEY --plaintext --item TestChildObject.TestInt +mxgw-java stream-events --session-id --json +mxgw-java write --session-id --server-handle 1 --item-handle 1 --type int32 --value 123 +``` + +JSON output can use Jackson or protobuf JSON formatting. Keep it deterministic. + +## Unit Tests + +Use JUnit 5. + +Use `InProcessServerBuilder` and `InProcessChannelBuilder` for fake gRPC tests. + +Required tests: + +- auth interceptor attaches metadata, +- key redaction, +- plaintext and TLS channel setup, +- request construction helpers, +- value conversion, +- status/error mapping, +- blocking event stream iteration, +- async stream observer cancellation, +- CLI parsing, +- JSON output. + +## Integration Tests + +Skip unless: + +```text +MXGATEWAY_INTEGRATION=1 +``` + +Use JUnit assumptions. Integration flow should open, register, add, advise, +stream for bounded time, and close. + +## Packaging + +Publish library and CLI separately: + +- `mxgateway-client` jar, +- `mxgateway-cli` runnable distribution. + +Generated protobuf code should be produced during the build from shared proto +files and should not be hand-edited. diff --git a/docs/clients-python-design.md b/docs/clients-python-design.md new file mode 100644 index 0000000..3722477 --- /dev/null +++ b/docs/clients-python-design.md @@ -0,0 +1,191 @@ +# Python Client Detailed Design + +## Purpose + +Provide an async Python client package for MXAccess Gateway, plus a test CLI and +unit tests. 
The Python client should be useful for automation, diagnostics, and +test harnesses. + +Follow the [Python Style Guide](./style-guides/PythonStyleGuide.md) for +handwritten code and the [Protobuf Style Guide](./style-guides/ProtobufStyleGuide.md) +for generated contract inputs. + +## Package Layout + +Recommended layout: + +```text +clients/python/ + pyproject.toml + src/mxgateway/ + __init__.py + client.py + session.py + options.py + auth.py + values.py + errors.py + generated/ + src/mxgateway_cli/ + __main__.py + commands.py + tests/ +``` + +Expected dependencies: + +- `grpcio` +- `grpcio-tools` +- `protobuf` +- `click` or `typer` +- `pytest` +- `pytest-asyncio` + +## Library API + +Use async-first API. A sync wrapper can be added later if needed. + +Suggested API: + +```python +client = await GatewayClient.connect( + endpoint="localhost:5000", + api_key=api_key, + plaintext=True, +) + +session = await client.open_session() +server = await session.register("python-client") +item = await session.add_item(server, "TestChildObject.TestInt") +await session.advise(server, item) + +async for event in session.stream_events(): + ... + +await session.close() +await client.close() +``` + +Classes: + +```python +class GatewayClient: + @classmethod + async def connect(cls, options: ClientOptions) -> "GatewayClient": ... + async def open_session(self, options: OpenSessionOptions | None = None) -> "Session": ... + async def invoke(self, request: MxCommandRequest) -> MxCommandReply: ... + async def close(self) -> None: ... + +class Session: + session_id: str + async def register(self, client_name: str) -> int: ... + async def add_item(self, server_handle: int, item: str) -> int: ... + async def add_item2(self, server_handle: int, item: str, context: str) -> int: ... + async def advise(self, server_handle: int, item_handle: int) -> None: ... + async def write(self, server_handle: int, item_handle: int, value: MxValueInput, user_id: int = 0) -> None: ... 
+ async def stream_events(self) -> AsyncIterator[MxEvent]: ... + async def close(self) -> None: ... +``` + +## Authentication + +Use gRPC metadata: + +```python +metadata = (("authorization", f"Bearer {api_key}"),) +``` + +Provide a metadata helper that all unary and streaming calls use. Redact API +keys in exceptions and CLI output. + +## TLS + +Support: + +- insecure channel for local development, +- TLS channel with default roots, +- custom root certificate file. + +## Streaming + +Expose `stream_events` as an async iterator. Canceling the task should cancel +the gRPC stream. + +Do not hide stream errors. Convert common auth/session errors into typed +exceptions. + +## Error Handling + +Define typed exceptions: + +```python +MxGatewayError +MxGatewayTransportError +MxGatewayAuthenticationError +MxGatewayAuthorizationError +MxGatewaySessionError +MxGatewayWorkerError +MxGatewayCommandError +MxAccessError +``` + +`MxGatewayCommandError` should include the raw protobuf reply when available. + +## Test CLI + +Entry point: + +```text +mxgw-py +``` + +Recommended commands: + +```text +mxgw-py version +mxgw-py smoke --endpoint localhost:5000 --api-key-env MXGATEWAY_API_KEY --plaintext --item TestChildObject.TestInt +mxgw-py stream-events --session-id --json +mxgw-py write --session-id --server-handle 1 --item-handle 1 --type int32 --value 123 +``` + +Use `click` or `typer`. JSON output should be stable for test automation. + +## Unit Tests + +Use `pytest` and `pytest-asyncio`. + +Use fake generated stubs or an in-process test gRPC server where practical. + +Required tests: + +- API key metadata injection, +- API key redaction, +- insecure and TLS channel option construction, +- request construction for method helpers, +- value conversion from Python values, +- status/error mapping, +- async event iteration, +- stream cancellation, +- CLI parsing, +- JSON output. 
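The metadata-injection and redaction tests listed above could be exercised against a small helper like the one sketched here. This is a minimal sketch, not the package's actual API: `ApiKeyAuth`, `redact_text`, and the `<redacted>` placeholder are hypothetical names chosen for illustration, and the only detail taken from this design is the `authorization: Bearer ...` metadata pair.

```python
REDACTED = "<redacted>"  # hypothetical placeholder text, not specified by the design


class ApiKeyAuth:
    """Hypothetical helper that builds call metadata and never prints the key."""

    def __init__(self, api_key: str) -> None:
        if not api_key:
            raise ValueError("api_key must not be empty")
        self._api_key = api_key

    def metadata(self) -> tuple[tuple[str, str], ...]:
        # Same shape as the metadata tuple shown in the Authentication section.
        return (("authorization", f"Bearer {self._api_key}"),)

    def __repr__(self) -> str:
        # The raw key must not leak through repr(), logs, or tracebacks.
        return f"ApiKeyAuth(api_key={REDACTED!r})"


def redact_text(text: str, api_key: str) -> str:
    """Replace any occurrence of the raw key in an error or CLI message."""
    if not api_key:
        return text
    return text.replace(api_key, REDACTED)
```

Routing every unary and streaming call through one such helper keeps the redaction rule in a single place, which is what the "API key metadata injection" and "API key redaction" tests would pin down.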
+ +## Integration Tests + +Skip unless: + +```text +MXGATEWAY_INTEGRATION=1 +``` + +Use bounded smoke flow and always attempt `close_session` in `finally`. + +## Packaging + +Use `pyproject.toml`. Publishable package name should be stable, for example: + +```text +mxaccess-gateway-client +``` + +Generated protobuf code should be regenerated through a documented command, not +edited by hand. diff --git a/docs/clients-rust-design.md b/docs/clients-rust-design.md new file mode 100644 index 0000000..4587ef3 --- /dev/null +++ b/docs/clients-rust-design.md @@ -0,0 +1,183 @@ +# Rust Client Detailed Design + +## Purpose + +Provide an async Rust client crate for MXAccess Gateway, plus a test CLI and +unit tests. The Rust client should use `tonic` and `tokio`. + +Follow the [Rust Style Guide](./style-guides/RustStyleGuide.md) for handwritten +code and the [Protobuf Style Guide](./style-guides/ProtobufStyleGuide.md) for +generated contract inputs. + +## Crate Layout + +Recommended layout: + +```text +clients/rust/ + Cargo.toml + build.rs + crates/ + mxgateway-client/ + src/lib.rs + src/client.rs + src/session.rs + src/options.rs + src/auth.rs + src/value.rs + src/error.rs + src/generated/ + mxgw-cli/ + src/main.rs + tests/ +``` + +Expected dependencies: + +- `tonic` +- `prost` +- `prost-types` +- `tokio` +- `tokio-stream` +- `thiserror` +- `clap` +- `serde` +- `serde_json` +- `tracing` + +## Library API + +Suggested API: + +```rust +pub struct GatewayClient { /* tonic channel + generated client */ } + +pub struct ClientOptions { + pub endpoint: String, + pub api_key: String, + pub plaintext: bool, + pub ca_file: Option, + pub server_name_override: Option, + pub connect_timeout: Duration, + pub call_timeout: Duration, +} + +impl GatewayClient { + pub async fn connect(options: ClientOptions) -> Result; + pub async fn open_session(&self, options: OpenSessionOptions) -> Result; + pub async fn invoke(&self, request: MxCommandRequest) -> Result; +} +``` + +Session: + +```rust +pub 
struct Session { + pub id: String, +} + +impl Session { + pub async fn register(&self, client_name: &str) -> Result; + pub async fn add_item(&self, server_handle: i32, item: &str) -> Result; + pub async fn add_item2(&self, server_handle: i32, item: &str, context: &str) -> Result; + pub async fn advise(&self, server_handle: i32, item_handle: i32) -> Result<(), Error>; + pub async fn write(&self, server_handle: i32, item_handle: i32, value: MxValue, user_id: i32) -> Result<(), Error>; + pub async fn events(&self) -> Result>, Error>; + pub async fn close(&self) -> Result<(), Error>; +} +``` + +## Authentication + +Use a `tonic` interceptor or request extension layer to add: + +```text +authorization: Bearer +``` + +Use `SecretString` or equivalent if a dependency is acceptable. Always redact +API keys in `Debug` output. + +## TLS + +Support: + +- plaintext channel for local development, +- native or rustls TLS depending on project preference, +- custom CA file, +- domain override. + +## Streaming + +Expose event streams as a `Stream>`. Dropping the +stream should cancel the underlying gRPC stream. + +Do not buffer unboundedly in the client. If a helper channel is used, make it +bounded. + +## Error Handling + +Use `thiserror`: + +```rust +pub enum Error { + Transport(tonic::transport::Error), + Status(tonic::Status), + Authentication(String), + Authorization(String), + Session(SessionError), + Worker(WorkerError), + Command(CommandError), + MxAccess(MxAccessError), + Timeout, + Cancelled, +} +``` + +Preserve raw command replies in `CommandError` where applicable. + +## Test CLI + +Binary: `mxgw`. + +Use `clap` derive. + +Commands: + +```text +mxgw version +mxgw smoke --endpoint http://localhost:5000 --api-key-env MXGATEWAY_API_KEY --item TestChildObject.TestInt +mxgw stream-events --session-id --json +mxgw write --session-id --server-handle 1 --item-handle 1 --type int32 --value 123 +``` + +JSON output should use `serde_json`. 
+ +## Unit Tests + +Use a fake `tonic` server started on a local ephemeral port, or abstract the +generated client behind a trait for unit tests. + +Required tests: + +- generated client compiles from proto, +- auth metadata injection, +- TLS/plaintext endpoint construction, +- value conversion, +- command request construction, +- error mapping from `tonic::Status`, +- event stream order, +- stream cancellation, +- CLI parsing, +- JSON redaction. + +## Integration Tests + +Skip unless: + +```text +MXGATEWAY_INTEGRATION=1 +``` + +Use `tokio::test`. Run a bounded smoke flow and always attempt `CloseSession` +explicitly. Document any `Drop` fallback, but do not rely on `Drop` for async close. diff --git a/docs/design-decisions.md b/docs/design-decisions.md new file mode 100644 index 0000000..01e78bc --- /dev/null +++ b/docs/design-decisions.md @@ -0,0 +1,309 @@ +# Design Decisions + +This document records current v1 choices for the MXAccess gateway design. These +decisions can change, but implementation should follow them until a later design +update says otherwise. + +## Source References + +Use these local analysis sources when answering MXAccess-specific design or +implementation questions: + +```text +C:\Users\dohertj2\Desktop\mxaccess +C:\Users\dohertj2\Desktop\mxaccess\docs\MXAccess-Public-API.md +C:\Users\dohertj2\Desktop\mxaccess\docs\MXAccess-Reverse-Engineering.md +``` + +Use these local notes for Galaxy Repository SQL metadata: + +```text +C:\Users\dohertj2\Desktop\lmxopcua\gr +``` + +## MXAccess COM Target + +Decision: target the installed MXAccess COM interop surface directly from the +x86 worker.
+ +Concrete COM details from the MXAccess analysis: + +- Interop assembly: + `C:\Program Files (x86)\ArchestrA\Framework\Bin\ArchestrA.MXAccess.dll` +- Assembly identity: + `ArchestrA.MxAccess, Version=3.2.0.0, PublicKeyToken=23106a86e706d0ae` +- COM class: + `ArchestrA.MxAccess.LMXProxyServerClass` +- CLSID: + `{C30B52F5-2CB5-4760-AF0A-3A344A7EB5DC}` +- ProgID: + `LMXProxy.LMXProxyServer.1` +- Version-independent ProgID: + `LMXProxy.LMXProxyServer` +- Registered server: + `C:\Program Files (x86)\ArchestrA\Framework\Bin\LmxProxy.dll` +- Registry view: + `HKCR\Wow6432Node\CLSID\{C30B52F5-2CB5-4760-AF0A-3A344A7EB5DC}` +- Threading model: + `Apartment` + +Rationale: `LMXProxyServer` is a 32-bit in-process COM server, so a .NET 10 x64 +gateway cannot instantiate it directly. The x86 sidecar worker is the reliable +parity path. + +Implementation guidance: + +- Worker should reference `ArchestrA.MXAccess.dll`. +- Worker should instantiate `new LMXProxyServerClass()` on the dedicated STA. +- Worker should expose the resolved class, ProgID, CLSID, interop assembly + version, and `LmxProxy.dll` path through `GetWorkerInfo` / `WorkerReady`. +- Keep the ProgID/path configurable for diagnostics, but the default should be + the installed MXAccess class above. + +## Session Reconnect + +Decision: no reconnectable sessions for v1. + +One `OpenSession` creates one gateway session and one worker process. The +session ends on `CloseSession`, client disconnect policy, lease expiry, worker +fault, or gateway shutdown. + +Rationale: reconnectable sessions require event replay, orphan ownership, +security checks, and more complicated worker lifetime rules. They are not needed +for the first parity slice. + +## Event Subscribers + +Decision: one active `StreamEvents` subscriber per session for v1. + +A second subscriber should be rejected with a clear session error. Multi-client +fan-out may be added later with explicit backpressure semantics. 
+ +Rationale: one subscriber preserves simple event ordering and failure behavior +while parity is being proven. + +## Authentication + +Decision: API key authentication for the public gateway. + +API keys are stored in a gateway-owned SQLite database. Store hashed API key +secrets only; never store raw key material. + +Recommended client format: + +```text +authorization: Bearer mxgw__ +``` + +Recommended SQLite tables: + +```sql +CREATE TABLE api_keys ( + key_id TEXT PRIMARY KEY, + key_prefix TEXT NOT NULL, + secret_hash BLOB NOT NULL, + display_name TEXT NOT NULL, + scopes TEXT NOT NULL, + created_utc TEXT NOT NULL, + last_used_utc TEXT NULL, + revoked_utc TEXT NULL +); + +CREATE TABLE api_key_audit ( + audit_id INTEGER PRIMARY KEY AUTOINCREMENT, + key_id TEXT NULL, + event_type TEXT NOT NULL, + remote_address TEXT NULL, + created_utc TEXT NOT NULL, + details TEXT NULL +); +``` + +Recommended scopes: + +- `session:open` +- `session:close` +- `invoke:read` +- `invoke:write` +- `invoke:secure` +- `events:read` +- `metadata:read` +- `admin` + +Hashing recommendation: + +- Use HMAC-SHA256 with a gateway-local secret/pepper stored outside SQLite, or + use Argon2id if a suitable dependency is already accepted. +- Compare hashes using constant-time comparison. +- Log only the key id or prefix, not the raw key. + +Storage recommendation: + +- Default SQLite path should be under `ProgramData` or another configured + gateway data directory. +- Apply restrictive filesystem ACLs for the gateway service identity and + administrators. +- Require TLS when the gateway is reachable off-machine. + +## Authorization + +Decision: start with scope checks by command category. 
+ +Suggested mapping: + +- `OpenSession`: `session:open` +- `CloseSession`: `session:close` +- `Register`, `Unregister`, `AddItem`, `AddItem2`, `RemoveItem`, `Advise`, + `UnAdvise`, `AdviseSupervisory`, `AddBufferedItem`, + `SetBufferedUpdateInterval`, `Suspend`, `Activate`: `invoke:read` +- `Write`, `Write2`: `invoke:write` +- `WriteSecured`, `WriteSecured2`, `AuthenticateUser`, + `ArchestrAUserToId`: `invoke:secure` +- `StreamEvents`: `events:read` +- Galaxy SQL metadata endpoints if added: `metadata:read` +- worker shutdown diagnostics and key management: `admin` + +## Worker Process Identity + +Decision: run workers as the gateway service identity for v1. + +Rationale: this avoids early COM/DCOM permission failures and keeps the first +implementation focused on MXAccess parity. The worker launcher should keep an +extension point for a restricted service account later. + +## Event Backpressure + +Decision: fail-fast bounded queues for v1 and parity testing. + +If worker or gateway event queues fill, fault the session. Do not silently drop +or coalesce events in parity mode. + +Rationale: event drops would hide parity defects. Production coalescing by item +handle can be added later as an explicit opt-in mode once event rates are +measured. + +## Event-Rate Target + +Decision: do not set a production event-rate target before measurement. + +For v1, expose queue depth, event rate, stream send latency, and overflow +metrics. Keep bounded queues and fail-fast behavior. Use observed load from live +systems to set a later coalescing or scaling target. + +## Command Batching + +Decision: no public command batching for v1. + +Use one command per request so replies, HRESULTs, status arrays, event ordering, +and failure behavior are easy to compare against direct MXAccess. + +Batch tag registration can be added later if measured setup latency requires it. + +## Graceful Worker Shutdown + +Decision: best-effort cleanup before COM release. 
+ +During graceful shutdown, the worker should attempt: + +1. `UnAdvise` for advised items. +2. `RemoveItem` for active item handles. +3. `Unregister` for active server handles. +4. Event detach. +5. COM release. + +Failures during cleanup should be logged and preserved diagnostically, but the +gateway may still kill the worker after shutdown timeout. + +## OperationComplete + +Decision: model and forward `OperationComplete` only when native MXAccess fires +it. Do not synthesize `OperationComplete` from writes, command replies, ASB +completion queues, or other status frames. + +Rationale: the event signature is known, but the MXAccess analysis has not yet +captured the runtime condition that triggers the public event. Synthesizing it +would risk breaking parity. + +## Buffered Data Change + +Decision: include `OnBufferedDataChange` in the protocol and worker event +model, but treat multi-sample payload conversion as capture-validated work. + +The event signature and native path are known. A live buffered sample batch has +not yet been observed. Until then, preserve raw value, quality, timestamp, data +type, and status metadata whenever conversion is incomplete. + +## Completion-Only Status Mapping + +Decision: preserve completion-only operation-status bytes as raw diagnostic +metadata unless native MXAccess raises a public event or the MXAccess analysis +proves an exact `MXSTATUS_PROXY[]` mapping. + +Do not guess status category/source/detail values for frames that MXAccess does +not expose through its public COM events. + +## API Key Administration + +Decision: v1 API key management is a local administrative CLI/tool, not a +public admin API. + +The tool should support: + +- initialize auth database, +- create key, +- list keys without showing secrets, +- revoke key, +- rotate key, +- print the raw secret exactly once at creation. + +Public gRPC key-management endpoints can be added later only behind `admin` +scope and TLS. 
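The create-key flow above, together with the HMAC-SHA256 and constant-time comparison recommendations under Authentication, can be sketched roughly as follows. This is an illustrative Python sketch only; the administrative tool itself would presumably be written in .NET. The `mxgw_<key_id>_<secret>` layout, the function names, and the key lengths are assumptions made for the example, since the exact key format is not spelled out here.

```python
import hashlib
import hmac
import secrets


def hash_secret(secret: str, pepper: bytes) -> bytes:
    """HMAC-SHA256 of the secret, keyed by a pepper stored outside SQLite."""
    return hmac.new(pepper, secret.encode("utf-8"), hashlib.sha256).digest()


def create_api_key(pepper: bytes) -> tuple[str, str, bytes]:
    """Return (key_id, raw_key, secret_hash).

    Only key_id and secret_hash would be persisted; raw_key is shown to the
    operator once. The 'mxgw_<key_id>_<secret>' layout is an assumption.
    """
    key_id = secrets.token_hex(8)        # hex, so it never contains '_'
    secret = secrets.token_urlsafe(32)   # may contain '_' and '-'
    raw_key = f"mxgw_{key_id}_{secret}"
    return key_id, raw_key, hash_secret(secret, pepper)


def verify_api_key(raw_key: str, stored_key_id: str,
                   stored_hash: bytes, pepper: bytes) -> bool:
    """Check a presented key against the stored id and hash."""
    parts = raw_key.split("_", 2)        # the secret itself may contain '_'
    if len(parts) != 3 or parts[0] != "mxgw":
        return False
    _, key_id, secret = parts
    id_ok = hmac.compare_digest(key_id.encode("utf-8"),
                                stored_key_id.encode("utf-8"))
    hash_ok = hmac.compare_digest(hash_secret(secret, pepper), stored_hash)
    return id_ok and hash_ok
```

A real lookup would first select the row by `key_id` and check `revoked_utc` before comparing hashes; the sketch leaves storage out to stay focused on hashing and comparison.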
+ +## SQLite Migrations + +Decision: use simple startup migrations with a `schema_version` table. + +Recommended table: + +```sql +CREATE TABLE schema_version ( + id INTEGER PRIMARY KEY CHECK (id = 1), + version INTEGER NOT NULL, + applied_utc TEXT NOT NULL +); +``` + +Migrations should be idempotent, run inside transactions, and fail gateway +startup if the database is newer than the running binary understands. + +## Web Dashboard + +Decision: host a basic gateway dashboard with Blazor Server and Bootstrap +CSS/JS. + +The dashboard should show gateway health, active sessions, worker instances, +basic metrics, queue depths, and recent faults. It should update in real time +through Blazor Server component updates. + +Allowed UI stack: + +- Blazor Server, +- Bootstrap CSS, +- Bootstrap JavaScript, +- small local CSS. + +Do not use MudBlazor or other Blazor UI component libraries for v1. + +Dashboard access should require API-key-backed dashboard authentication with +`admin` scope when enabled. For local development, anonymous localhost access +may exist only behind an explicit configuration option that defaults to false. + +## Later Revisit Items + +These are explicit post-v1 revisit items, not open blockers: + +- reconnectable sessions, +- multiple event subscribers per session, +- restricted worker service account, +- production coalescing by item handle, +- command batching for high-volume tag setup. diff --git a/docs/gateway-dashboard-design.md b/docs/gateway-dashboard-design.md new file mode 100644 index 0000000..d88d587 --- /dev/null +++ b/docs/gateway-dashboard-design.md @@ -0,0 +1,364 @@ +# Gateway Dashboard Detailed Design + +## Purpose + +The gateway should host a basic web dashboard for operators and developers. The +dashboard is diagnostic and operational visibility only for v1. It should show +gateway health, active MXAccess worker instances, session state, and basic +statistics in real time. 
+ +## Technology Choice + +Decision: Blazor Server with Bootstrap CSS/JS. + +Allowed UI stack: + +- ASP.NET Core Blazor Server, +- Bootstrap CSS, +- Bootstrap JavaScript, +- small local CSS for layout and status styling, +- built-in Blazor components. + +Not allowed for v1: + +- MudBlazor, +- Radzen, +- Syncfusion, +- Telerik, +- other Blazor UI component libraries, +- client-side SPA framework replacement. + +Rationale: Blazor Server keeps the dashboard in the gateway process, avoids a +separate frontend build, and gives real-time UI updates through the Blazor +SignalR circuit. Bootstrap is sufficient for a basic dashboard. + +## Hosting Model + +The dashboard is hosted by `MxGateway.Server` alongside the gRPC API. + +Suggested endpoint layout: + +```text +/dashboard +/dashboard/sessions +/dashboard/sessions/{sessionId} +/dashboard/workers +/dashboard/events +/dashboard/settings +/_blazor +``` + +The app should redirect `/` to `/dashboard` only if the deployment wants the +dashboard as the default web page. Otherwise leave gRPC/API hosting unaffected. + +## High-Level Components + +```text +MxGateway.Server + Dashboard/ + Components/ + App.razor + Routes.razor + Layout/ + DashboardLayout.razor + NavMenu.razor + Pages/ + DashboardHome.razor + SessionsPage.razor + SessionDetailsPage.razor + WorkersPage.razor + EventsPage.razor + SettingsPage.razor + Components/ + MetricCard.razor + SessionTable.razor + WorkerTable.razor + EventRatePanel.razor + FaultList.razor + Services/ + DashboardSnapshotService.cs + DashboardUpdateHub.cs + DashboardAuthorization.cs + Models/ + DashboardSnapshot.cs + SessionSummary.cs + WorkerSummary.cs + MetricSummary.cs +``` + +`DashboardUpdateHub` here means an internal application update service, not a +separate public SignalR hub unless implementation proves one is needed. Blazor +Server already uses SignalR for UI circuits. 
+ +## Dashboard Data Source + +The dashboard should consume read-only snapshots from gateway services: + +- `SessionRegistry`, +- `SessionManager`, +- `WorkerClient`, +- `GatewayMetrics`, +- health checks, +- structured fault/event counters. + +Do not let Razor components directly mutate gateway session or worker objects. +Create a small read-only dashboard service that projects gateway state into +plain DTOs. + +Suggested service: + +```csharp +public interface IDashboardSnapshotService +{ + DashboardSnapshot GetSnapshot(); + IAsyncEnumerable<DashboardSnapshot> WatchSnapshotsAsync( + CancellationToken cancellationToken); +} +``` + +Snapshot updates can be driven by: + +- periodic timer, default every 1 second, +- session lifecycle notifications, +- worker heartbeat updates, +- event counter updates, +- fault notifications. + +Use immutable snapshot DTOs so Razor components can render without locking +gateway internals. + +## Realtime Updates + +Use Blazor Server component state updates for real-time dashboard refresh. + +Recommended pattern: + +1. Page/component subscribes to `WatchSnapshotsAsync`. +2. Snapshot service emits updates from a bounded channel or timer. +3. Component stores the latest snapshot. +4. Component calls `InvokeAsync(StateHasChanged)`. +5. Component cancels subscription on dispose. + +Default update cadence: + +- immediate update on session create/close/fault, +- immediate update on worker fault, +- periodic metrics refresh every 1 second, +- event-rate windows updated every 1 second. + +Avoid pushing every MXAccess data-change event to the dashboard. Aggregate event +counts and rates instead. + +## Pages + +### Dashboard Home + +Show top-level status: + +- gateway status, +- gateway version, +- uptime, +- open sessions, +- workers running, +- sessions faulted, +- command rate, +- command failure count, +- event rate, +- event queue depth, +- worker restart/kill count. + +Use Bootstrap cards for individual metric summaries.
Keep the layout compact +and operational. + +### Sessions Page + +Show active and recent sessions in a table: + +- session id, +- client identity or API key display name, +- state, +- backend, +- worker process id, +- open time, +- last client activity, +- last worker heartbeat, +- active event subscribers, +- pending commands, +- event queue depth, +- last fault summary. + +Rows should link to session details. + +### Session Details Page + +Show: + +- session metadata, +- worker metadata, +- command counters by method, +- event counters by family, +- active server handles and item counts if gateway shadow state has them, +- latest faults, +- last heartbeat payload, +- close/kill controls only if admin actions are later enabled. + +For v1, details should be read-only unless an explicit admin action design is +added. + +### Workers Page + +Show: + +- worker process id, +- session id, +- executable path/version, +- state, +- startup duration, +- memory and CPU if available, +- last heartbeat, +- current command correlation id, +- pending command count, +- event queue depth, +- restart/kill reason if terminal. + +### Events Page + +Show aggregate event diagnostics: + +- event rate by session, +- event rate by event family, +- total events since start, +- queue overflow count, +- stream disconnect count, +- recent terminal faults. + +Do not display full tag values by default. If value display is later added, make +it opt-in and redacted. + +### Settings Page + +Show read-only effective configuration: + +- worker executable path, +- configured timeouts, +- queue capacities, +- auth mode, +- SQLite auth database path with sensitive parts redacted if needed, +- dashboard enabled state, +- protocol version. + +Do not show API key secrets or pepper values. + +## Authentication And Authorization + +Dashboard access should use the same API-key authentication model as gRPC where +practical. 
+ +Recommended v1 behavior: + +- dashboard disabled by default unless configured, +- when enabled, require API key auth, +- require `admin` scope for dashboard access, +- accept API key through a secure cookie established by a simple login form, or + through reverse-proxy/header configuration for local deployments, +- do not put API keys in query strings. + +Simplest implementation path: + +1. Add `/dashboard/login`. +2. User submits API key over HTTPS. +3. Gateway validates key and `admin` scope. +4. Gateway issues an HTTP-only secure auth cookie for the dashboard. +5. Dashboard pages require that cookie. +6. Logout clears the cookie. + +For local development, allow an explicit `Dashboard:AllowAnonymousLocalhost` +option. It must default to false. + +## Configuration + +Suggested configuration: + +```json +{ + "MxGateway": { + "Dashboard": { + "Enabled": true, + "PathBase": "/dashboard", + "RequireAdminScope": true, + "AllowAnonymousLocalhost": false, + "SnapshotIntervalMilliseconds": 1000, + "RecentFaultLimit": 100, + "RecentSessionLimit": 200, + "ShowTagValues": false + } + } +} +``` + +## Security Rules + +- Do not display API key secrets. +- Do not display credential-bearing MXAccess command values. +- Do not display full tag values by default. +- Do not expose worker pipe names with nonce or sensitive details. +- Protect dashboard auth cookies with `HttpOnly`, `Secure`, and `SameSite`. +- Require TLS for remote dashboard access. +- Use anti-forgery protection for login/logout and any future admin actions. + +## Styling + +Use Bootstrap utility classes and a small local stylesheet. + +Recommended visual language: + +- compact tables, +- status badges, +- metric cards, +- Bootstrap alerts for faults, +- restrained colors, +- no decorative hero sections, +- no charting dependency for v1. + +If charts are added later, prefer simple server-generated data tables first. Do +not add a JavaScript charting dependency without a specific need. 
+ +## Testing + +Dashboard unit/component tests should cover: + +- snapshot projection, +- dashboard auth authorization decisions, +- login API-key validation behavior, +- pages render with empty state, +- pages render with active sessions, +- pages render with faulted sessions, +- realtime subscription disposal, +- redaction of API keys and credential values. + +Use bUnit if component testing is added. Otherwise keep the first tests focused +on snapshot services and authorization logic. + +Integration tests should verify: + +- dashboard disabled returns not found or configured fallback, +- dashboard requires auth when enabled, +- admin-scoped key can access dashboard, +- non-admin key is denied, +- live snapshot updates when a fake session changes state. + +## Initial Implementation Slice + +The first dashboard slice should implement: + +1. Blazor Server hosting in `MxGateway.Server`. +2. Bootstrap static assets. +3. dashboard configuration binding. +4. dashboard auth using API key login and HTTP-only cookie. +5. read-only `DashboardSnapshotService`. +6. home page with metric cards. +7. sessions page with active session table. +8. workers page with worker table. +9. 1-second realtime refresh through Blazor Server. +10. redaction tests for secrets. + diff --git a/docs/gateway-process-design.md b/docs/gateway-process-design.md new file mode 100644 index 0000000..d94acb9 --- /dev/null +++ b/docs/gateway-process-design.md @@ -0,0 +1,774 @@ +# Gateway Process Detailed Design + +## Purpose + +The gateway process is the only public network-facing component. It exposes the +modern API, owns session lifecycle, launches and supervises MXAccess worker +processes, and moves commands and events between clients and the worker that +owns each session. + +The gateway must not instantiate MXAccess COM, import MXAccess interop types, or +depend on an STA message pump. The installed MXAccess COM component is isolated +behind the worker process boundary. 
+ +## Runtime + +- Target runtime: .NET 10. +- Language: C#. +- Preferred process architecture: x64. +- Hosting: ASP.NET Core gRPC. +- Web UI: Blazor Server dashboard with Bootstrap CSS/JS. +- Operating system: Windows. +- Public transport: TCP gRPC. +- Internal worker transport: named pipes with protobuf-framed messages. + +Style guides: + +- [C# Style Guide](./style-guides/CSharpStyleGuide.md) +- [Protobuf Style Guide](./style-guides/ProtobufStyleGuide.md) + +## Responsibilities + +The gateway owns: + +- public gRPC service endpoints, +- Blazor Server dashboard endpoints, +- optional authentication and authorization, +- session id allocation, +- worker executable selection, +- named-pipe server creation, +- worker process launch, +- gateway/worker handshake, +- command correlation and timeout handling, +- event fan-out to client streams, +- session lease and heartbeat enforcement, +- worker crash and hang detection, +- metrics and structured logging, +- graceful service shutdown. + +The gateway does not own: + +- MXAccess COM object creation, +- MXAccess method dispatch, +- MXAccess event subscription, +- MXAccess handle generation, +- COM value conversion from native `VARIANT` values. + +Those belong to the worker. 
+ +## High-Level Components + +```text +MxGateway.Server + Program / Host + Configuration + Grpc + MxAccessGatewayService + RequestReplyMapper + EventMapper + Dashboard + Pages + Components + DashboardSnapshotService + DashboardAuthorization + Sessions + SessionManager + GatewaySession + SessionRegistry + SessionLeaseMonitor + Workers + WorkerProcessLauncher + WorkerClient + WorkerPipeTransport + WorkerProtocolReader + WorkerProtocolWriter + WorkerWatchdog + Security + ClientIdentityResolver + CommandAuthorization + Metrics + GatewayMetrics + Diagnostics + HealthChecks +``` + +## Public gRPC Surface + +Start with unary commands plus an event stream: + +```protobuf +service MxAccessGateway { + rpc OpenSession(OpenSessionRequest) returns (OpenSessionReply); + rpc CloseSession(CloseSessionRequest) returns (CloseSessionReply); + rpc Invoke(MxCommandRequest) returns (MxCommandReply); + rpc StreamEvents(StreamEventsRequest) returns (stream MxEvent); +} +``` + +Add this later only after the command and event model is stable: + +```protobuf +rpc Session(stream ClientMessage) returns (stream ServerMessage); +``` + +### OpenSession + +`OpenSession` creates one gateway session and one worker process by default. + +Inputs should include: + +- requested backend, defaulting to `mxaccess-worker`, +- optional client session name, +- optional client correlation id, +- optional timeout policy, +- optional event backpressure policy, +- optional metadata discovery options. + +Outputs should include: + +- session id, +- backend name, +- worker process id when available, +- protocol version, +- server capabilities, +- default timeout values. + +Behavior: + +1. Resolve and authorize the client identity. +2. Allocate a session id. +3. Build a pipe name and random handshake nonce. +4. Create a named-pipe server with restrictive local ACLs. +5. Launch the worker executable with session bootstrap data. +6. Accept the pipe connection within startup timeout. +7. 
Exchange `GatewayHello` and `WorkerHello`. +8. Wait for `WorkerReady`. +9. Register the session as ready. +10. Return the session details. + +If any step fails, clean up all resources. Kill the worker if it was launched +and did not shut down on its own. + +### CloseSession + +`CloseSession` attempts graceful shutdown and then enforces a kill timeout. + +Behavior: + +1. Mark the session closing. +2. Stop accepting new commands. +3. Notify event streams of terminal session close. +4. Send `WorkerShutdown` when the pipe is still connected. +5. Wait for worker exit up to the configured timeout. +6. Kill the worker process if it remains alive. +7. Remove the session from the registry. + +`CloseSession` should be idempotent. Closing an already closed session should +return a successful close result with the final known state. + +### Invoke + +`Invoke` forwards one MXAccess command to the worker that owns the session. + +Behavior: + +1. Validate the session id. +2. Check session state is `Ready`. +3. Validate the method-specific payload. +4. Authorize the command, especially writes and credential-bearing commands. +5. Assign a gateway correlation id. +6. Write `WorkerCommand` to the worker pipe. +7. Await the correlated `WorkerCommandReply`. +8. Map worker reply to public `MxCommandReply`. + +Request cancellation stops waiting in the gateway. It does not abort an +in-flight COM call. If the command must be hard-canceled, kill the worker and +fault the session. + +### StreamEvents + +`StreamEvents` streams events for one session. + +Initial implementation allows one active stream subscriber per session. A second +subscriber should be rejected with a clear session error. If multiple +subscribers are later supported, they must have independent backpressure +accounting and a clear fan-out policy. + +Behavior: + +1. Validate session id and authorize event access. +2. Attach a stream cursor to the session event channel. +3. Send events in worker sequence order. +4. 
Stop on client cancellation, session close, or session fault. +5. Emit a terminal status when the session faults if gRPC status alone cannot + preserve the required details. + +The gateway must not reorder events from one worker. + +## Web Dashboard + +The gateway hosts a basic Blazor Server dashboard for operators and developers. +The dashboard is read-only for v1 and should show current gateway/session/worker +state plus basic metrics. + +Technology: + +- Blazor Server, +- Bootstrap CSS, +- Bootstrap JavaScript, +- no MudBlazor, +- no other Blazor client component libraries. + +Suggested routes: + +```text +/dashboard +/dashboard/sessions +/dashboard/sessions/{sessionId} +/dashboard/workers +/dashboard/events +/dashboard/settings +``` + +Dashboard pages: + +- home: gateway status, uptime, session count, worker count, command rate, + event rate, queue depth, recent faults, +- sessions: active/recent session table, +- session details: one session's worker, heartbeat, counters, queues, and fault + summary, +- workers: worker process table and heartbeat details, +- events: aggregate event counters and rates, +- settings: read-only effective configuration with secrets redacted. + +Realtime updates should use Blazor Server component updates from a read-only +snapshot service. Components should subscribe to snapshots and call +`StateHasChanged` through `InvokeAsync`. Do not stream every MXAccess event to +the dashboard; aggregate event rates and counters instead. + +Suggested service shape: + +```csharp +public interface IDashboardSnapshotService +{ + DashboardSnapshot GetSnapshot(); + IAsyncEnumerable<DashboardSnapshot> WatchSnapshotsAsync( + CancellationToken cancellationToken); +} +``` + +Default refresh policy: + +- immediate update on session create, close, or fault, +- immediate update on worker fault, +- periodic metrics refresh every 1 second, +- event-rate windows updated every 1 second.
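
The snapshot subscription pattern described for the dashboard could be backed by something like the following. This is a minimal sketch, not the gateway's implementation: `DashboardSnapshotBroadcaster`, its `Publish` method, and the fields on `DashboardSnapshot` are hypothetical names chosen for illustration, and the per-subscriber bounded channel with drop-oldest behavior is an assumption about how "latest snapshot wins" might be achieved.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Runtime.CompilerServices;
using System.Threading;
using System.Threading.Channels;

// Hypothetical immutable snapshot DTO; the real field set is not defined here.
public sealed record DashboardSnapshot(
    int OpenSessions,
    int WorkersRunning,
    DateTimeOffset CapturedUtc);

// Sketch of a broadcaster: each subscriber gets its own capacity-1 channel,
// so a slow dashboard circuit only ever sees the most recent snapshot.
public sealed class DashboardSnapshotBroadcaster
{
    private readonly ConcurrentDictionary<Guid, Channel<DashboardSnapshot>> _subscribers = new();
    private volatile DashboardSnapshot _latest = new(0, 0, DateTimeOffset.UtcNow);

    public DashboardSnapshot GetSnapshot() => _latest;

    // Called by a timer or by lifecycle notifications (assumed callers).
    public void Publish(DashboardSnapshot snapshot)
    {
        _latest = snapshot;
        foreach (var channel in _subscribers.Values)
        {
            // DropOldest means TryWrite replaces a stale, unread snapshot.
            channel.Writer.TryWrite(snapshot);
        }
    }

    public async IAsyncEnumerable<DashboardSnapshot> WatchSnapshotsAsync(
        [EnumeratorCancellation] CancellationToken cancellationToken)
    {
        var channel = Channel.CreateBounded<DashboardSnapshot>(
            new BoundedChannelOptions(1) { FullMode = BoundedChannelFullMode.DropOldest });
        var id = Guid.NewGuid();
        _subscribers[id] = channel;
        try
        {
            // Give the new subscriber something to render immediately.
            yield return _latest;
            await foreach (var snapshot in channel.Reader.ReadAllAsync(cancellationToken))
            {
                yield return snapshot;
            }
        }
        finally
        {
            // Runs on cancellation or disposal, so the subscription is released.
            _subscribers.TryRemove(id, out _);
        }
    }
}
```

A Razor component would enumerate `WatchSnapshotsAsync` with a token it cancels on dispose, store each snapshot, and call `InvokeAsync(StateHasChanged)`. A subscriber may occasionally see the same snapshot twice right after subscribing, which is harmless for rendering.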
+ +Dashboard access should require API-key-backed authentication with `admin` scope +when enabled. A simple `/dashboard/login` form can validate an API key and issue +an HTTP-only secure cookie for dashboard pages. Do not put API keys in query +strings. Anonymous localhost access may exist only behind an explicit +configuration option that defaults to false. + +## Session State Machine + +```text +Creating + -> StartingWorker + -> WaitingForPipe + -> Handshaking + -> InitializingWorker + -> Ready + -> Closing + -> Closed + +Any non-terminal state + -> Faulted + +Faulted + -> Closed +``` + +### State Rules + +- `Creating`: session id and in-memory state exist, but no worker has launched. +- `StartingWorker`: worker process launch is in progress. +- `WaitingForPipe`: gateway is waiting for the worker to connect to the pipe. +- `Handshaking`: pipe is connected and protocol hello is being verified. +- `InitializingWorker`: worker is connected but has not reported MXAccess ready. +- `Ready`: commands and event streams may run. +- `Closing`: graceful shutdown is in progress. +- `Closed`: resources are released. +- `Faulted`: a non-graceful terminal fault occurred and must be reported to + callers before resources are released. + +Only `Ready` sessions accept new commands. + +## Session Model + +Gateway session state should include: + +- session id, +- client identity, +- backend name, +- worker process id, +- worker executable path and version, +- pipe name, +- pipe connection state, +- open time, +- last client activity time, +- last worker heartbeat time, +- lease expiration, +- command timeout policy, +- startup timeout policy, +- shutdown timeout policy, +- event queue metrics, +- active event stream count, +- final fault if any. + +The worker remains authoritative for MXAccess handles. The gateway may keep a +shadow state for diagnostics, but it must not invent, rewrite, or recycle +MXAccess handles. 
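
The session state machine and state rules above can be encoded as a small transition table. The sketch below is one possible reading of the diagram, assuming that `Closed` and `Faulted` are the only states that cannot move to `Faulted` and that no shortcuts exist (for example, from a startup state directly to `Closing`). `SessionState` and `SessionStateTransitions` are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

public enum SessionState
{
    Creating,
    StartingWorker,
    WaitingForPipe,
    Handshaking,
    InitializingWorker,
    Ready,
    Closing,
    Closed,
    Faulted
}

public static class SessionStateTransitions
{
    // The single non-fault successor of each state, taken from the diagram.
    private static readonly IReadOnlyDictionary<SessionState, SessionState> Next =
        new Dictionary<SessionState, SessionState>
        {
            [SessionState.Creating] = SessionState.StartingWorker,
            [SessionState.StartingWorker] = SessionState.WaitingForPipe,
            [SessionState.WaitingForPipe] = SessionState.Handshaking,
            [SessionState.Handshaking] = SessionState.InitializingWorker,
            [SessionState.InitializingWorker] = SessionState.Ready,
            [SessionState.Ready] = SessionState.Closing,
            [SessionState.Closing] = SessionState.Closed,
            [SessionState.Faulted] = SessionState.Closed,
        };

    public static bool CanTransition(SessionState from, SessionState to)
    {
        if (to == SessionState.Faulted)
        {
            // Any state other than Closed or Faulted may fault.
            return from != SessionState.Closed && from != SessionState.Faulted;
        }

        return Next.TryGetValue(from, out var next) && next == to;
    }

    // Only Ready sessions accept new commands.
    public static bool AcceptsCommands(SessionState state) => state == SessionState.Ready;
}
```

Keeping the table in one place makes the "session state transitions" unit tests straightforward: each allowed and disallowed pair becomes a single assertion.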
+ +## Worker Launch + +The gateway should launch the worker using explicit configuration: + +- worker executable path, +- worker working directory, +- worker architecture requirement, +- protocol version, +- startup timeout, +- environment variables, +- optional restricted user identity. + +Command-line arguments should include only non-secret bootstrap values: + +```text +--session-id +--pipe-name +--protocol-version +``` + +Prefer passing the handshake nonce via inherited environment or another +protected local mechanism instead of command line when possible. + +Before launch, validate: + +- worker executable exists, +- worker path is under the configured install directory, +- worker file version or product version is acceptable, +- worker is expected to be x86. + +## Worker IPC + +The gateway creates the pipe server before launching the worker. + +Pipe name: + +```text +mxaccess-gateway-{gatewayProcessId}-{sessionId} +``` + +Message framing: + +```text +uint32 little-endian payload_length +payload_length bytes protobuf WorkerEnvelope +``` + +Recommended size limits: + +- default max message size: 16 MiB, +- configurable upper bound for large arrays, +- reject zero-length payloads, +- reject payloads larger than configured maximum before allocation. + +### Envelope Rules + +Every message uses `WorkerEnvelope`: + +- `protocol_version` must match a supported version. +- `session_id` must match the pipe/session. +- `sequence` is monotonic per sender. +- `correlation_id` links commands and replies. +- events use either zero or their own event correlation id. +- protocol faults do not replace MXAccess HRESULT/status details. + +The gateway should treat malformed frames, sequence regressions, and wrong +session ids as protocol faults and close the session. + +## WorkerClient Design + +`WorkerClient` is the gateway-side object that owns one worker connection. 
+ +Suggested public shape: + +```csharp +public interface IWorkerClient : IAsyncDisposable +{ + string SessionId { get; } + int? ProcessId { get; } + WorkerClientState State { get; } + + Task StartAsync(CancellationToken cancellationToken); + Task<WorkerCommandReply> InvokeAsync( + WorkerCommand command, + TimeSpan timeout, + CancellationToken cancellationToken); + IAsyncEnumerable<WorkerEvent> ReadEventsAsync( + CancellationToken cancellationToken); + Task ShutdownAsync(TimeSpan timeout, CancellationToken cancellationToken); + void Kill(string reason); +} +``` + +Internally it owns: + +- process handle, +- pipe stream, +- read loop, +- write loop, +- bounded outbound command/control channel, +- bounded inbound event channel, +- pending command dictionary keyed by correlation id, +- heartbeat monitor, +- terminal fault source. + +### Read Loop + +The read loop: + +1. Reads one frame. +2. Parses `WorkerEnvelope`. +3. Validates protocol fields. +4. Dispatches by body type: + - `WorkerCommandReply`: completes pending command. + - `WorkerEvent`: enqueues event. + - `WorkerHeartbeat`: updates heartbeat timestamp. + - `WorkerFault`: faults session. +5. Stops when pipe closes or cancellation is requested. + +If the pipe closes while the session is not closing, fault the session. + +### Write Loop + +The write loop serializes all writes to the pipe. No other code should write to +the pipe directly. + +It handles: + +- `GatewayHello`, +- `WorkerCommand`, +- `WorkerCancel`, +- `WorkerShutdown`, +- gateway heartbeat if used. + +The write loop should fail the session if a pipe write fails outside normal +shutdown. + +## Command Correlation + +Each command gets: + +- gateway correlation id, +- method name, +- start timestamp, +- timeout deadline, +- caller cancellation token, +- reply completion source. + +Pending command handling: + +- Add the pending entry before writing the command. +- Remove it exactly once when reply, timeout, cancellation, or session fault + occurs.
+- If a late reply arrives after cancellation or timeout, log it with the + correlation id and discard it. +- If the session faults, complete all pending commands with a structured fault. + +Timeouts should not assume the COM call stopped. A timed-out command may still +finish inside the worker. + +## Fault Model + +Fault categories: + +- `StartupFailed` +- `ProtocolMismatch` +- `ProtocolViolation` +- `PipeDisconnected` +- `WorkerExited` +- `HeartbeatExpired` +- `CommandTimeout` +- `WorkerFaulted` +- `GatewayShutdown` +- `AuthorizationFailed` + +Public replies should distinguish: + +- gRPC transport failure, +- gateway/session failure, +- worker protocol failure, +- MXAccess method failure, +- MXAccess HRESULT/status failure. + +Do not hide an MXAccess HRESULT by returning only an RPC error. When MXAccess +was reached and returned status, preserve that status in the command reply. + +## Heartbeats And Leases + +Use separate concepts: + +- worker heartbeat: proves the worker process and pipe loop are alive, +- session lease: proves the client still owns the session, +- command timeout: bounds one command wait, +- startup timeout: bounds worker creation, +- shutdown timeout: bounds graceful stop. + +Suggested defaults for early development: + +- startup timeout: 30 seconds, +- worker heartbeat interval: 5 seconds, +- heartbeat grace: 15 seconds, +- default command timeout: 30 seconds, +- graceful shutdown timeout: 10 seconds, +- idle session lease: configurable, disabled in local development. + +The exact values should be configurable. + +## Event Delivery + +Events flow: + +```text +worker MXAccess event + -> worker outbound event queue + -> worker pipe writer + -> gateway read loop + -> session event channel + -> gRPC StreamEvents +``` + +The gateway should record: + +- worker event sequence, +- gateway receive sequence, +- worker timestamp, +- gateway receive timestamp, +- stream send timestamp if needed for diagnostics. 
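
The per-event bookkeeping listed above, together with the rule that the gateway must not reorder events from one worker, might look like the sketch below. `GatewayEventRecord` and `EventSequenceTracker` are hypothetical names, and the use of `ulong` sequences is an assumption. Because the worker `sequence` is monotonic per sender across all message types, event sequences may have gaps, so the check is only that they strictly increase.

```csharp
using System;

// Hypothetical record of what the gateway notes for each received event.
public sealed record GatewayEventRecord(
    ulong WorkerSequence,
    ulong GatewaySequence,
    DateTimeOffset WorkerTimestampUtc,
    DateTimeOffset GatewayReceivedUtc);

// Intended to be called only from the single read loop, so it is not thread-safe.
public sealed class EventSequenceTracker
{
    private ulong _lastWorkerSequence;
    private ulong _nextGatewaySequence = 1;
    private bool _seenAny;

    public GatewayEventRecord Accept(
        ulong workerSequence,
        DateTimeOffset workerTimestampUtc,
        DateTimeOffset receivedUtc)
    {
        if (_seenAny && workerSequence <= _lastWorkerSequence)
        {
            // A regression is a protocol violation; the caller should fault the session.
            throw new InvalidOperationException(
                $"Worker sequence regressed from {_lastWorkerSequence} to {workerSequence}.");
        }

        _seenAny = true;
        _lastWorkerSequence = workerSequence;
        return new GatewayEventRecord(
            workerSequence,
            _nextGatewaySequence++,
            workerTimestampUtc,
            receivedUtc);
    }
}
```

The gateway receive sequence is dense even when worker sequences have gaps, which makes it easy to spot events lost between the read loop and the gRPC stream.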
+ +Default backpressure policy for parity testing should be fail-fast: + +1. If the session event channel fills, fault the session. +2. Preserve the overflow details in logs and metrics. +3. Do not silently drop data-change events. + +Do not set a production event-rate target before measurement. Emit event rate, +queue depth, stream send latency, and overflow metrics. Later production modes +may support explicit coalescing by item handle as an opt-in behavior. + +The gateway should not synthesize `OperationComplete` from write completion, +command replies, ASB completion queues, or completion-only status frames. Forward +`OperationComplete` only when the worker reports the native MXAccess public +event. + +## Security + +### Public API + +Use API key authentication for v1. Store API keys in a gateway-owned SQLite +database, but store only hashed key secrets. Clients should send keys in gRPC +metadata using: + +```text +authorization: Bearer mxgw_<key-id>_<secret> +``` + +The gateway should split the key into a stable key id and secret component, +load the key record by id, hash the presented secret, and compare using a +constant-time comparison. + +Recommended scopes: + +- `session:open` +- `session:close` +- `invoke:read` +- `invoke:write` +- `invoke:secure` +- `events:read` +- `metadata:read` +- `admin` + +If the gateway is exposed outside the local machine, use TLS. Do not log raw API +keys or raw credential-bearing MXAccess values. + +API key administration for v1 should be a local CLI/tool rather than a public +gRPC admin API. It should initialize the auth database, create keys, list keys +without secrets, revoke keys, rotate keys, and print raw secrets only once at +creation. + +SQLite auth storage should use startup migrations with a `schema_version` table. +Migrations should run inside transactions and fail startup if the database +schema is newer than the running binary understands.
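
The key handling described earlier in this section (split into key id and secret, hash the presented secret, compare in constant time) can be sketched as follows. Several details are assumptions rather than decisions recorded here: the key is taken to look like `mxgw_<keyId>_<secret>` with no underscore inside the key id, the hash is HMAC-SHA256 keyed with the configured pepper, and `ApiKeyVerifier` is a hypothetical helper name.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class ApiKeyVerifier
{
    // Assumed format: "mxgw_<keyId>_<secret>". The secret may contain underscores.
    public static bool TryParse(string presented, out string keyId, out string secret)
    {
        keyId = string.Empty;
        secret = string.Empty;

        var parts = presented.Split('_', 3);
        if (parts.Length != 3 || parts[0] != "mxgw" ||
            parts[1].Length == 0 || parts[2].Length == 0)
        {
            return false;
        }

        keyId = parts[1];
        secret = parts[2];
        return true;
    }

    // Assumption: HMAC-SHA256 with the pepper as the HMAC key.
    public static byte[] HashSecret(string secret, byte[] pepper) =>
        HMACSHA256.HashData(pepper, Encoding.UTF8.GetBytes(secret));

    // FixedTimeEquals avoids leaking how many leading bytes matched.
    public static bool Matches(string presentedSecret, byte[] storedHash, byte[] pepper) =>
        CryptographicOperations.FixedTimeEquals(
            HashSecret(presentedSecret, pepper),
            storedHash);
}
```

The gateway would look up the stored hash by `keyId` in the SQLite auth database and call `Matches` with the presented secret. Only the hash ever needs to be persisted, and neither the raw key nor the secret should appear in logs.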
+ +Commands requiring authorization: + +- writes, +- secured writes, +- authentication commands, +- worker shutdown diagnostics, +- metadata queries if they expose sensitive plant structure. + +### Worker IPC + +Named pipes should be local only. Pipe ACLs should restrict access to: + +- the gateway process identity, +- the launched worker identity, +- administrators only when operationally required. + +The worker must validate `GatewayHello` and the nonce before creating MXAccess. + +## Observability + +Use structured logs with these fields where applicable: + +- session id, +- client identity, +- worker process id, +- pipe name hash or suffix, +- protocol version, +- correlation id, +- command method, +- MXAccess HRESULT, +- MXAccess status summary, +- event family, +- event sequence, +- queue depth, +- elapsed milliseconds. + +Metrics: + +- open sessions, +- workers running, +- worker startup latency, +- command latency by method, +- command failures by method and category, +- event rate by session and family, +- event queue depth, +- worker exits by reason, +- worker kills, +- heartbeat failures, +- gRPC stream disconnects. + +Do not log credential values or full tag values by default. 
+ +## Configuration + +Suggested configuration shape: + +```json +{ + "MxGateway": { + "Authentication": { + "Mode": "ApiKey", + "SqlitePath": "C:\\ProgramData\\MxGateway\\gateway-auth.db", + "PepperSecretName": "MxGateway:ApiKeyPepper", + "RunMigrationsOnStartup": true + }, + "Worker": { + "ExecutablePath": "src/MxGateway.Worker/bin/x86/Release/MxGateway.Worker.exe", + "StartupTimeoutSeconds": 30, + "ShutdownTimeoutSeconds": 10, + "HeartbeatIntervalSeconds": 5, + "HeartbeatGraceSeconds": 15, + "MaxMessageBytes": 16777216 + }, + "Sessions": { + "DefaultCommandTimeoutSeconds": 30, + "MaxSessions": 64, + "AllowMultipleEventSubscribers": false + }, + "Events": { + "QueueCapacity": 10000, + "BackpressurePolicy": "FailFast" + }, + "Dashboard": { + "Enabled": true, + "PathBase": "/dashboard", + "RequireAdminScope": true, + "AllowAnonymousLocalhost": false, + "SnapshotIntervalMilliseconds": 1000, + "RecentFaultLimit": 100, + "RecentSessionLimit": 200, + "ShowTagValues": false + } + } +} +``` + +Do not scatter connection or path constants through implementation code. + +## Galaxy Repository Metadata + +Galaxy hierarchy and tag metadata can be discovered through SQL Server when +needed for browse or diagnostics. The current notes live outside this repo at: + +```text +C:\Users\dohertj2\Desktop\lmxopcua\gr +``` + +Use SQL metadata as discovery data. It does not replace MXAccess-backed runtime +behavior unless an explicit non-parity backend is designed. + +## Testing Strategy + +Gateway tests should be able to run without installed MXAccess by using fake +workers and fake transports. + +Focused tests: + +- session state transitions, +- worker startup failures, +- protocol version mismatch, +- malformed frame handling, +- pending command completion, +- command timeout and late reply handling, +- worker crash handling, +- event ordering, +- event queue overflow, +- `CloseSession` idempotency, +- gRPC mapping for command replies and faults,
+- dashboard snapshot projection, +- dashboard auth decisions, +- dashboard redaction, +- dashboard realtime subscription disposal. + +Integration tests with the real worker should be separated from unit tests and +clearly marked because they require Windows, .NET Framework worker output, and +eventually installed MXAccess COM. + +## Initial Implementation Slice + +The first gateway slice should implement: + +1. Host startup and configuration binding. +2. SQLite auth database initialization and migrations. +3. Local API-key administration CLI/tool. +4. API-key authentication and scope checks. +5. `OpenSession`. +6. Worker process launch. +7. Named-pipe handshake. +8. `Invoke` for `Register`, `AddItem`, and `Advise`. +9. `StreamEvents` with one subscriber per session. +10. `CloseSession`. +11. Worker crash and startup failure handling. +12. Event-rate, queue-depth, and overflow metrics. +13. Blazor Server dashboard with Bootstrap assets. +14. Dashboard home, sessions, and workers pages. +15. Dashboard realtime snapshot refresh. +16. Dashboard API-key login with admin-scope check. +17. Basic structured logs. + +This proves the process model before the full command surface is implemented. diff --git a/docs/implementation-plan-clients.md b/docs/implementation-plan-clients.md new file mode 100644 index 0000000..48b967c --- /dev/null +++ b/docs/implementation-plan-clients.md @@ -0,0 +1,387 @@ +# Client Libraries Implementation Plan + +This plan implements the official gRPC clients after the gateway and worker +first slice is stable enough to generate contracts and run smoke tests. + +Primary designs: + +- `docs/client-libraries-design.md` +- `docs/clients-dotnet-csharp-design.md` +- `docs/clients-golang-design.md` +- `docs/clients-rust-design.md` +- `docs/clients-python-design.md` +- `docs/clients-java-design.md` +- `docs/toolchain-links.md` + +## Shared Milestone: client-contracts-and-fixtures + +Goal: make client implementations consistent across languages. 
+ +### Issue: Publish Stable Client Proto Generation Inputs + +Labels: `area:contracts`, `type:feature`, `priority:p0` + +Deliverables: + +- finalized v1 `.proto` files, +- Buf config if used, +- generation documentation for all languages, +- generated-code output directories, +- golden protobuf payload fixtures. + +Acceptance criteria: + +- C#, Go, Rust, Python, and Java generated code can be regenerated, +- generated code is not hand-edited, +- protocol version is visible to clients. + +### Issue: Create Cross-Language Client Behavior Fixtures + +Labels: `area:tests`, `type:test`, `priority:p0` + +Deliverables: + +- JSON fixtures for command replies, +- JSON fixtures for event stream samples, +- value conversion fixtures, +- status conversion fixtures, +- auth error fixtures, +- timeout/cancel expected behavior notes. + +Acceptance criteria: + +- every language can use the same fixture set, +- fixtures include raw fallback values, +- fixtures include MXAccess status arrays and HRESULT. + +## Milestone: clients-dotnet + +Goal: implement the .NET 10 C# client library, test CLI, and tests. + +### Issue: Scaffold .NET Client Projects + +Labels: `area:client-dotnet`, `type:infra`, `priority:p0` + +Deliverables: + +- `clients/dotnet/MxGateway.Client`, +- `clients/dotnet/MxGateway.Client.Cli`, +- `clients/dotnet/MxGateway.Client.Tests`, +- optional integration test project, +- generated protobuf setup. + +Acceptance criteria: + +- `dotnet build` succeeds, +- generated gRPC client code compiles, +- empty tests run. + +### Issue: Implement .NET GatewayClient And Session + +Labels: `area:client-dotnet`, `type:feature`, `priority:p0` + +Deliverables: + +- `MxGatewayClientOptions`, +- `MxGatewayClient`, +- `MxGatewaySession`, +- raw `OpenSession`, `CloseSession`, `Invoke`, `StreamEvents`, +- helpers for `Register`, `AddItem`, `AddItem2`, `Advise`, `Write`. 
+ +Acceptance criteria: + +- API key metadata is attached, +- cancellation token flows to every call, +- raw replies remain accessible, +- session close is explicit and idempotent. + +Tests: + +- fake gRPC service, +- helper request construction, +- cancellation. + +### Issue: Implement .NET Values, Status, Errors, And CLI + +Labels: `area:client-dotnet`, `type:feature`, `priority:p1` + +Deliverables: + +- `MxValue` helper conversions, +- status proxy helpers, +- typed exceptions, +- `EnsureProtocolSuccess`, +- `EnsureMxAccessSuccess`, +- CLI commands: version, ping, open-session, close-session, register, + add-item, advise, stream-events, write, write2, smoke, +- JSON CLI output. + +Acceptance criteria: + +- scalar and array conversions pass fixtures, +- status arrays are preserved, +- API keys are redacted, +- smoke command closes session in `finally`. + +## Milestone: clients-go + +Goal: implement the Go module, test CLI, and tests. + +### Issue: Scaffold Go Module + +Labels: `area:client-go`, `type:infra`, `priority:p0` + +Deliverables: + +- `clients/go/go.mod`, +- generated protobuf package, +- `mxgateway` package, +- `cmd/mxgw-go`, +- unit test structure. + +Acceptance criteria: + +- `go test ./...` runs, +- generated code compiles, +- module path is stable. + +### Issue: Implement Go Client, Session, Values, Errors, And CLI + +Labels: `area:client-go`, `type:feature`, `priority:p0` + +Deliverables: + +- `Dial(ctx, Options)`, +- auth interceptors, +- TLS/plaintext setup, +- `Client.OpenSession`, +- `Session` helpers, +- event channel receive loop, +- value conversion helpers, +- typed errors with `errors.As`, +- CLI commands and JSON output. + +Acceptance criteria: + +- auth metadata on unary and streams, +- context cancellation stops calls, +- event channel closes exactly once, +- raw protobuf access remains available, +- fixture conversions pass, +- CLI redacts API key. 
+ +Tests: + +- `bufconn` fake service, +- auth metadata, +- stream cancellation, +- conversion fixtures, +- CLI parser/output. + +## Milestone: clients-rust + +Goal: implement the Rust `tonic` client crate, test CLI, and tests. + +### Issue: Scaffold Rust Workspace + +Labels: `area:client-rust`, `type:infra`, `priority:p0` + +Deliverables: + +- `clients/rust/Cargo.toml`, +- client crate, +- CLI crate, +- `build.rs` protobuf generation, +- generated module organization. + +Acceptance criteria: + +- `cargo test` runs, +- generated code compiles, +- MSVC linker works. + +### Issue: Implement Rust Client, Session, Values, Errors, And CLI + +Labels: `area:client-rust`, `type:feature`, `priority:p0` + +Deliverables: + +- `ClientOptions`, +- `GatewayClient::connect`, +- auth interceptor, +- TLS/plaintext channel, +- session helpers, +- event stream as `Stream>`, +- `thiserror` error model, +- conversion helpers, +- `clap` CLI, +- JSON output with `serde_json`. + +Acceptance criteria: + +- metadata includes bearer key, +- dropped stream cancels underlying stream, +- raw generated client remains reachable where needed, +- fixture tests pass, +- command errors keep raw reply, +- API key is redacted from debug output. + +Tests: + +- fake tonic server, +- auth tests, +- stream order/cancel, +- conversion fixtures, +- CLI parser/output. + +## Milestone: clients-python + +Goal: implement the async Python client package, test CLI, and tests. + +### Issue: Scaffold Python Package + +Labels: `area:client-python`, `type:infra`, `priority:p0` + +Deliverables: + +- `clients/python/pyproject.toml`, +- `src/mxgateway`, +- generated protobuf modules, +- `src/mxgateway_cli`, +- `tests`. + +Acceptance criteria: + +- package installs editable, +- generated stubs import, +- `pytest` runs. 
+ +### Issue: Implement Python Async Client, Values, Errors, And CLI + +Labels: `area:client-python`, `type:feature`, `priority:p0` + +Deliverables: + +- async `GatewayClient`, +- async `Session`, +- auth metadata helper, +- TLS/plaintext setup, +- async event iterator, +- method helpers, +- value conversion helpers, +- typed exceptions, +- `click` or `typer` CLI, +- JSON output. + +Acceptance criteria: + +- API key metadata included, +- async cancellation cancels stream/call, +- raw protobuf replies available, +- fixture conversions pass, +- secrets redacted. + +Tests: + +- fake async stub tests, +- metadata tests, +- cancellation tests, +- conversion fixtures, +- CLI parser/output. + +## Milestone: clients-java + +Goal: implement Java client library, CLI, and tests. + +### Issue: Scaffold Java Gradle Build + +Labels: `area:client-java`, `type:infra`, `priority:p0` + +Deliverables: + +- `clients/java/settings.gradle`, +- `mxgateway-client` project, +- `mxgateway-cli` project, +- protobuf/gRPC Gradle generation, +- JUnit test setup. + +Acceptance criteria: + +- `gradle test` runs, +- generated code compiles, +- Java 21 toolchain used. + +### Issue: Implement Java Client, Session, Values, Errors, And CLI + +Labels: `area:client-java`, `type:feature`, `priority:p0` + +Deliverables: + +- `MxGatewayClientOptions`, +- `MxGatewayClient`, +- `MxGatewaySession`, +- auth interceptor, +- plaintext/TLS channels, +- blocking and async event stream options, +- method helpers, +- value conversion helpers, +- typed exceptions, +- `picocli` CLI, +- JSON output. + +Acceptance criteria: + +- unary and streaming calls carry auth metadata, +- deadlines are applied, +- stream cancellation works, +- raw generated messages are accessible, +- fixture tests pass, +- CLI redacts secrets. + +Tests: + +- in-process gRPC tests, +- auth interceptor, +- stream cancellation, +- conversion fixtures, +- CLI parser/output. 
+ +## Milestone: integration-and-parity + +Goal: prove clients can talk to the gateway consistently. + +### Issue: Cross-Language Smoke Test Matrix + +Labels: `area:tests`, `type:test`, `priority:p1` + +Deliverables: + +- common smoke script or documented commands, +- each client runs open/register/add/advise/stream/close, +- JSON output comparison, +- optional write test. + +Acceptance criteria: + +- each client has equivalent smoke behavior, +- each client skips integration unless `MXGATEWAY_INTEGRATION=1`, +- failed smoke output includes endpoint, language, and redacted auth context. + +### Issue: Client Packaging Documentation + +Labels: `area:docs`, `type:docs`, `priority:p2` + +Deliverables: + +- install instructions per client, +- generation instructions, +- CLI usage examples, +- TLS/API key examples, +- integration test instructions. + +Acceptance criteria: + +- new developer can build each client from a clean checkout using + `docs/toolchain-links.md`, +- generated code command is documented for every language. + diff --git a/docs/implementation-plan-gateway.md b/docs/implementation-plan-gateway.md new file mode 100644 index 0000000..225dfdc --- /dev/null +++ b/docs/implementation-plan-gateway.md @@ -0,0 +1,511 @@ +# Gateway Implementation Plan + +This plan implements the .NET 10 gateway process first. It covers contracts, +configuration, API-key authentication, worker lifecycle, gRPC APIs, event +streaming, metrics, dashboard, tests, and operational hooks. + +Primary designs: + +- `docs/gateway-process-design.md` +- `docs/gateway-dashboard-design.md` +- `docs/design-decisions.md` +- `docs/toolchain-links.md` + +## Milestone: gateway-foundation + +Goal: create the solution, shared contracts, configuration model, logging, and +test scaffolding that all later work depends on. 
+ +### Issue: Scaffold Gateway Solution And Projects + +Labels: `area:gateway`, `type:infra`, `priority:p0` + +Deliverables: + +- create `src/MxGateway.sln`, +- create `src/MxGateway.Contracts`, +- create `src/MxGateway.Server`, +- create `src/MxGateway.Tests`, +- create `src/MxGateway.IntegrationTests`, +- target `MxGateway.Server` to `net10.0`, +- add shared C# build settings in `Directory.Build.props`, +- add baseline tests. + +Acceptance criteria: + +- `dotnet build src/MxGateway.sln` succeeds, +- `dotnet test src/MxGateway.sln` succeeds, +- gateway project does not reference MXAccess COM. + +### Issue: Define Protobuf Contracts + +Labels: `area:contracts`, `type:feature`, `priority:p0` + +Deliverables: + +- `src/MxGateway.Contracts/Protos/mxaccess_gateway.proto`, +- `src/MxGateway.Contracts/Protos/mxaccess_worker.proto`, +- `MxAccessGateway` service with `OpenSession`, `CloseSession`, `Invoke`, and + `StreamEvents`, +- `WorkerEnvelope` and worker IPC messages, +- `MxValue`, `MxArray`, `MxStatusProxy`, `MxEvent`, and first-slice command + payloads, +- generated C# code. + +Acceptance criteria: + +- generated code builds, +- worker envelopes include protocol version, session id, sequence, and + correlation id, +- command replies preserve protocol status, HRESULT, return value, out params, + and status arrays. + +Tests: + +- protobuf generation smoke, +- serialization round-trip for command, reply, event, value, and status. + +### Issue: Add Gateway Configuration And Validation + +Labels: `area:gateway`, `type:feature`, `priority:p0` + +Deliverables: + +- typed options for authentication, worker, sessions, events, dashboard, and + protocol, +- startup validation, +- defaults matching design docs, +- redacted effective-configuration model. + +Acceptance criteria: + +- invalid worker path, invalid queue capacity, invalid auth config, and invalid + dashboard config fail startup clearly, +- redacted config never includes API key pepper or raw secrets. 
+ +Tests: + +- options binding, +- validation, +- redaction. + +### Issue: Add Structured Logging And Metrics Foundation + +Labels: `area:gateway`, `type:infra`, `priority:p0` + +Deliverables: + +- logging scopes for session id, worker process id, correlation id, command + method, and client identity, +- counters/gauges/histograms for sessions, workers, commands, events, queues, + and faults, +- redaction helpers. + +Acceptance criteria: + +- common logs include correlation fields, +- API keys and credential-bearing values are not logged, +- metrics can feed dashboard snapshots. + +Tests: + +- log redaction, +- metric update tests. + +## Milestone: gateway-auth + +Goal: implement API-key authentication backed by SQLite. + +### Issue: Implement SQLite Auth Store And Migrations + +Labels: `area:auth`, `type:feature`, `priority:p0` + +Deliverables: + +- SQLite schema for `schema_version`, `api_keys`, and `api_key_audit`, +- idempotent startup migrations, +- newer-schema startup block, +- key lookup and audit services. + +Acceptance criteria: + +- empty DB initializes, +- existing DB migrates, +- newer DB version blocks startup, +- revoked keys cannot authenticate. + +Tests: + +- temp SQLite migration tests, +- key lookup tests, +- revoked key tests. + +### Issue: Implement API Key Hashing And Verification + +Labels: `area:auth`, `type:feature`, `priority:p0` + +Deliverables: + +- parse `mxgw__` format, +- HMAC-SHA256 with gateway-local pepper or accepted Argon2id dependency, +- constant-time hash comparison, +- key id/display name/scopes identity model. + +Acceptance criteria: + +- raw secrets are never stored, +- malformed keys fail unauthenticated, +- valid keys authenticate, +- revoked keys fail. + +Tests: + +- parse tests, +- hash verification, +- redaction, +- scope extraction. 
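The HMAC-SHA256 option in the API key hashing issue above could look roughly like the sketch below. It is written in Python for brevity only, since the gateway itself is planned in C#. The function names and the pepper handling are assumptions made for illustration, and parsing of the key format is deliberately left out.

```python
import hashlib
import hmac


def hash_api_key_secret(secret: str, pepper: bytes) -> bytes:
    """Return the HMAC-SHA256 digest of a secret, keyed by the gateway-local pepper.

    Hypothetical helper: only this digest would be stored, never the raw secret.
    """
    return hmac.new(pepper, secret.encode("utf-8"), hashlib.sha256).digest()


def verify_api_key_secret(presented: str, stored_hash: bytes, pepper: bytes) -> bool:
    """Recompute the digest for the presented secret and compare in constant time."""
    candidate = hash_api_key_secret(presented, pepper)
    # hmac.compare_digest avoids leaking the mismatch position through timing.
    return hmac.compare_digest(candidate, stored_hash)
```

Keeping the pepper outside the SQLite file means a copied database alone is not enough to test guessed secrets offline. If the Argon2id alternative is chosen instead, only the digest function would change; the constant-time comparison stays the same.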
+ +### Issue: Implement Local API Key Admin CLI + +Labels: `area:auth`, `type:feature`, `priority:p1` + +Deliverables: + +- local admin CLI or gateway subcommand, +- `init-db`, +- `create-key`, +- `list-keys`, +- `revoke-key`, +- `rotate-key`, +- JSON output option. + +Acceptance criteria: + +- created key can authenticate, +- listed keys never show raw secret, +- revoked key fails authentication, +- raw secret is printed exactly once on create/rotate. + +Tests: + +- CLI parser, +- temp DB command tests, +- JSON redaction. + +### Issue: Add gRPC Authentication And Scope Authorization + +Labels: `area:auth`, `area:gateway`, `type:feature`, `priority:p0` + +Deliverables: + +- gRPC auth middleware/interceptor, +- request identity context, +- scope checks for sessions, invoke, secure invoke, events, metadata, and + admin actions. + +Acceptance criteria: + +- missing/invalid key returns unauthenticated, +- valid key with missing scope returns permission denied, +- auth applies to unary and streaming calls. + +Tests: + +- unary auth, +- streaming auth, +- scope mapping. + +## Milestone: gateway-sessions-ipc + +Goal: create, supervise, and communicate with per-session workers. + +### Issue: Implement Worker Frame Protocol + +Labels: `area:gateway`, `area:contracts`, `type:feature`, `priority:p0` + +Deliverables: + +- little-endian uint32 length-prefixed frame reader/writer, +- max message size enforcement, +- protobuf envelope validation, +- protocol violation errors. + +Acceptance criteria: + +- valid frames round-trip, +- partial reads are handled, +- oversized frames fail before allocation, +- wrong protocol/session id is detected. + +Tests: + +- round-trip, +- partial read, +- malformed length, +- max size, +- wrong protocol/session. 
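As a rough illustration of the worker frame protocol issue above, the sketch below frames a payload with a little-endian uint32 length prefix, tolerates partial reads, and rejects oversized frames before allocating the payload buffer. It is written in Python for brevity only. `MAX_FRAME_BYTES`, `FrameError`, and the function names are assumptions, not values or types taken from the design docs.

```python
import io
import struct

# Hypothetical limit; the real maximum would come from gateway configuration.
MAX_FRAME_BYTES = 4 * 1024 * 1024


class FrameError(Exception):
    """Raised for protocol violations such as oversized or truncated frames."""


def write_frame(stream, payload: bytes, max_bytes: int = MAX_FRAME_BYTES) -> None:
    """Write a little-endian uint32 length prefix followed by the payload."""
    if len(payload) > max_bytes:
        raise FrameError("payload exceeds maximum frame size")
    stream.write(struct.pack("<I", len(payload)))
    stream.write(payload)


def _read_exact(stream, count: int) -> bytes:
    """Keep reading until exactly `count` bytes arrive; pipes may return fewer."""
    data = bytearray()
    while len(data) < count:
        chunk = stream.read(count - len(data))
        if not chunk:
            raise FrameError("stream closed in the middle of a frame")
        data.extend(chunk)
    return bytes(data)


def read_frame(stream, max_bytes: int = MAX_FRAME_BYTES) -> bytes:
    """Read one frame, checking the declared length before reading the body."""
    (length,) = struct.unpack("<I", _read_exact(stream, 4))
    if length > max_bytes:
        # Fail before any buffer sized by the untrusted length is allocated.
        raise FrameError("declared frame length exceeds maximum frame size")
    return _read_exact(stream, length)
```

Envelope validation (protocol version, session id) would sit one layer above this, after the payload bytes are decoded as a protobuf `WorkerEnvelope`.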
+ +### Issue: Implement Worker Process Launcher + +Labels: `area:gateway`, `type:feature`, `priority:p0` + +Deliverables: + +- worker executable validation, +- process launch with session id, pipe name, protocol version, +- nonce via environment, +- startup timeout handling, +- failed-startup cleanup. + +Acceptance criteria: + +- command line contains no secrets, +- nonce is not logged, +- failed startup kills worker and disposes pipe, +- process id is recorded. + +Tests: + +- fake worker success/failure, +- timeout kill, +- command-line redaction. + +### Issue: Implement Gateway WorkerClient + +Labels: `area:gateway`, `type:feature`, `priority:p0` + +Deliverables: + +- named-pipe server, +- `GatewayHello`/`WorkerHello` handshake, +- read loop, +- write loop, +- pending command dictionary, +- event channel, +- heartbeat tracking, +- terminal fault handling. + +Acceptance criteria: + +- worker ready establishes `Ready` state, +- command reply completes matching pending command, +- worker events enter channel in order, +- pipe disconnect faults session. + +Tests: + +- fake worker protocol, +- command correlation, +- late reply, +- pipe disconnect, +- heartbeat expiration. + +### Issue: Implement Session Manager And Registry + +Labels: `area:gateway`, `type:feature`, `priority:p0` + +Deliverables: + +- session state machine, +- registry keyed by session id, +- `OpenSession` orchestration, +- `CloseSession` idempotency, +- lease hooks, +- gateway shutdown cleanup. + +Acceptance criteria: + +- only `Ready` sessions accept commands, +- close is idempotent, +- faulted sessions reject new commands, +- shutdown terminates workers. + +Tests: + +- state transitions, +- close idempotency, +- open failure cleanup, +- shutdown cleanup. + +## Milestone: gateway-grpc-events-dashboard + +Goal: expose the public API, stream events, and provide the dashboard. 
+ +### Issue: Implement Public gRPC Service + +Labels: `area:gateway`, `type:feature`, `priority:p0` + +Deliverables: + +- `MxAccessGatewayService`, +- `OpenSession`, +- `CloseSession`, +- `Invoke`, +- `StreamEvents`, +- request validation, +- public-to-worker mappers. + +Acceptance criteria: + +- missing session fails clearly, +- method-specific payloads map correctly, +- HRESULT/status survives in replies, +- transport errors are separate from command replies. + +Tests: + +- service unit tests, +- mapper tests, +- validation tests, +- reply/error mapping. + +### Issue: Implement Event Streaming And Backpressure + +Labels: `area:gateway`, `type:feature`, `priority:p0` + +Deliverables: + +- one active subscriber per session, +- second-subscriber rejection, +- ordered event streaming, +- fail-fast queue overflow, +- terminal fault propagation, +- event-rate metrics. + +Acceptance criteria: + +- event order preserved, +- stream cancellation detaches subscriber, +- queue overflow faults session, +- `OperationComplete` is not synthesized by gateway. + +Tests: + +- order, +- single-subscriber enforcement, +- cancellation, +- overflow. + +### Issue: Implement Dashboard Snapshot Service + +Labels: `area:dashboard`, `type:feature`, `priority:p1` + +Deliverables: + +- immutable dashboard snapshot DTOs, +- session summaries, +- worker summaries, +- metric summaries, +- fault summaries, +- `WatchSnapshotsAsync`. + +Acceptance criteria: + +- snapshot reads do not mutate session/worker state, +- secrets and credential values are redacted, +- subscribers dispose cleanly. + +Tests: + +- projection, +- redaction, +- subscription disposal, +- empty/active/faulted states. + +### Issue: Implement Blazor Server Dashboard + +Labels: `area:dashboard`, `type:feature`, `priority:p1` + +Deliverables: + +- Blazor Server hosting, +- Bootstrap CSS/JS assets, +- layout/nav, +- home page, +- sessions page, +- workers page, +- events page, +- settings page, +- real-time refresh. 
+ +Acceptance criteria: + +- Bootstrap/local CSS only, +- no MudBlazor or other Blazor UI libraries, +- pages update without manual refresh, +- dashboard can be disabled by config. + +Tests: + +- snapshot service tests, +- component tests if bUnit is added, +- disabled-dashboard behavior. + +### Issue: Implement Dashboard Authentication + +Labels: `area:dashboard`, `area:auth`, `type:feature`, `priority:p1` + +Deliverables: + +- `/dashboard/login`, +- API-key validation with `admin` scope, +- HTTP-only secure cookie, +- logout, +- anti-forgery protection, +- optional explicit anonymous-localhost dev mode defaulting false. + +Acceptance criteria: + +- unauthenticated access is denied/redirected, +- non-admin key is denied, +- admin key logs in, +- cookies use secure settings, +- API keys never appear in query strings or logs. + +Tests: + +- auth decisions, +- non-admin denial, +- cookie properties, +- redaction. + +## Milestone: integration-and-parity + +Goal: prove gateway behavior with fake workers before depending on live +MXAccess. + +### Issue: Build Fake Worker Test Harness + +Labels: `area:tests`, `area:gateway`, `type:test`, `priority:p0` + +Deliverables: + +- fake worker executable or in-process transport, +- scripted hello/ready/reply/event/fault behavior, +- malformed protocol scenarios, +- slow/hung worker scenarios. + +Acceptance criteria: + +- gateway tests do not require installed MXAccess, +- fake worker simulates startup success/failure, +- fake worker emits ordered events and faults. + +### Issue: Gateway End-To-End Smoke With Fake Worker + +Labels: `area:tests`, `area:gateway`, `type:test`, `priority:p0` + +Deliverables: + +- open session, +- invoke `Register`, `AddItem`, `Advise`, +- stream one event, +- close session, +- verify metrics/dashboard snapshot changed. + +Acceptance criteria: + +- smoke passes without live MXAccess, +- worker exits, +- artifacts stay in temp directories. 
+ diff --git a/docs/implementation-plan-index.md b/docs/implementation-plan-index.md new file mode 100644 index 0000000..58c88d3 --- /dev/null +++ b/docs/implementation-plan-index.md @@ -0,0 +1,100 @@ +# Implementation Plan Index + +This index defines the implementation order and a Gitea issue/milestone model +for tracking the work. + +Repository: + +```text +https://gitea.dohertylan.com/dohertj2/mxaccessgw +``` + +Implementation order: + +1. Gateway process +2. MXAccess worker instance +3. Client libraries + +Detailed plans: + +- `docs/implementation-plan-gateway.md` +- `docs/implementation-plan-mxaccess-worker.md` +- `docs/implementation-plan-clients.md` + +## Gitea Milestones + +Recommended milestones: + +1. `gateway-foundation` +2. `gateway-auth` +3. `gateway-sessions-ipc` +4. `gateway-grpc-events-dashboard` +5. `mxaccess-worker-foundation` +6. `mxaccess-worker-parity-slice` +7. `clients-dotnet` +8. `clients-go` +9. `clients-rust` +10. `clients-python` +11. `clients-java` +12. `integration-and-parity` +13. `packaging-and-ops` + +## Gitea Labels + +Recommended labels: + +- `area:contracts` +- `area:gateway` +- `area:worker` +- `area:dashboard` +- `area:auth` +- `area:client-dotnet` +- `area:client-go` +- `area:client-rust` +- `area:client-python` +- `area:client-java` +- `area:tests` +- `area:docs` +- `type:feature` +- `type:test` +- `type:infra` +- `type:docs` +- `priority:p0` +- `priority:p1` +- `priority:p2` +- `blocked` + +## Issue Body Template + +```markdown +## Context + +## Deliverables + +## Acceptance Criteria + +## Tests + +## Dependencies +``` + +## Definition Of Done + +Every implementation issue should meet this baseline: + +- follows the relevant style guide in `docs/style-guides/`, +- generated code is reproducible, +- secrets are not logged, +- unit tests pass, +- docs are updated when behavior, commands, or paths change, +- live MXAccess verification steps are documented when required. 
+ +## Toolchain + +Use `docs/toolchain-links.md` for installed compiler/runtime paths. If a new +terminal cannot find a recently installed tool, refresh PATH: + +```powershell +$env:Path = [Environment]::GetEnvironmentVariable('Path','Machine') + ';' + [Environment]::GetEnvironmentVariable('Path','User') +``` + diff --git a/docs/implementation-plan-mxaccess-worker.md b/docs/implementation-plan-mxaccess-worker.md new file mode 100644 index 0000000..96c1ded --- /dev/null +++ b/docs/implementation-plan-mxaccess-worker.md @@ -0,0 +1,450 @@ +# MXAccess Worker Implementation Plan + +This plan implements the .NET Framework 4.8 x86 worker process after the +gateway foundation exists. The worker owns MXAccess COM, the dedicated STA +thread, message pumping, command dispatch, event sinks, conversion, heartbeat, +and shutdown. + +Primary designs: + +- `docs/mxaccess-worker-instance-design.md` +- `docs/design-decisions.md` +- `docs/toolchain-links.md` +- `C:\Users\dohertj2\Desktop\mxaccess\docs\MXAccess-Public-API.md` + +## Milestone: mxaccess-worker-foundation + +Goal: create the worker executable, connect to the gateway pipe, and report +ready from a functioning STA runtime. + +### Issue: Scaffold Worker Project + +Labels: `area:worker`, `type:infra`, `priority:p0` + +Deliverables: + +- create `src/MxGateway.Worker`, +- target `.NET Framework 4.8`, +- platform target `x86`, +- reference generated worker contracts, +- reference `ArchestrA.MXAccess.dll`, +- create `src/MxGateway.Worker.Tests`, +- document MSBuild command from `docs/toolchain-links.md`. + +Acceptance criteria: + +- worker builds as x86, +- worker tests run, +- MXAccess interop reference exists only inside worker boundary, +- gateway project does not reference MXAccess. + +Tests: + +- worker build, +- worker test project compile. 
+ +### Issue: Implement Worker Bootstrap And Options + +Labels: `area:worker`, `type:feature`, `priority:p0` + +Deliverables: + +- parse `--session-id`, +- parse `--pipe-name`, +- parse `--protocol-version`, +- read `MXGATEWAY_WORKER_NONCE`, +- configure minimal structured logging, +- redact nonce and secrets. + +Acceptance criteria: + +- missing required arguments fail fast, +- invalid protocol version fails fast, +- nonce is never logged, +- bootstrap returns structured exit codes. + +Tests: + +- parser tests, +- missing/invalid values, +- redaction. + +### Issue: Implement Pipe Client And Frame Protocol + +Labels: `area:worker`, `area:contracts`, `type:feature`, `priority:p0` + +Deliverables: + +- connect to named pipe, +- frame reader/writer, +- envelope validation, +- `WorkerHello`, +- `GatewayHello` validation, +- `WorkerReady`, +- `WorkerFault`. + +Acceptance criteria: + +- session id, protocol, and nonce are validated before MXAccess creation, +- protocol mismatch fails session, +- malformed frames fault worker, +- all pipe writes go through one writer. + +Tests: + +- frame round-trip, +- wrong session/protocol/nonce, +- malformed frame, +- writer serialization. + +### Issue: Implement STA Runtime And Message Pump + +Labels: `area:worker`, `type:feature`, `priority:p0` + +Deliverables: + +- dedicated STA thread, +- COM initialization on STA, +- command queue wake event, +- `MsgWaitForMultipleObjectsEx` loop, +- `PeekMessage`/`TranslateMessage`/`DispatchMessage`, +- last STA activity timestamp, +- clean thread shutdown. + +Acceptance criteria: + +- commands execute on STA thread, +- pump continues while idle, +- shutdown exits thread, +- watchdog sees STA activity. + +Tests: + +- fake command executes on STA, +- queue wake, +- shutdown, +- watchdog timestamp. 
+ +### Issue: Create MXAccess COM Object On STA + +Labels: `area:worker`, `type:feature`, `priority:p0` + +Deliverables: + +- instantiate `ArchestrA.MxAccess.LMXProxyServerClass`, +- record CLSID/ProgID/interoperability info, +- attach base event handlers, +- send `WorkerReady` only after COM creation succeeds, +- structured fault on COM creation failure. + +Acceptance criteria: + +- COM creation happens on STA only, +- gateway receives ready with worker info, +- failure includes HRESULT/exception where available, +- raw COM object never crosses threads. + +Tests: + +- fake COM factory tests, +- COM creation failure mapping, +- worker info mapping. + +Live tests: + +- opt-in live COM creation on installed MXAccess machine. + +## Milestone: mxaccess-worker-parity-slice + +Goal: implement first end-to-end command/event slice through real MXAccess. + +### Issue: Implement STA Command Dispatcher + +Labels: `area:worker`, `type:feature`, `priority:p0` + +Deliverables: + +- `StaCommand` model, +- command queue, +- one-at-a-time execution, +- command reply creation, +- cancellation before command starts, +- late-reply behavior after gateway timeout/cancel. + +Acceptance criteria: + +- command order is preserved, +- exceptions convert to command replies, +- current command correlation appears in heartbeat, +- shutdown rejects new commands. + +Tests: + +- order, +- exception mapping, +- cancellation-before-start, +- shutdown rejection. + +### Issue: Implement Register And Unregister + +Labels: `area:worker`, `type:feature`, `priority:p0` + +Deliverables: + +- `Register`, +- `Unregister`, +- server handle tracking, +- HRESULT/exception capture. + +Acceptance criteria: + +- returned MXAccess server handle is preserved, +- invalid unregister behavior is preserved, +- registry state is updated for diagnostics and cleanup only. + +Tests: + +- fake MXAccess register/unregister, +- invalid handle mapping, +- registry updates. + +Live tests: + +- real `Register`/`Unregister`. 
+ +### Issue: Implement AddItem, AddItem2, RemoveItem + +Labels: `area:worker`, `type:feature`, `priority:p0` + +Deliverables: + +- `AddItem`, +- `AddItem2`, +- `RemoveItem`, +- item handle tracking, +- context string preservation, +- invalid/cross-server handle behavior preservation. + +Acceptance criteria: + +- returned item handles are not rewritten, +- context is passed exactly to MXAccess, +- invalid handles preserve HRESULT/status/exception shape. + +Tests: + +- fake item lifecycle, +- context mapping, +- invalid/cross-server cases. + +Live tests: + +- real `AddItem`, +- real `AddItem2("TestInt", "TestChildObject")`. + +### Issue: Implement Advise, UnAdvise, AdviseSupervisory + +Labels: `area:worker`, `type:feature`, `priority:p0` + +Deliverables: + +- advise command handlers, +- advise state tracking, +- plain and supervisory methods, +- unadvise cleanup. + +Acceptance criteria: + +- calls execute on STA, +- advise state is tracked for cleanup, +- plain and supervisory methods remain distinct commands. + +Tests: + +- fake advise/unadvise, +- cleanup state, +- invalid handle mapping. + +Live tests: + +- advise known tag and observe first event where provider state allows. + +### Issue: Implement Event Sink And Event Queue + +Labels: `area:worker`, `type:feature`, `priority:p0` + +Deliverables: + +- handlers for `OnDataChange`, +- handlers for `OnWriteComplete`, +- handlers for `OperationComplete`, +- handlers for `OnBufferedDataChange`, +- monotonic worker event sequence, +- bounded outbound event queue, +- fail-fast overflow. + +Acceptance criteria: + +- events are enqueued, not pipe-written on STA, +- order is preserved, +- `OperationComplete` is not synthesized, +- buffered events preserve raw metadata if conversion is incomplete, +- overflow faults session. + +Tests: + +- fake event conversion, +- ordering, +- overflow, +- no synthetic operation complete. + +Live tests: + +- real `OnDataChange` and `OnWriteComplete` where provider emits them. 
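The event queue behavior described in the event sink issue above (monotonic sequence, bounded capacity, fail-fast overflow) might be sketched as follows. Python is used for brevity only; the worker itself targets C#. `OutboundEventQueue`, `SequencedEvent`, and `EventQueueOverflow` are hypothetical names, and the payload is treated as opaque.

```python
import threading
from collections import deque
from dataclasses import dataclass


class EventQueueOverflow(Exception):
    """Raised when the queue is full; the caller is expected to fault the session."""


@dataclass(frozen=True)
class SequencedEvent:
    sequence: int    # monotonic worker event sequence, starting at 1
    family: str      # e.g. "OnDataChange"
    payload: object  # opaque converted event data


class OutboundEventQueue:
    """Bounded FIFO: the event handler enqueues, the pipe writer dequeues."""

    def __init__(self, capacity: int) -> None:
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self._capacity = capacity
        self._items = deque()
        self._next_sequence = 1
        self._lock = threading.Lock()

    def enqueue(self, family: str, payload: object) -> SequencedEvent:
        with self._lock:
            if len(self._items) >= self._capacity:
                # Fail fast rather than drop or block: overflow is a fault.
                raise EventQueueOverflow("outbound event queue is full")
            event = SequencedEvent(self._next_sequence, family, payload)
            self._next_sequence += 1
            self._items.append(event)
            return event

    def dequeue(self):
        """Return the oldest event, or None when the queue is empty."""
        with self._lock:
            return self._items.popleft() if self._items else None
```

Assigning the sequence number inside the same lock as the append keeps sequence order and queue order identical, which is what lets the gateway detect gaps.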
+ +### Issue: Implement Value Conversion + +Labels: `area:worker`, `type:feature`, `priority:p0` + +Deliverables: + +- scalar `VARIANT` conversion, +- SAFEARRAY conversion, +- raw fallback metadata, +- timestamp conversion, +- array rank/dimension metadata where available. + +Acceptance criteria: + +- bool/int/float/double/string/time conversions work, +- arrays convert for supported types, +- unknown values keep raw metadata, +- credential-bearing values are not logged. + +Tests: + +- scalar conversion matrix, +- array conversion matrix, +- null/empty cases, +- raw fallback. + +### Issue: Implement MXSTATUS_PROXY And HRESULT Conversion + +Labels: `area:worker`, `type:feature`, `priority:p0` + +Deliverables: + +- `MXSTATUS_PROXY[]` conversion, +- category/source/detail preservation, +- success field preservation, +- HRESULT extraction from COM exceptions, +- safe diagnostic messages. + +Acceptance criteria: + +- status arrays are not collapsed, +- raw fields preserved, +- exception HRESULT is captured, +- completion-only status bytes are raw unless exact mapping is proven. + +Tests: + +- status struct conversion, +- exception/HRESULT mapping, +- raw fallback metadata. + +### Issue: Implement Heartbeat And Watchdog + +Labels: `area:worker`, `type:feature`, `priority:p1` + +Deliverables: + +- periodic heartbeat messages, +- last STA activity, +- pending command count, +- current command correlation id, +- event queue depth, +- watchdog warnings. + +Acceptance criteria: + +- gateway receives updates, +- stuck command is visible in heartbeat, +- high queue/stale activity warnings are observable. + +Tests: + +- heartbeat payload, +- stale activity, +- queue depth. 
+ +### Issue: Implement Graceful Shutdown + +Labels: `area:worker`, `type:feature`, `priority:p0` + +Deliverables: + +- handle `WorkerShutdown`, +- reject new commands, +- let current command finish within timeout, +- best-effort `UnAdvise`, `RemoveItem`, `Unregister`, +- detach event handlers, +- release COM object, +- exit process. + +Acceptance criteria: + +- cleanup order follows design, +- cleanup failures are logged but do not hang shutdown, +- gateway can kill after timeout, +- worker exits with success on graceful shutdown. + +Tests: + +- fake cleanup order, +- cleanup failure, +- command in progress, +- shutdown timeout. + +## Milestone: integration-and-parity + +Goal: prove gateway plus worker behavior against installed MXAccess. + +### Issue: Worker Live MXAccess Smoke Test + +Labels: `area:worker`, `area:tests`, `type:test`, `priority:p0` + +Deliverables: + +- opt-in live test harness, +- open gateway session, +- spawn worker, +- create MXAccess COM, +- `Register`, +- `AddItem`, +- `Advise`, +- wait bounded time for data/status, +- `CloseSession`. + +Acceptance criteria: + +- test skips without explicit environment variable, +- test cleans up worker even on failure, +- logs include enough data for parity debugging. + +### Issue: Parity Fixture Matrix + +Labels: `area:worker`, `area:tests`, `type:test`, `priority:p1` + +Deliverables: + +- fixture list based on `C:\Users\dohertj2\Desktop\mxaccess\captures`, +- scenarios for invalid handles, write statuses, secured writes, add-item + context, and buffered registration, +- comparison format for direct MXAccess vs gateway. + +Acceptance criteria: + +- each public method has planned parity fixture or documented gap, +- gateway results preserve HRESULT/status/value/event shape. 
+ diff --git a/docs/mxaccess-worker-instance-design.md b/docs/mxaccess-worker-instance-design.md new file mode 100644 index 0000000..74fe415 --- /dev/null +++ b/docs/mxaccess-worker-instance-design.md @@ -0,0 +1,636 @@ +# MXAccess Worker Instance Detailed Design + +## Purpose + +An MXAccess worker instance is the compatibility boundary around one installed +MXAccess COM object. It runs as a disposable .NET Framework 4.8 x86 process, +owns one dedicated STA thread, pumps Windows/COM messages, executes MXAccess +commands on that STA, and forwards MXAccess events back to the gateway. + +The worker's job is not to make MXAccess nicer. Its job is to preserve direct +MXAccess behavior while making that behavior available to modern clients through +the gateway. + +## Runtime + +- Target runtime: .NET Framework 4.8. +- Language: C#. +- Platform target: x86 by default. +- Process lifetime: one worker per gateway session. +- Public network listeners: none. +- Gateway IPC: one named pipe with protobuf-framed messages. +- COM apartment: one dedicated STA thread. + +Style guides: + +- [C# Style Guide](./style-guides/CSharpStyleGuide.md) +- [Protobuf Style Guide](./style-guides/ProtobufStyleGuide.md) + +## Responsibilities + +The worker owns: + +- connection to the gateway pipe, +- protocol hello and readiness reporting, +- STA thread creation and teardown, +- COM initialization on the STA, +- MXAccess COM object creation, +- MXAccess event sink wiring, +- command dispatch on the STA, +- MXAccess handle and advise state tracking, +- value/status/HRESULT capture, +- conversion to worker protobuf DTOs, +- event sequencing, +- heartbeat reporting, +- graceful shutdown. + +The worker does not own: + +- public gRPC API, +- client authentication, +- cross-session routing, +- worker process supervision, +- remote TLS, +- policy decisions for other sessions. 
+ +## Process Bootstrap + +Expected command-line arguments: + +```text +--session-id +--pipe-name +--protocol-version +``` + +Expected protected environment values: + +```text +MXGATEWAY_WORKER_NONCE= +MXGATEWAY_WORKER_LOG_CONTEXT= +``` + +Startup sequence: + +1. Parse command-line arguments. +2. Configure minimal logging. +3. Validate required values are present. +4. Connect to the gateway named pipe. +5. Exchange `WorkerHello` and `GatewayHello`. +6. Validate protocol version, session id, and nonce. +7. Start the STA runtime. +8. Create the MXAccess COM object on the STA. +9. Attach MXAccess event handlers on the STA. +10. Send `WorkerReady`. +11. Start pipe read, pipe write, heartbeat, and shutdown coordination loops. + +If validation fails before MXAccess creation, exit quickly with a non-zero exit +code. If MXAccess creation fails, send `WorkerFault` when possible and exit. + +## Internal Components + +```text +MxGateway.Worker + Program + Bootstrap + WorkerOptions + WorkerHost + Ipc + PipeClient + FrameReader + FrameWriter + WorkerProtocol + Sta + StaRuntime + StaCommandQueue + MessagePump + StaWatchdog + MxAccess + MxAccessSession + MxAccessCommandDispatcher + MxAccessEventSink + MxAccessHandleRegistry + Conversion + VariantConverter + SafeArrayConverter + StatusProxyConverter + HResultMapper +``` + +## Threading Model + +```text +main thread + -> parse args + -> configure host + -> coordinate shutdown + +pipe reader thread/task + -> read WorkerEnvelope frames + -> validate protocol + -> enqueue commands or control messages + +pipe writer thread/task + -> serialize WorkerEnvelope frames + -> write replies, events, heartbeats, faults + +STA thread + -> CoInitializeEx(APARTMENTTHREADED) + -> create MXAccess COM object + -> attach event handlers + -> pump Windows/COM messages + -> execute queued commands + -> detach events and release COM on shutdown + +watchdog/heartbeat task + -> observe STA responsiveness + -> send heartbeat or fault +``` + +No MXAccess 
method may execute outside the STA thread. Do not use `Task.Run` +around COM calls. Do not let event handlers perform pipe writes. + +## STA Runtime + +The STA runtime is the most important part of the worker. + +Startup: + +1. Create a dedicated `Thread`. +2. Set apartment state to `ApartmentState.STA`. +3. Start the thread. +4. Inside the thread, initialize COM. +5. Create the MXAccess COM object. +6. Attach event handlers. +7. Signal ready to the worker host. +8. Enter the message pump. + +Shutdown: + +1. Mark the command queue as completing. +2. Drain or reject pending commands according to shutdown mode. +3. Optionally issue MXAccess cleanup calls for active handles. +4. Detach event handlers. +5. Release COM references. +6. Uninitialize COM. +7. Exit the thread. + +## Message Pump + +The STA must pump Windows messages while also processing queued commands. A +blocking queue that prevents message pumping is not acceptable. + +Required loop shape: + +```text +while not shutdown: + while command queue has work: + execute one command on STA + + MsgWaitForMultipleObjectsEx( + command_event, + timeout, + QS_ALLINPUT, + MWMO_INPUTAVAILABLE) + + while PeekMessage: + TranslateMessage + DispatchMessage +``` + +The command queue should signal a Win32 event or equivalent wait handle so the +STA can wake without busy-waiting. + +The loop should update a heartbeat timestamp after: + +- successfully pumping messages, +- starting a command, +- finishing a command, +- processing an MXAccess event. 
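The loop shape above can be modelled without Win32 to check its ordering and wake behavior. This is a platform-neutral sketch, not the worker implementation: a `threading.Event` stands in for the Win32 command event, the `pump_messages` callable stands in for the `PeekMessage` / `TranslateMessage` / `DispatchMessage` drain, and `StaLoopModel` is a hypothetical name. A real worker would be C# and would wait with `MsgWaitForMultipleObjectsEx` as described.

```python
import queue
import threading
import time


class StaLoopModel:
    """Model of the STA loop shape: drain commands, wait, pump, repeat."""

    def __init__(self, pump_messages, wait_timeout=0.05):
        self._commands = queue.Queue()
        self._wake = threading.Event()        # stands in for the command event
        self._shutdown = threading.Event()
        self._pump_messages = pump_messages   # stands in for the message drain
        self._wait_timeout = wait_timeout
        self.last_activity = None             # heartbeat timestamp

    def post(self, command):
        """Queue a callable and wake the loop (safe from any thread)."""
        self._commands.put(command)
        self._wake.set()

    def request_shutdown(self):
        self._shutdown.set()
        self._wake.set()

    def _touch(self):
        self.last_activity = time.monotonic()

    def run(self):
        while not self._shutdown.is_set():
            # Clear before draining so a post that races with the drain
            # still leaves the event set and the next wait returns at once.
            self._wake.clear()
            while True:
                try:
                    command = self._commands.get_nowait()
                except queue.Empty:
                    break
                self._touch()   # starting a command
                command()       # one command at a time, in arrival order
                self._touch()   # finishing a command
            # Sleep until a command arrives or the timeout elapses.
            self._wake.wait(self._wait_timeout)
            self._pump_messages()
            self._touch()       # messages were pumped
```

The model leaves out the heartbeat update for MXAccess events, which in the real worker would happen inside the event handlers that run during message dispatch.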
+ +## COM Creation + +The MXAccess analysis source at `C:\Users\dohertj2\Desktop\mxaccess` identifies +the installed COM target: + +- interop assembly: + `C:\Program Files (x86)\ArchestrA\Framework\Bin\ArchestrA.MXAccess.dll` +- assembly identity: + `ArchestrA.MxAccess, Version=3.2.0.0, PublicKeyToken=23106a86e706d0ae` +- COM class: + `ArchestrA.MxAccess.LMXProxyServerClass` +- CLSID: + `{C30B52F5-2CB5-4760-AF0A-3A344A7EB5DC}` +- ProgID: + `LMXProxy.LMXProxyServer.1` +- version-independent ProgID: + `LMXProxy.LMXProxyServer` +- registered server: + `C:\Program Files (x86)\ArchestrA\Framework\Bin\LmxProxy.dll` +- registry view: + `HKCR\Wow6432Node\CLSID\{C30B52F5-2CB5-4760-AF0A-3A344A7EB5DC}` +- threading model: + `Apartment` + +The worker should reference the interop assembly and instantiate +`LMXProxyServerClass` on the dedicated STA thread. Keep the ProgID and assembly +path configurable for diagnostics, but this COM class is the v1 default. + +Creation rules: + +- Create COM object only on the STA. +- Attach event handlers only on the STA. +- Keep the COM reference private to the STA runtime. +- Never marshal the raw COM object to pipe reader/writer threads. +- Capture COM creation HRESULT or exception details. + +If COM creation fails, the worker should send a structured fault with: + +- fault category, +- exception type, +- HRESULT when available, +- COM class or ProgID attempted, +- worker process id, +- session id. + +## Event Sink + +The worker must subscribe to every public MXAccess event family: + +- `OnDataChange` +- `OnWriteComplete` +- `OperationComplete` +- `OnBufferedDataChange` + +Forward these event families only when the native MXAccess COM object raises +them. Do not synthesize `OperationComplete` from write completion or command +status. `OnBufferedDataChange` must be represented in the protocol now, but +multi-sample payload conversion should remain capture-validated; preserve raw +metadata whenever conversion is incomplete. 
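As a rough illustration of the forwarding and raw-metadata requirement above, the sketch below gives each native event a monotonic sequence number and falls back to raw metadata when conversion is incomplete. It is a language-neutral model written in Python, not the C# worker code. `EventSink`, `WorkerEventModel`, the `convert` callable, and the metadata keys are all hypothetical names chosen for the example.

```python
import itertools
from dataclasses import dataclass, field


@dataclass
class WorkerEventModel:
    sequence: int
    family: str
    value: object = None
    conversion_complete: bool = True
    raw_metadata: dict = field(default_factory=dict)


class EventSink:
    """Forwards only events that the native object actually raised."""

    def __init__(self, outbound, convert):
        self._outbound = outbound          # any object with append()
        self._convert = convert            # (family, raw_args) -> value
        self._sequence = itertools.count(1)

    def on_native_event(self, family, raw_args):
        sequence = next(self._sequence)
        try:
            value = self._convert(family, raw_args)
            event = WorkerEventModel(sequence, family, value=value)
        except Exception as exc:
            # Conversion is incomplete: keep shape metadata, not a guess.
            event = WorkerEventModel(
                sequence,
                family,
                conversion_complete=False,
                raw_metadata={
                    "arg_count": len(raw_args),
                    "arg_types": [type(arg).__name__ for arg in raw_args],
                    "error_type": type(exc).__name__,
                },
            )
        # Enqueue and return quickly; no pipe writes happen here.
        self._outbound.append(event)
```

Because the sink only reacts to calls from the native object, nothing in it can synthesize an `OperationComplete` from a write completion or a command status.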
+ +Event handling rules: + +- Event handlers are expected to run on the STA. +- Assign a monotonic worker event sequence. +- Convert event args to `WorkerEvent`. +- Include value, quality, timestamp, handles, status arrays, and raw status + details when available. +- Preserve raw event payload metadata for unsupported buffered or + completion-only shapes. +- Enqueue to the outbound event queue. +- Return quickly to preserve message pumping. + +If event conversion throws, catch it inside the event handler, enqueue a +structured `WorkerFault` or diagnostic event, and keep the worker alive only if +the fault policy allows it. + +## Command Queue + +The pipe reader converts `WorkerCommand` messages into `StaCommand` entries. + +Each entry should include: + +- correlation id, +- method name, +- method-specific request payload, +- enqueue timestamp, +- cancellation marker, +- reply completion path. + +The STA command dispatcher: + +1. Dequeues one command. +2. Checks whether shutdown has started. +3. Calls the matching MXAccess method. +4. Captures return values, out parameters, status arrays, and HRESULT. +5. Converts results to `WorkerCommandReply`. +6. Enqueues the reply to the pipe writer. + +The STA should execute one command at a time. MXAccess command ordering must be +preserved for one worker. + +## Command Dispatch Surface + +Phase 1 commands: + +- `Register` +- `Unregister` +- `AddItem` +- `RemoveItem` + +Phase 2 event commands: + +- `Advise` +- `UnAdvise` +- `AdviseSupervisory` + +Full surface: + +- `AddItem2` +- `AddBufferedItem` +- `SetBufferedUpdateInterval` +- `Suspend` +- `Activate` +- `Write` +- `Write2` +- `WriteSecured` +- `WriteSecured2` +- `AuthenticateUser` +- `ArchestrAUserToId` + +Diagnostics: + +- `Ping` +- `GetSessionState` +- `GetWorkerInfo` +- `DrainEvents` +- `ShutdownWorker` + +Implement method-specific dispatch instead of a generic string method invoker. +Parity tests need stable command-specific request and reply shapes. 
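To make the contrast with a generic string invoker concrete, here is a small sketch of method-specific dispatch, assuming one request type and one reply type per command. It is written in Python for brevity rather than the worker's C#. The dataclass names, `CommandDispatcher`, and `FakeMxAccess` are hypothetical, and the fake only imitates the shape of `Register` and `AddItem`; it is not the real COM object.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RegisterRequest:
    client_name: str


@dataclass(frozen=True)
class RegisterReply:
    server_handle: int


@dataclass(frozen=True)
class AddItemRequest:
    server_handle: int
    item_name: str


@dataclass(frozen=True)
class AddItemReply:
    item_handle: int


class CommandDispatcher:
    """Routes each request type to its own handler; no name-based lookup."""

    def __init__(self, mx):
        self._mx = mx
        self._handlers = {
            RegisterRequest: self._register,
            AddItemRequest: self._add_item,
        }

    def dispatch(self, request):
        handler = self._handlers.get(type(request))
        if handler is None:
            raise TypeError(f"unsupported command type: {type(request).__name__}")
        return handler(request)

    def _register(self, request):
        return RegisterReply(self._mx.Register(request.client_name))

    def _add_item(self, request):
        handle = self._mx.AddItem(request.server_handle, request.item_name)
        return AddItemReply(handle)


class FakeMxAccess:
    """Hypothetical stand-in used only to exercise the dispatcher."""

    def __init__(self):
        self._next_handle = 1

    def _allocate(self):
        handle = self._next_handle
        self._next_handle += 1
        return handle

    def Register(self, client_name):
        return self._allocate()

    def AddItem(self, server_handle, item_name):
        return self._allocate()
```

With this shape, a parity test can assert on `RegisterReply.server_handle` or `AddItemReply.item_handle` directly, and an unknown command fails before it reaches the COM object.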
+ +## Handle Registry + +The worker should track MXAccess state for diagnostics and cleanup, while still +treating MXAccess as the authority. + +Suggested tracked state: + +- registered server handles, +- item handles, +- item names and context, +- server handle for each item, +- advise state, +- buffered item state, +- authenticated user ids if needed, +- last command touching each handle. + +Rules: + +- Do not invent handles. +- Do not rewrite handles returned by MXAccess. +- Preserve invalid-handle behavior from MXAccess. +- Preserve cross-server handle behavior from MXAccess. +- Use registry state for cleanup and diagnostics, not semantic correction. + +## Value Conversion + +`VariantConverter` should convert COM values into the protobuf `MxValue` union. + +Supported scalar projections: + +- bool, +- int32, +- int64, +- float, +- double, +- string, +- timestamp, +- raw fallback. + +Supported arrays: + +- bool array, +- int32 array, +- float array, +- double array, +- string array, +- timestamp array, +- raw fallback. + +Rules: + +- Preserve null and empty values distinctly when MXAccess exposes a distinction. +- Preserve array rank and dimensions when available. +- Preserve original variant type metadata. +- If conversion is lossy, include the best typed value plus raw diagnostic + metadata. +- Do not throw away values just because they are awkward. + +Credential-bearing values must not be logged. + +## Status And HRESULT Capture + +`MXSTATUS_PROXY` arrays must be represented explicitly. Do not collapse status +arrays into a single success flag. + +For every command reply, capture: + +- protocol success/failure, +- method name, +- correlation id, +- COM HRESULT if available, +- thrown exception HRESULT if available, +- MXAccess return value if any, +- method-specific out parameters, +- status array, +- diagnostic message safe for logs. 
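The capture list above could map onto a reply record along these lines. This is a sketch under stated assumptions: `ComCallError` is a hypothetical stand-in for a COM exception that carries an HRESULT, status entries are kept as opaque per-element values because this document does not define the `MXSTATUS_PROXY` fields, and `succeeded` is assumed to mean that the call returned without throwing. Python is used for brevity; the worker is C#.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


class ComCallError(Exception):
    """Hypothetical stand-in for a COM exception carrying an HRESULT."""

    def __init__(self, hresult: int, message: str):
        super().__init__(message)
        self.hresult = hresult


@dataclass
class CommandReplyModel:
    correlation_id: str
    method: str
    succeeded: bool
    hresult: Optional[int] = None       # None when interop exposes no HRESULT
    return_value: Any = None
    out_params: dict = field(default_factory=dict)
    status: list = field(default_factory=list)  # one entry per status element
    diagnostic: str = ""                # must stay safe for logs


def run_command(correlation_id, method, call):
    """Run `call` and capture its outcome; `call` returns (value, outs, status)."""
    try:
        return_value, out_params, status = call()
    except ComCallError as exc:
        # Report only the exception type: messages may carry sensitive data.
        return CommandReplyModel(
            correlation_id,
            method,
            succeeded=False,
            hresult=exc.hresult,
            diagnostic=type(exc).__name__,
        )
    return CommandReplyModel(
        correlation_id,
        method,
        succeeded=True,
        return_value=return_value,
        out_params=dict(out_params),
        status=list(status),            # kept whole, never collapsed to a flag
    )
```

A caller that wants a single pass/fail answer has to derive it from `status` itself, which keeps that policy out of the worker.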
+ +If a COM call throws, map the exception into a command reply instead of +crashing the worker, unless the exception indicates process corruption or the +configured policy says to fail the session. + +## Cancellation + +Worker cancellation is cooperative at the queue boundary. + +Rules: + +- If a `WorkerCancel` arrives before a command starts, mark the command + canceled and reply or drop according to protocol policy. +- If a command is already executing on the STA, do not attempt to abort the COM + call. +- When the COM call returns after gateway cancellation, send the reply only if + the gateway still wants late replies; otherwise log and discard. +- Hard cancellation is process kill by the gateway. + +## Outbound Queues + +The worker should use bounded outbound queues for replies, events, heartbeats, +and faults. + +Priority order when writing: + +1. faults, +2. command replies, +3. shutdown acknowledgements, +4. heartbeats, +5. events. + +Event overflow policy defaults to fail-fast for parity testing. If the event +queue fills: + +1. Capture overflow metrics. +2. Send `WorkerFault` if possible. +3. Stop accepting new commands. +4. Let the gateway close or kill the worker. + +Production coalescing may be added later, but it must be explicit and tested. +Do not drop or coalesce events in v1. + +## Heartbeat And Watchdog + +The worker heartbeat should prove that: + +- pipe writer is alive, +- worker host is alive, +- STA has recently pumped or completed work. + +Heartbeat payload should include: + +- worker process id, +- session id, +- current state, +- last STA activity timestamp, +- pending command count, +- outbound event queue depth, +- event sequence, +- current command correlation id if any. + +The STA watchdog should warn when: + +- one command exceeds its expected duration, +- the STA has not pumped messages within the heartbeat grace period, +- event queue depth remains high. 
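The three warning conditions above can be expressed as a pure check over a snapshot of worker state, which makes them easy to unit test without a live STA. The sketch below is illustrative only: `StaSnapshot`, `WatchdogLimits`, the warning strings, and every default threshold are hypothetical, and it evaluates a single sample, so a sustained-depth rule for the event queue would need history that this sketch does not keep.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class StaSnapshot:
    now: float                                 # monotonic seconds
    last_sta_activity: float                   # last heartbeat timestamp
    current_command_started: Optional[float]   # None when the STA is idle
    event_queue_depth: int


@dataclass(frozen=True)
class WatchdogLimits:
    command_seconds: float = 5.0               # assumed expected duration
    heartbeat_grace_seconds: float = 10.0      # assumed grace period
    event_queue_high_water: int = 1000         # assumed high-water mark


def watchdog_warnings(snapshot, limits=WatchdogLimits()):
    """Return the warning names that apply to this snapshot."""
    warnings = []
    started = snapshot.current_command_started
    if started is not None and snapshot.now - started > limits.command_seconds:
        warnings.append("command_overrun")
    if snapshot.now - snapshot.last_sta_activity > limits.heartbeat_grace_seconds:
        warnings.append("sta_not_pumping")
    if snapshot.event_queue_depth >= limits.event_queue_high_water:
        warnings.append("event_queue_high")
    return warnings
```

The function only reports; it takes no action on the STA.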
+ +The worker can report the problem, but the gateway owns the final kill decision. + +## Shutdown + +Graceful shutdown sequence: + +1. Pipe reader receives `WorkerShutdown`. +2. Worker host marks shutdown requested. +3. Reject new commands. +4. Let current STA command finish if within timeout. +5. Optionally run MXAccess cleanup: + - `UnAdvise`, + - `RemoveItem`, + - `Unregister`. +6. Detach event handlers. +7. Release COM object until reference count reaches zero when possible. +8. Stop pipe reader and writer. +9. Exit process with success code. + +If shutdown wedges, the gateway kills the process. The worker should be written +so process kill does not corrupt other sessions. + +## Fault Handling + +Worker fault categories: + +- `InvalidArguments` +- `GatewayAuthenticationFailed` +- `ProtocolMismatch` +- `ProtocolViolation` +- `PipeDisconnected` +- `MxAccessCreationFailed` +- `MxAccessCommandFailed` +- `MxAccessEventConversionFailed` +- `StaHung` +- `QueueOverflow` +- `ShutdownTimeout` + +Fault payload should include: + +- category, +- session id, +- correlation id when command-specific, +- command method when command-specific, +- HRESULT when available, +- exception type when available, +- safe diagnostic message. + +Do not include raw credentials or full secured-write values. + +## Security + +The worker should trust only the launching gateway after validating: + +- expected session id, +- expected protocol version, +- nonce, +- pipe identity where available. + +It should not expose any network listener. It should not accept commands from +arbitrary local processes. + +Credential-bearing commands must keep credential data out of: + +- command line, +- logs, +- metrics labels, +- exception messages, +- crash dumps when avoidable. 
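One way to keep credential data out of logs and metrics labels is to redact by field name before anything reaches a sink. The sketch below assumes diagnostic data is available as nested dictionaries. The field names in `SENSITIVE_FIELDS`, the `REDACTED` marker, and `redact_for_log` are hypothetical and would need to match the real protobuf contract.

```python
REDACTED = "[redacted]"

# Hypothetical field names; align these with the actual contract.
SENSITIVE_FIELDS = frozenset({"password", "api_key", "nonce", "secured_value"})


def redact_for_log(fields):
    """Return a copy of `fields` with sensitive values replaced."""
    safe = {}
    for key, value in fields.items():
        if key.lower() in SENSITIVE_FIELDS:
            safe[key] = REDACTED
        elif isinstance(value, dict):
            safe[key] = redact_for_log(value)   # handle nested payloads
        else:
            safe[key] = value
    return safe
```

Returning a copy rather than mutating the input means the original command payload can still be passed to MXAccess unchanged after the log line is written.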
+ +## Observability + +Worker logs should include: + +- startup arguments except secrets, +- protocol version, +- gateway handshake result, +- MXAccess COM creation result, +- command start/end with correlation id, +- HRESULT/status summary, +- event family and sequence, +- queue overflow, +- STA watchdog warnings, +- shutdown path. + +Metrics can be emitted through the gateway or exposed as worker heartbeat +fields. The worker does not need its own public metrics endpoint. + +## Testing Strategy + +Worker tests that do not require installed MXAccess: + +- frame reader/writer, +- protocol validation, +- command queue ordering, +- STA command scheduling with a fake COM object, +- message-pump wake behavior where practical, +- value conversion, +- status conversion, +- event conversion from fake event args, +- shutdown state transitions, +- queue overflow behavior. + +Live MXAccess tests: + +- COM creation on STA, +- `Register` and `Unregister`, +- `AddItem` and `RemoveItem`, +- `Advise` and one `OnDataChange`, +- write completion behavior, +- secured write behavior, +- buffered data-change behavior, +- invalid handle behavior. +- no synthesized `OperationComplete` when native MXAccess does not raise it. +- raw metadata preservation for buffered payloads that cannot yet be fully + converted. + +Live tests should be opt-in and clearly marked because they depend on installed +MXAccess COM and provider state. + +## Initial Implementation Slice + +The first worker slice should implement: + +1. Argument parsing and pipe connection. +2. Protocol hello and nonce validation. +3. STA thread startup. +4. COM initialization and MXAccess object creation. +5. Message pump with command wake event. +6. `WorkerReady`. +7. Shutdown command. +8. `Register`, `AddItem`, and `Advise`. +9. Event sink for one `OnDataChange`. +10. Basic value/status conversion. +11. Event model coverage for `OperationComplete` and `OnBufferedDataChange` + without synthesized events. +12. Fault reporting. 
+ +This slice proves the worker can preserve the core MXAccess requirements: +single-process isolation, STA ownership, message pumping, command execution, +and event delivery. diff --git a/docs/style-guides/CSharpStyleGuide.md b/docs/style-guides/CSharpStyleGuide.md new file mode 100644 index 0000000..97bc8fc --- /dev/null +++ b/docs/style-guides/CSharpStyleGuide.md @@ -0,0 +1,76 @@ +# C# Style Guide + +This guide defines C# conventions for the gateway, worker, .NET client, test +CLIs, and C# tests. + +## Baseline + +- Use the latest stable C# version supported by the target runtime. +- Enable nullable reference types in new projects. +- Treat compiler warnings as actionable. Suppress only with a narrow reason. +- Prefer file-scoped namespaces. +- Prefer `sealed` classes unless inheritance is required. +- Keep public APIs explicit and small. Do not expose generated or transport + internals through handwritten abstractions unless raw access is intentional. + +## Source Documentation + +- Maintain the existing documentation style in the file, project, and + surrounding component. +- Write comments that include business-specific or domain-specific context when + that context is available from the code, surrounding docs, or naming. +- Prefer XML documentation on public APIs when the behavior is not obvious from + the signature. +- Avoid comments that restate syntax or control flow. + +## Naming + +- Use PascalCase for public types, methods, properties, events, and enum + members. +- Use camelCase for local variables, parameters, and private fields. +- Prefix private fields with `_` only when that pattern is already established + in the project. +- Use `Async` suffixes for methods that return `Task`, `Task<T>`, + `ValueTask`, or `ValueTask<T>`. +- Keep names aligned with MXAccess terms: `MxStatusProxy`, `ServerHandle`, + `ItemHandle`, `HResult`, and event family names should match the contract.
+ +## Async And Cancellation + +- Accept `CancellationToken` on public async methods that perform I/O or wait. +- Pass cancellation tokens through to called APIs. +- Do not use `Task.Run` to hide blocking COM calls. MXAccess calls belong on + the worker STA. +- Use `ConfigureAwait(false)` in reusable libraries. It is optional in ASP.NET + Core request handling where no synchronization context exists. +- Dispose async resources with `await using` when the type implements + `IAsyncDisposable`. + +## Errors + +- Preserve protocol, gateway, worker, COM HRESULT, and MXAccess status details. +- Use typed exceptions at API boundaries, but prefer result DTOs when callers + need method-specific MXAccess output. +- Do not log API keys, passwords, secured write values, or full tag values by + default. +- Include correlation id and session id in diagnostics when available. + +## Protobuf And Generated Code + +- Do not hand-edit generated protobuf or gRPC files. +- Keep generated code in a clearly named `Generated` namespace or directory. +- Keep mapping code outside gRPC handlers so it can be unit tested. + +## Formatting + +- Run `dotnet format` when a solution or project is available. +- Use four spaces for indentation. +- Keep one public type per file unless a small nested type is clearer. +- Avoid region-heavy files. Split large responsibilities into focused types. + +## Tests + +- Use test names that describe behavior, condition, and result. +- Prefer fake workers, fake transports, or fake gRPC services over live + MXAccess in unit tests. +- Mark live MXAccess tests as opt-in integration tests. diff --git a/docs/style-guides/GoStyleGuide.md b/docs/style-guides/GoStyleGuide.md new file mode 100644 index 0000000..75c2d94 --- /dev/null +++ b/docs/style-guides/GoStyleGuide.md @@ -0,0 +1,68 @@ +# Go Style Guide + +This guide defines Go conventions for the MXAccess Gateway Go client module, +test CLI, and tests. + +## Baseline + +- Use idiomatic Go and keep package APIs small. 
+- Run `gofmt` on every changed Go file. +- Run `go vet` for non-trivial changes when the module is available. +- Keep generated protobuf code under `internal/generated` unless the public API + intentionally exposes it. + +## Source Documentation + +- Maintain the existing documentation style in the file, package, and + surrounding component. +- Write comments that include business-specific or domain-specific context when + that context is available from the code, surrounding docs, or naming. +- Document exported names when they are part of the public client API. +- Avoid comments that restate syntax or control flow. + +## Packages + +- Use short, lowercase package names without underscores. +- Keep the reusable library separate from CLI code. +- Keep generated code separate from handwritten wrappers. +- Prefer internal packages for implementation details that callers should not + import. + +## Naming + +- Use exported names only for public API. +- Use initialisms consistently: `APIKey`, `TLS`, `HTTP`, `ID`. +- Keep MXAccess terms explicit: `ServerHandle`, `ItemHandle`, `MxStatusProxy`, + and `HResult`. +- Avoid generic helper names such as `Do` or `Process` for command-specific + MXAccess behavior. + +## Context And Cancellation + +- Accept `context.Context` as the first parameter for operations that can block. +- Do not store contexts in structs. +- Respect context cancellation, but document that canceling a client call does + not abort an in-flight worker COM call. +- Close streams and connections deterministically. + +## Errors + +- Return errors instead of panicking. +- Wrap errors with useful context using `%w`. +- Support `errors.Is` and `errors.As` for typed gateway, command, and MXAccess + errors. +- Preserve raw command replies on command errors when available. +- Redact API keys and credential-bearing values in error messages. + +## Concurrency + +- Avoid unbounded goroutines and unbounded channels. +- Close channels exactly once from the sending side. 
+- Propagate stream errors through explicit result types such as `EventResult`. +- Use the race detector for concurrency-heavy changes when practical. + +## Tests + +- Use table-driven tests for conversion and error mapping. +- Use `bufconn` or fake generated clients for unit tests. +- Keep integration tests behind `MXGATEWAY_INTEGRATION=1` or build tags. diff --git a/docs/style-guides/JavaStyleGuide.md b/docs/style-guides/JavaStyleGuide.md new file mode 100644 index 0000000..b46036e --- /dev/null +++ b/docs/style-guides/JavaStyleGuide.md @@ -0,0 +1,65 @@ +# Java Style Guide + +This guide defines Java conventions for the MXAccess Gateway Java client +library, CLI, and tests. + +## Baseline + +- Target the Java version defined by the client build, with Java 21 preferred. +- Use Gradle unless the repository standardizes on Maven. +- Apply a formatter such as Spotless or Google Java Format when configured. +- Keep generated protobuf code separate from handwritten wrappers. + +## Source Documentation + +- Maintain the existing documentation style in the file, package, and + surrounding component. +- Write comments that include business-specific or domain-specific context when + that context is available from the code, surrounding docs, or naming. +- Use Javadoc for public APIs when behavior, parity constraints, or security + requirements are not obvious from the signature. +- Avoid comments that restate syntax or control flow. + +## Packages + +- Use lowercase package names under `com.dohertylan.mxgateway`. +- Keep client library code separate from CLI code. +- Keep generated protobuf classes in a generated package. +- Do not expose implementation-only transport helpers as public API. + +## Naming + +- Use `PascalCase` for classes, records, interfaces, and enums. +- Use `camelCase` for methods, fields, parameters, and local variables. +- Use `UPPER_SNAKE_CASE` for constants. 
+- Use MXAccess terms consistently: `serverHandle`, `itemHandle`, + `mxStatusProxy`, and `hResult`. + +## API Design + +- Prefer immutable options objects with builders for public configuration. +- Implement `AutoCloseable` for clients and sessions that own resources. +- Provide async methods with `CompletableFuture` where useful, but keep a + blocking API for simple CLI workflows. +- Expose raw generated protobuf messages where parity tests need them. + +## Errors + +- Use typed exceptions for gateway, authentication, authorization, session, + worker, command, and MXAccess failures. +- Preserve raw command replies in command exceptions when available. +- Redact API keys, passwords, and secured write values in `toString`, logs, and + CLI output. + +## Streaming + +- Cancel gRPC calls explicitly when callers stop consuming streams. +- Do not reorder, coalesce, or drop events in client code. +- Avoid unbounded queues in async stream helpers. + +## Tests + +- Use JUnit 5. +- Use in-process gRPC servers for unit tests. +- Keep live gateway tests behind `MXGATEWAY_INTEGRATION=1` and JUnit + assumptions. diff --git a/docs/style-guides/ProtobufStyleGuide.md b/docs/style-guides/ProtobufStyleGuide.md new file mode 100644 index 0000000..9545397 --- /dev/null +++ b/docs/style-guides/ProtobufStyleGuide.md @@ -0,0 +1,64 @@ +# Protobuf Style Guide + +This guide defines protobuf conventions for MXAccess Gateway public gRPC and +gateway-to-worker IPC contracts. + +## Baseline + +- Use `proto3`. +- Keep public gateway contracts and worker IPC contracts in separate `.proto` + files. +- Treat field numbers as permanent once released. +- Do not reuse removed field numbers or enum values. Reserve them. +- Keep generated code reproducible from checked-in `.proto` files. + +## Source Documentation + +- Maintain the existing documentation style in the `.proto` file and + surrounding contract docs. 
+- Write comments that include business-specific or domain-specific context when + that context is available from the contract, surrounding docs, or naming. +- Comment fields that carry MXAccess parity details, credential-sensitive data, + raw HRESULT/status information, or compatibility constraints. +- Avoid comments that restate field types or message nesting. + +## Naming + +- Use `snake_case` for package names, file names, field names, and enum values. +- Use `PascalCase` for message, enum, and service names. +- Prefix enum values with the enum name or a clear abbreviation to avoid name + collisions in generated languages. +- Keep MXAccess event family and command names recognizable in enum values. + +## Compatibility + +- Add fields instead of changing field meaning. +- Use `oneof` for command payloads, reply payloads, value unions, and event + bodies. +- Add explicit `UNKNOWN` or `UNSPECIFIED` enum zero values. +- Preserve raw HRESULT, MXAccess status, and diagnostic metadata in replies and + events. +- Use protocol version fields in worker IPC envelopes. + +## Field Rules + +- Do not use required semantics in application code for newly added optional + fields unless compatibility behavior is documented. +- Prefer explicit wrapper messages for repeated structured values. +- Use signed or unsigned integer types based on the actual semantic range. +- Represent timestamps with `google.protobuf.Timestamp` unless the source value + is not a real timestamp. +- Represent durations with `google.protobuf.Duration`. + +## Security + +- Do not define fields that require clients or workers to log secrets. +- Mark credential-bearing request fields clearly in comments. +- Keep raw values out of diagnostics unless an explicit redacted or opt-in path + exists. + +## Generated Code + +- Do not hand-edit generated code. +- Keep generation commands documented near the contracts project. +- Regenerate all affected language outputs when a contract changes. 
diff --git a/docs/style-guides/PythonStyleGuide.md b/docs/style-guides/PythonStyleGuide.md new file mode 100644 index 0000000..a1bd163 --- /dev/null +++ b/docs/style-guides/PythonStyleGuide.md @@ -0,0 +1,68 @@ +# Python Style Guide + +This guide defines Python conventions for the MXAccess Gateway Python package, +CLI, and tests. + +## Baseline + +- Target modern supported Python versions defined by `pyproject.toml`. +- Use `pyproject.toml` for package metadata and tool configuration. +- Use type hints for public APIs and non-trivial internal functions. +- Run the configured formatter and linter when the package is available. +- Keep generated protobuf code separate from handwritten modules. + +## Source Documentation + +- Maintain the existing documentation style in the file, package, and + surrounding component. +- Write comments that include business-specific or domain-specific context when + that context is available from the code, surrounding docs, or naming. +- Use docstrings for public classes, functions, and modules when behavior, + parity constraints, or security requirements are not obvious from the name and + type hints. +- Avoid comments that restate syntax or control flow. + +## Package Structure + +- Put library code under `src/mxgateway/`. +- Put CLI entry points under `src/mxgateway_cli/`. +- Keep generated protobuf modules under a clearly named `generated` package. +- Avoid import side effects that open channels, read environment variables, or + start background tasks. + +## Naming + +- Use `snake_case` for functions, variables, modules, and methods. +- Use `PascalCase` for classes and exceptions. +- Use `UPPER_SNAKE_CASE` for constants. +- Keep MXAccess names recognizable in public APIs, even when Python wrappers + use idiomatic method names. + +## Async + +- Make the client async-first. +- Use async context managers for clients and sessions when practical. +- Cancel gRPC streams when async iteration is canceled. 
+- Document that canceling a Python task does not abort an in-flight MXAccess + COM call inside the worker. + +## Errors + +- Use typed exceptions for transport, authentication, authorization, session, + worker, command, and MXAccess failures. +- Attach raw protobuf replies to command exceptions when available. +- Redact API keys, passwords, and secured write values in exception messages and + CLI output. + +## CLI + +- Keep CLI output deterministic for tests. +- Support JSON output for automation. +- Load API keys from explicit flags or named environment variables. Do not read + secrets implicitly during module import. + +## Tests + +- Use `pytest` and `pytest-asyncio`. +- Use fake generated stubs or an in-process test gRPC server for unit tests. +- Keep live integration tests behind `MXGATEWAY_INTEGRATION=1`. diff --git a/docs/style-guides/RustStyleGuide.md b/docs/style-guides/RustStyleGuide.md new file mode 100644 index 0000000..57db477 --- /dev/null +++ b/docs/style-guides/RustStyleGuide.md @@ -0,0 +1,65 @@ +# Rust Style Guide + +This guide defines Rust conventions for the MXAccess Gateway Rust client crate, +CLI, and tests. + +## Baseline + +- Run `cargo fmt` on every changed Rust file. +- Run `cargo clippy` for non-trivial changes when the crate is available. +- Use the current stable Rust toolchain unless the project pins a version. +- Keep generated protobuf modules isolated from handwritten API wrappers. + +## Source Documentation + +- Maintain the existing documentation style in the file, crate, and surrounding + component. +- Write comments that include business-specific or domain-specific context when + that context is available from the code, surrounding docs, or naming. +- Use `///` documentation for public APIs when behavior, parity constraints, or + security requirements are not obvious from the type signature. +- Avoid comments that restate syntax or control flow. 
+ +## Crate Structure + +- Keep the reusable client library separate from the CLI binary. +- Use small modules for `client`, `session`, `options`, `auth`, `value`, and + `error`. +- Re-export public types intentionally from `lib.rs`. +- Keep generated modules private unless raw protobuf access is part of the API. + +## Naming + +- Use `snake_case` for functions, variables, modules, and fields. +- Use `UpperCamelCase` for types, traits, and enum variants. +- Use `SCREAMING_SNAKE_CASE` for constants. +- Keep protocol names aligned with the protobuf contract where exactness + matters. + +## Async And Ownership + +- Use `async` APIs with `tokio` for network operations. +- Prefer explicit `close` methods for sessions. Do not rely on `Drop` for async + cleanup. +- Avoid unbounded channels and background tasks without a shutdown path. +- Use borrowed parameters such as `&str` where ownership is not needed. + +## Errors + +- Use `thiserror` for library error enums. +- Preserve `tonic::Status`, transport errors, command replies, HRESULTs, and + MXAccess status details. +- Redact API keys and secured values in `Debug`, `Display`, and tracing output. +- Avoid string-only errors for public API failures. + +## Security + +- Use a secret wrapper for API keys when adding a dependency is acceptable. +- Do not derive `Debug` on types that contain unredacted secrets. +- Prefer explicit redacted display implementations for options and errors. + +## Tests + +- Use `#[tokio::test]` for async tests. +- Use fake `tonic` services or trait-backed clients for unit tests. +- Keep live gateway tests behind `MXGATEWAY_INTEGRATION=1`. diff --git a/docs/toolchain-links.md b/docs/toolchain-links.md new file mode 100644 index 0000000..5c2be4a --- /dev/null +++ b/docs/toolchain-links.md @@ -0,0 +1,172 @@ +# Toolchain Links + +This machine has the project toolchain installed for gateway, worker, dashboard, +protobuf/gRPC contracts, and the planned .NET, Go, Rust, Python, and Java +clients. 
+ +If a new terminal cannot find a recently installed tool, refresh PATH in +PowerShell: + +```powershell +$env:Path = [Environment]::GetEnvironmentVariable('Path','Machine') + ';' + [Environment]::GetEnvironmentVariable('Path','User') +``` + +## Core Tools + +| Tool | Version | Path | +| --- | --- | --- | +| Git | 2.53.0.windows.2 | `C:\Program Files\Git\cmd\git.exe` | +| winget | 1.28.240 | `C:\Users\dohertj2\AppData\Local\Microsoft\WindowsApps\winget.exe` | +| SQLCMD | 14.0.1000.169 NT | `C:\Program Files\Microsoft SQL Server\Client SDK\ODBC\130\Tools\Binn\SQLCMD.EXE` | +| SQLite CLI | 3.53.0 | `C:\Users\dohertj2\AppData\Local\Microsoft\WinGet\Links\sqlite3.exe` | +| grpcurl | 1.9.3 | `C:\Users\dohertj2\AppData\Local\Microsoft\WinGet\Links\grpcurl.exe` | + +## .NET And Windows Build Tools + +| Tool | Version | Path | +| --- | --- | --- | +| .NET SDK | 10.0.201 | `C:\Program Files\dotnet\dotnet.exe` | +| .NET runtime | 10.0.5 | `C:\Program Files\dotnet\shared\Microsoft.NETCore.App` | +| ASP.NET Core runtime | 10.0.5 | `C:\Program Files\dotnet\shared\Microsoft.AspNetCore.App` | +| Windows Desktop runtime | 10.0.5 | `C:\Program Files\dotnet\shared\Microsoft.WindowsDesktop.App` | +| MSBuild | 17.14.40.60911 | `C:\Program Files (x86)\Microsoft Visual Studio\2022\BuildTools\MSBuild\Current\Bin\MSBuild.exe` | +| Visual Studio Build Tools | 2022 BuildTools | `C:\Program Files (x86)\Microsoft Visual Studio\2022\BuildTools` | +| VC tools | 14.44.35207 | `C:\Program Files (x86)\Microsoft Visual Studio\2022\BuildTools\VC\Tools\MSVC\14.44.35207` | +| C compiler x64 | 14.44.35207 | `C:\Program Files (x86)\Microsoft Visual Studio\2022\BuildTools\VC\Tools\MSVC\14.44.35207\bin\Hostx64\x64\cl.exe` | +| Linker x64 | 14.44.35207 | `C:\Program Files (x86)\Microsoft Visual Studio\2022\BuildTools\VC\Tools\MSVC\14.44.35207\bin\Hostx64\x64\link.exe` | +| C compiler x86 | 14.44.35207 | `C:\Program Files (x86)\Microsoft Visual 
Studio\2022\BuildTools\VC\Tools\MSVC\14.44.35207\bin\Hostx64\x86\cl.exe` | +| Linker x86 | 14.44.35207 | `C:\Program Files (x86)\Microsoft Visual Studio\2022\BuildTools\VC\Tools\MSVC\14.44.35207\bin\Hostx64\x86\link.exe` | +| LibMan CLI | 3.0.71 | `C:\Users\dohertj2\.dotnet\tools\libman.exe` | + +Reference assemblies: + +```text +C:\Program Files (x86)\Reference Assemblies\Microsoft\Framework\.NETFramework\v4.8 +C:\Program Files (x86)\Reference Assemblies\Microsoft\Framework\.NETFramework\v4.8.1 +``` + +Environment: + +```text +DOTNET_ROOT=C:\Program Files\dotnet +``` + +Use `dotnet build` for SDK-style projects. Use the full MSBuild path above when +a .NET Framework or COM interop build needs classic Visual Studio MSBuild. + +## Go + +| Tool | Version | Path | +| --- | --- | --- | +| Go | 1.26.2 windows/amd64 | `C:\Program Files\Go\bin\go.exe` | +| protoc-gen-go | latest installed by `go install` | `C:\Users\dohertj2\go\bin\protoc-gen-go.exe` | +| protoc-gen-go-grpc | latest installed by `go install` | `C:\Users\dohertj2\go\bin\protoc-gen-go-grpc.exe` | + +Environment: + +```text +GOROOT=C:\Program Files\Go +GOPATH=C:\Users\dohertj2\go +Go plugin bin=C:\Users\dohertj2\go\bin +``` + +Installed plugin commands: + +```powershell +go install google.golang.org/protobuf/cmd/protoc-gen-go@latest +go install google.golang.org/grpc/cmd/protoc-gen-go-grpc@latest +``` + +## Rust + +| Tool | Version | Path | +| --- | --- | --- | +| rustup | 1.29.0 | `C:\Users\dohertj2\.cargo\bin\rustup.exe` | +| rustc | 1.95.0 | `C:\Users\dohertj2\.cargo\bin\rustc.exe` | +| cargo | 1.95.0 | `C:\Users\dohertj2\.cargo\bin\cargo.exe` | + +Rust uses the MSVC toolchain installed with Visual Studio Build Tools. The +default Rust linker smoke test has passed. 
+ +User cargo bin: + +```text +C:\Users\dohertj2\.cargo\bin +``` + +## Python + +| Tool | Version | Path | +| --- | --- | --- | +| Python | 3.12.10 | `C:\Users\dohertj2\AppData\Local\Programs\Python\Python312\python.exe` | +| Python launcher | installed | `C:\Users\dohertj2\AppData\Local\Programs\Python\Launcher\py.exe` | +| pip | 26.0.1 | Python 3.12 site packages | + +Installed Python packages: + +```text +grpcio==1.80.0 +grpcio-tools==1.80.0 +protobuf==6.33.6 +pytest==9.0.3 +pytest-asyncio==1.3.0 +click==8.3.3 +typer==0.25.0 +``` + +## Java + +| Tool | Version | Path | +| --- | --- | --- | +| Java runtime | Temurin 21.0.10+7 LTS | `C:\Program Files\Eclipse Adoptium\jdk-21.0.10.7-hotspot\bin\java.exe` | +| Java compiler | Temurin 21.0.10+7 LTS | `C:\Program Files\Eclipse Adoptium\jdk-21.0.10.7-hotspot\bin\javac.exe` | +| Gradle | 9.4.1 | `C:\Tools\gradle-9.4.1\bin\gradle.bat` | +| Maven | 3.9.15 | `C:\Tools\apache-maven-3.9.15\bin\mvn.cmd` | + +Environment: + +```text +JAVA_HOME=C:\Program Files\Eclipse Adoptium\jdk-21.0.10.7-hotspot +GRADLE_HOME=C:\Tools\gradle-9.4.1 +MAVEN_HOME=C:\Tools\apache-maven-3.9.15 +``` + +## Protobuf And gRPC + +| Tool | Version | Path | +| --- | --- | --- | +| protoc | 34.1 | `C:\Users\dohertj2\AppData\Local\Microsoft\WinGet\Packages\Google.Protobuf_Microsoft.Winget.Source_8wekyb3d8bbwe\bin\protoc.exe` | +| Buf | 1.68.4 | `C:\Users\dohertj2\AppData\Local\Microsoft\WinGet\Links\buf.exe` | +| grpcurl | 1.9.3 | `C:\Users\dohertj2\AppData\Local\Microsoft\WinGet\Links\grpcurl.exe` | + +Generated code should be reproducible from the shared `.proto` files. Do not +hand-edit generated protobuf or gRPC code. 
+ +## Project-Specific External References + +MXAccess analysis: + +```text +C:\Users\dohertj2\Desktop\mxaccess +C:\Users\dohertj2\Desktop\mxaccess\docs\MXAccess-Public-API.md +C:\Users\dohertj2\Desktop\mxaccess\docs\MXAccess-Reverse-Engineering.md +``` + +Galaxy Repository SQL notes: + +```text +C:\Users\dohertj2\Desktop\lmxopcua\gr +C:\Users\dohertj2\Desktop\lmxopcua\gr\connectioninfo.md +C:\Users\dohertj2\Desktop\lmxopcua\gr\queries +``` + +## Smoke Checks Performed + +These checks passed after installation: + +- `dotnet build` of a temporary `net48` class library. +- `cargo build` of a temporary Rust binary using the MSVC linker. +- `go build` of a temporary Go module. +- `javac` compile of a temporary Java class. +- Python imports for `grpc`, `grpc_tools`, and `pytest`. + diff --git a/gateway.md b/gateway.md index 68a6d2d..54778b3 100644 --- a/gateway.md +++ b/gateway.md @@ -40,6 +40,37 @@ The worker does not host gRPC. The gateway talks to workers through a small local IPC protocol. Named pipes with protobuf-framed messages are the default transport. +Detailed follow-up docs: + +- `docs/gateway-process-design.md` covers the .NET 10 gateway process, + session manager, worker supervision, gRPC API, event streaming, fault model, + security, observability, and test strategy. +- `docs/mxaccess-worker-instance-design.md` covers each .NET Framework 4.8 x86 + MXAccess worker instance, including STA ownership, message pumping, COM + lifetime, command dispatch, event sinks, conversion, and shutdown. +- `docs/design-decisions.md` records current v1 choices, including API-key + authentication in gateway-owned SQLite and the concrete installed MXAccess + COM class details from `C:\Users\dohertj2\Desktop\mxaccess`. +- `docs/gateway-dashboard-design.md` covers the Blazor Server and Bootstrap + dashboard for live gateway/session/worker status. 
+- `docs/client-libraries-design.md` covers shared design requirements for + official gRPC client libraries, test CLIs, and tests for .NET C#, Go, Rust, + Python, and Java. +- `docs/implementation-plan-index.md` links the detailed implementation plans + and recommended Gitea milestones/issues. + +Implementation style guides: + +- `StyleGuide.md` covers project documentation. +- `docs/style-guides/CSharpStyleGuide.md` covers gateway, worker, .NET client, + and C# tests. +- `docs/style-guides/ProtobufStyleGuide.md` covers public gRPC and worker IPC + contracts. +- `docs/style-guides/GoStyleGuide.md` covers the Go client. +- `docs/style-guides/RustStyleGuide.md` covers the Rust client. +- `docs/style-guides/PythonStyleGuide.md` covers the Python client. +- `docs/style-guides/JavaStyleGuide.md` covers the Java client. + ## Process Split ### Gateway Process @@ -866,16 +897,30 @@ backend = mxaccess-worker ## Open Questions -- Exact installed MXAccess COM ProgID/class used by production should be pinned - from the existing trace harness. -- Whether one gRPC client connection maps to one session or whether sessions can - survive client reconnects. -- Whether event streams can have multiple subscribers per session. -- Required authentication model for remote clients. -- Whether worker process identity should be the gateway identity or a restricted - service account. -- Maximum supported event rate before coalescing is required. -- Whether command batching is needed for high-volume tag registration. +Current v1 decisions are recorded in `docs/design-decisions.md`. + +Resolved for v1: + +- MXAccess COM target is `ArchestrA.MxAccess.LMXProxyServerClass` / + `LMXProxy.LMXProxyServer.1` from the installed 32-bit `LmxProxy.dll`. +- One `OpenSession` maps to one worker process; no reconnectable sessions. +- One active event subscriber per session. +- API key authentication with hashed keys in gateway-owned SQLite. 
+- Basic Blazor Server dashboard with Bootstrap CSS/JS and real-time updates. +- Workers run as the gateway service identity. +- Event backpressure is fail-fast with bounded queues. +- No public command batching. +- `OperationComplete` is forwarded only when native MXAccess raises it. +- `OnBufferedDataChange` is modeled now; multi-sample payload conversion remains + capture-validated work. + +Post-v1 revisit items: + +- production event-rate target and optional coalescing, +- reconnectable sessions, +- multi-subscriber event fan-out, +- restricted worker process identity, +- command batching for high-volume setup. ## Recommended Next Step