Files
mxaccessgw/docs/GatewayConfiguration.md
T
Joseph Doherty dc9c0c950c rename: prefix gateway projects/namespaces with ZB.MOM.WW + sln→slnx
Apply the ZB.MOM.WW. prefix to all gateway-side projects, folders,
.csproj/.sln contents, C# namespaces, using directives, generated proto
C# (csharp_namespace + checked-in generated files), InternalsVisibleTo
attributes, project-name string literals (LoadProject, .sln lookups,
worker exe paths, staticwebassets manifest), and the install/script/doc
references that point at any of the above. Migrate the solution from
.sln to .slnx via `dotnet sln migrate` and delete the old file.

External-runtime identifiers are intentionally NOT prefixed so external
configuration keeps working:
- GatewayMetrics.cs MeterName ("MxGateway.Server")
- DashboardAuthenticationDefaults Scheme/Policy ("MxGateway.Dashboard")
- GatewayRequestLoggingMiddleware logger category ("MxGateway.Request")
- StaRuntime thread name ("MxGateway.Worker.STA")
- appsettings.json root section "MxGateway" + env-var prefix
  MxGateway__... and secret-name MxGateway:ApiKeyPepper
- C:\ProgramData\MxGateway\ data dir paths

Also fixes two tests that were not rename-related but became visible
while validating the rename:

- WorkerLiveMxAccessSmokeTests.ShutDownAsync: cancellation that the
  gateway service correctly maps to RpcException(Cancelled) per gRPC
  convention was being misclassified as a stream fault. Added a sibling
  catch on RpcException with StatusCode.Cancelled.

- IntegrationTestEnvironment.ResolveRepositoryRoot: extracted IsRepositoryRoot
  and made it accept either a .git marker OR a .sln/.slnx next to src/
  so the worker-exe walker works in non-git working copies.

clients/proto/proto-inputs.json's protoRoot updated to point at
src/ZB.MOM.WW.MxGateway.Contracts/Protos.

Verified by `dotnet build` and a full `dotnet test` of the .slnx with
MXGATEWAY_RUN_LIVE_{MXACCESS,LDAP,GALAXY}_TESTS=1:
  Tests: 472/472 pass
  Worker.Tests: 280/280 pass (4 dev-rig [Fact(Skip=...)] skipped)
  IntegrationTests: 18/18 pass

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-23 16:22:23 -04:00

200 lines
13 KiB
Markdown

# Gateway Configuration
This document describes every option bound under the `MxGateway` configuration
section by `GatewayOptions`.
The gateway binds configuration at startup and validates it with
`GatewayOptionsValidator`. Startup fails before the server listens when required
paths, timeouts, queue sizes, enum values, or protocol values are invalid.
## Configuration Shape
```json
{
"MxGateway": {
"Authentication": {
"Mode": "ApiKey",
"SqlitePath": "C:\\ProgramData\\MxGateway\\gateway-auth.db",
"PepperSecretName": "MxGateway:ApiKeyPepper",
"RunMigrationsOnStartup": true
},
"Worker": {
"ExecutablePath": "src\\ZB.MOM.WW.MxGateway.Worker\\bin\\x86\\Release\\ZB.MOM.WW.MxGateway.Worker.exe",
"WorkingDirectory": null,
"RequiredArchitecture": "X86",
"StartupTimeoutSeconds": 30,
"StartupProbeRetryAttempts": 3,
"StartupProbeRetryDelayMilliseconds": 250,
"PipeConnectAttemptTimeoutMilliseconds": 2000,
"ShutdownTimeoutSeconds": 10,
"HeartbeatIntervalSeconds": 5,
"HeartbeatGraceSeconds": 15,
"MaxMessageBytes": 16777216
},
"Sessions": {
"DefaultCommandTimeoutSeconds": 30,
"MaxSessions": 64,
"MaxPendingCommandsPerSession": 128,
"DefaultLeaseSeconds": 1800,
"LeaseSweepIntervalSeconds": 30,
"AllowMultipleEventSubscribers": false
},
"Events": {
"QueueCapacity": 10000,
"BackpressurePolicy": "FailFast"
},
"Dashboard": {
"Enabled": true,
"PathBase": "/dashboard",
"RequireAdminScope": true,
"AllowAnonymousLocalhost": true,
"SnapshotIntervalMilliseconds": 1000,
"RecentFaultLimit": 100,
"RecentSessionLimit": 200,
"ShowTagValues": false
},
"Protocol": {
"WorkerProtocolVersion": 1,
"MaxGrpcMessageBytes": 16777216
},
"Galaxy": {
"ConnectionString": "Server=localhost;Database=ZB;Integrated Security=True;TrustServerCertificate=True;Encrypt=False;",
"CommandTimeoutSeconds": 60,
"DashboardRefreshIntervalSeconds": 30,
"PersistSnapshot": true,
"SnapshotCachePath": "C:\\ProgramData\\MxGateway\\galaxy-snapshot.json"
},
"Alarms": {
"Enabled": false,
"SubscriptionExpression": "",
"DefaultArea": "",
"ReconcileIntervalSeconds": 30
}
}
}
```
Environment variables use the normal .NET double-underscore form. For example,
`MxGateway__Sessions__MaxSessions=20` overrides
`MxGateway:Sessions:MaxSessions`.
## Authentication Options
| Option | Default | Description |
|--------|---------|-------------|
| `MxGateway:Authentication:Mode` | `ApiKey` | Selects public gRPC authentication. Supported values are `ApiKey` and `Disabled`. `Disabled` bypasses API-key verification and is for local development only. |
| `MxGateway:Authentication:SqlitePath` | `C:\ProgramData\MxGateway\gateway-auth.db` | SQLite database path for API-key records and audit rows when API-key authentication is enabled. |
| `MxGateway:Authentication:PepperSecretName` | `MxGateway:ApiKeyPepper` | Configuration key used to read the HMAC pepper for API-key secret hashing. The dashboard effective configuration redacts this value. |
| `MxGateway:Authentication:RunMigrationsOnStartup` | `true` | Runs SQLite auth schema migrations at gateway startup when API-key authentication is enabled. |
When `Mode` is `ApiKey`, `SqlitePath` and `PepperSecretName` must be present.
`SqlitePath` must be a valid filesystem path.
## Worker Options
| Option | Default | Description |
|--------|---------|-------------|
| `MxGateway:Worker:ExecutablePath` | `src\ZB.MOM.WW.MxGateway.Worker\bin\x86\Release\ZB.MOM.WW.MxGateway.Worker.exe` | Path to the x86 worker executable launched for each gateway session. The path must be valid and point to a `.exe` file. |
| `MxGateway:Worker:WorkingDirectory` | `null` | Optional working directory for the worker process. When set, it must be a valid filesystem path. |
| `MxGateway:Worker:RequiredArchitecture` | `X86` | Required Portable Executable architecture for the worker. Supported values are `X86` and `X64`; MXAccess parity uses `X86`. |
| `MxGateway:Worker:StartupTimeoutSeconds` | `30` | Total startup budget for process launch, startup probe, pipe connect, handshake, and worker readiness. |
| `MxGateway:Worker:StartupProbeRetryAttempts` | `3` | Number of retry attempts for transient worker startup probe failures before pipe connection and handshake continue. |
| `MxGateway:Worker:StartupProbeRetryDelayMilliseconds` | `250` | Delay between transient startup probe retry attempts. |
| `MxGateway:Worker:PipeConnectAttemptTimeoutMilliseconds` | `2000` | Per-attempt timeout used by the worker named-pipe connect retry path. The overall pipe connection still stays under the startup budget. |
| `MxGateway:Worker:ShutdownTimeoutSeconds` | `10` | Grace period for worker shutdown before the gateway treats shutdown as failed and may kill the worker process tree. |
| `MxGateway:Worker:HeartbeatIntervalSeconds` | `5` | Worker heartbeat send interval and gateway heartbeat check cadence input. |
| `MxGateway:Worker:HeartbeatGraceSeconds` | `15` | Maximum age of the last worker heartbeat before the gateway faults the worker. This must be greater than or equal to `HeartbeatIntervalSeconds`. |
| `MxGateway:Worker:MaxMessageBytes` | `16777216` | Maximum worker IPC frame payload size in bytes. The validator allows values from `1024` through `268435456`. |
`StartupProbeRetryAttempts`, `StartupProbeRetryDelayMilliseconds`,
`PipeConnectAttemptTimeoutMilliseconds`, timeout values, heartbeat values, and
`MaxMessageBytes` must be positive. `MaxMessageBytes` is intentionally bounded
to avoid accidental large allocations from malformed or oversized frames.
## Session Options
| Option | Default | Description |
|--------|---------|-------------|
| `MxGateway:Sessions:DefaultCommandTimeoutSeconds` | `30` | Default timeout used while the gateway waits for a worker command reply when an open-session request does not provide a positive command timeout. |
| `MxGateway:Sessions:MaxSessions` | `64` | Maximum number of concurrently open gateway sessions. Session opens reserve a slot atomically before worker creation. |
| `MxGateway:Sessions:MaxPendingCommandsPerSession` | `128` | Maximum number of pending worker commands for one session. Excess commands fail fast instead of queueing indefinitely. |
| `MxGateway:Sessions:DefaultLeaseSeconds` | `1800` | Initial session lease and refresh duration. Unary client activity extends the lease by this duration. |
| `MxGateway:Sessions:LeaseSweepIntervalSeconds` | `30` | Hosted monitor interval for closing expired leases. Active event-stream subscribers keep a session from expiring while the stream remains attached. |
| `MxGateway:Sessions:AllowMultipleEventSubscribers` | `false` | Controls whether multiple `StreamEvents` subscribers may attach to one session. `true` is rejected until event fan-out is implemented. |
All numeric session options must be greater than zero. The current event stream
implementation supports one active subscriber per session; this preserves event
ordering and avoids competing consumers.
## Event Options
| Option | Default | Description |
|--------|---------|-------------|
| `MxGateway:Events:QueueCapacity` | `10000` | Capacity for bounded per-session event queues used by the gateway worker event channel and the public gRPC event stream queue. |
| `MxGateway:Events:BackpressurePolicy` | `FailFast` | Event backpressure behavior. `FailFast` faults the session on public stream queue overflow. `DisconnectSubscriber` disconnects only the slow stream. |
`QueueCapacity` must be greater than zero. With `FailFast`, queue overflow
faults the affected worker or session instead of silently dropping MXAccess
events. With `DisconnectSubscriber`, public gRPC stream overflow terminates only
the affected stream while the MXAccess session remains active.
## Dashboard Options
| Option | Default | Description |
|--------|---------|-------------|
| `MxGateway:Dashboard:Enabled` | `true` | Enables Blazor Server dashboard route mapping. |
| `MxGateway:Dashboard:PathBase` | `/dashboard` | Base path for dashboard routes. When the dashboard is enabled, this value is required and must start with `/`. |
| `MxGateway:Dashboard:RequireAdminScope` | `true` | Requires API keys used for dashboard login to carry the `admin` scope. |
| `MxGateway:Dashboard:AllowAnonymousLocalhost` | `true` | Allows loopback dashboard requests to bypass the dashboard cookie requirement for local development. Remote requests still require dashboard authentication. |
| `MxGateway:Dashboard:SnapshotIntervalMilliseconds` | `1000` | Dashboard snapshot refresh interval used by realtime Blazor pages. |
| `MxGateway:Dashboard:RecentFaultLimit` | `100` | Maximum number of fault summaries projected into each dashboard snapshot. |
| `MxGateway:Dashboard:RecentSessionLimit` | `200` | Maximum number of session summaries projected into each dashboard snapshot. |
| `MxGateway:Dashboard:ShowTagValues` | `false` | Reserved display control for tag values. The dashboard does not show full tag values by default. |
`SnapshotIntervalMilliseconds` must be greater than zero. `RecentFaultLimit`
and `RecentSessionLimit` must be greater than or equal to zero.
## Protocol Options
| Option | Default | Description |
|--------|---------|-------------|
| `MxGateway:Protocol:WorkerProtocolVersion` | `1` | Worker IPC protocol version expected by the gateway and worker. This must match `GatewayContractInfo.WorkerProtocolVersion`. |
| `MxGateway:Protocol:MaxGrpcMessageBytes` | `16777216` | Public gRPC max send and receive message size in bytes. The same default is used by official clients. The validator allows values from `1024` through `268435456`. |
The protocol option is exposed for diagnostics and explicit deployment
configuration, not for compatibility negotiation. A mismatch fails validation
at startup.
## Galaxy Options
| Option | Default | Description |
|--------|---------|-------------|
| `MxGateway:Galaxy:ConnectionString` | `Server=localhost;Database=ZB;Integrated Security=True;TrustServerCertificate=True;Encrypt=False;` | SQL Server connection string for the Galaxy Repository (`ZB`) used by the `GalaxyRepository` browse RPCs. Override in production via `MxGateway__Galaxy__ConnectionString`. |
| `MxGateway:Galaxy:CommandTimeoutSeconds` | `60` | Per-command SQL timeout for all Galaxy browse RPCs. |
| `MxGateway:Galaxy:DashboardRefreshIntervalSeconds` | `30` | Interval between background refreshes of the dashboard Galaxy summary cache. SQL is hit at most once per interval regardless of dashboard render rate. |
| `MxGateway:Galaxy:PersistSnapshot` | `true` | Persists the latest successful Galaxy browse dataset to disk. When `true`, the cache reloads that snapshot at startup so clients can still browse last-known data while the Galaxy database is unreachable. The restored data is served with `Stale` status until a live query confirms it. |
| `MxGateway:Galaxy:SnapshotCachePath` | `C:\ProgramData\MxGateway\galaxy-snapshot.json` | File path for the persisted Galaxy browse snapshot. Ignored when `PersistSnapshot` is `false`. The snapshot is written atomically (temp file plus rename). |
See [Galaxy Repository Browse](./GalaxyRepository.md) for the RPC surface and
behavior.
## Alarm Options
| Option | Default | Description |
|--------|---------|-------------|
| `MxGateway:Alarms:Enabled` | `false` | Gates the gateway's always-on central alarm monitor. When `true`, the gateway opens one gateway-owned worker session dedicated to alarms, caches the active-alarm set, and fans it out to every client through the `StreamAlarms` RPC — no client opens its own session to see alarms. |
| `MxGateway:Alarms:SubscriptionExpression` | _(empty)_ | AVEVA alarm-subscription expression the monitor subscribes on startup, in canonical `\\<machine>\Galaxy!<area>` form. The literal `Galaxy` provider is correct regardless of the Galaxy database name. When empty and `Enabled` is `true`, the gateway falls back to `\\<MachineName>\Galaxy!<DefaultArea>` if `DefaultArea` is set. |
| `MxGateway:Alarms:DefaultArea` | _(empty)_ | Area name used to compose a default subscription when `SubscriptionExpression` is empty. If both are empty while `Enabled` is `true`, the monitor faults with a configuration diagnostic. |
| `MxGateway:Alarms:ReconcileIntervalSeconds` | `30` | How often the monitor reconciles its in-process alarm cache against the worker's authoritative active-alarm snapshot, catching transitions the live poll-and-diff feed missed. Floored at 5 seconds. |
The alarm monitor is independent of client sessions: `AcknowledgeAlarm` and
`StreamAlarms` are session-less RPCs served by the monitor.
## Related Documentation
- [Gateway Process Detailed Design](./GatewayProcessDesign.md)
- [Gateway Dashboard Detailed Design](./GatewayDashboardDesign.md)
- [Worker Process Launcher](./WorkerProcessLauncher.md)
- [Worker Frame Protocol](./WorkerFrameProtocol.md)
- [Galaxy Repository Browse](./GalaxyRepository.md)