Fix runtime review findings
This commit is contained in:
+34
-13
@@ -32,13 +32,14 @@ The service is defined in
|
||||
|-----|---------|
|
||||
| `TestConnection` | Connectivity probe. Returns `{ ok: bool }` after a `SELECT 1`. Does not throw on SQL failure — returns `ok = false`. Always hits SQL directly so it remains a true health check. |
|
||||
| `GetLastDeployTime` | Returns the cached `galaxy.time_of_last_deploy`. Served from the shared hierarchy cache; refreshed in the background. |
|
||||
| `DiscoverHierarchy` | Returns the full deployed hierarchy plus every object's dynamic attributes. **Served from cache** — see [Hierarchy Cache](#hierarchy-cache). |
|
||||
| `DiscoverHierarchy` | Returns one page of the deployed hierarchy plus each returned object's dynamic attributes. **Served from cache** — see [Hierarchy Cache](#hierarchy-cache). |
|
||||
| `WatchDeployEvents` | **Server-streaming.** The server emits the current state immediately on subscribe (so clients can bootstrap without waiting), then emits one event per detected deploy change. See [Deploy Notifications](#deploy-notifications). |
|
||||
|
||||
`DiscoverHierarchy` is intentionally a single unary RPC rather than a stream:
|
||||
the row set is small (thousands of objects, low tens-of-thousands of
|
||||
attributes for typical Galaxies) and clients almost always want the whole tree
|
||||
at once.
|
||||
`DiscoverHierarchy` is a paged unary RPC. The raw request accepts `page_size`
|
||||
and `page_token`; the server defaults omitted page size to 1000 objects and
|
||||
caps every page at 5000 objects. Invalid page tokens and negative page sizes
|
||||
return `InvalidArgument`. Official high-level clients preserve the older
|
||||
"return the full hierarchy" behavior by looping pages internally.
|
||||
|
||||
## Hierarchy Cache
|
||||
|
||||
@@ -56,12 +57,14 @@ Refresh strategy is **deploy-time gated**:
|
||||
3. If the deploy timestamp is unchanged, the heavy hierarchy + attributes
|
||||
queries are **skipped**. The cache simply marks `LastSuccessAt`.
|
||||
4. If the deploy timestamp changed (or no data has loaded yet), the cache
|
||||
pulls hierarchy + attributes, materializes a `DiscoverHierarchyReply`
|
||||
once, replaces the entry atomically, and publishes a deploy event.
|
||||
pulls hierarchy + attributes, materializes a Galaxy object list plus a
|
||||
dashboard summary once, replaces the entry atomically, and publishes a
|
||||
deploy event.
|
||||
|
||||
Materializing the reply at refresh time means subsequent `DiscoverHierarchy`
|
||||
calls return a pre-built proto message — no per-request projection, no
|
||||
per-request allocations beyond the gRPC serializer's frame.
|
||||
Materializing objects and dashboard summaries at refresh time means subsequent
|
||||
`DiscoverHierarchy` calls page over an immutable object list. The dashboard
|
||||
uses the precomputed summary and does not rescan raw SQL rowsets on each
|
||||
snapshot.
|
||||
|
||||
When SQL is unreachable, the cache retains the previous data and flips
|
||||
`Status` to `Stale` (or `Unavailable` if no data was ever loaded). A
|
||||
@@ -139,6 +142,17 @@ message GalaxyAttribute {
|
||||
bool is_historized = 10;
|
||||
bool is_alarm = 11;
|
||||
}
|
||||
|
||||
message DiscoverHierarchyRequest {
|
||||
int32 page_size = 1; // omitted/0 uses the server default of 1000
|
||||
string page_token = 2; // opaque offset token returned by the previous page
|
||||
}
|
||||
|
||||
message DiscoverHierarchyReply {
|
||||
repeated GalaxyObject objects = 1;
|
||||
string next_page_token = 2;
|
||||
int32 total_object_count = 3;
|
||||
}
|
||||
```
|
||||
|
||||
### Contained Name vs Tag Name
|
||||
@@ -176,7 +190,8 @@ GalaxyHierarchyRefreshService (BackgroundService)
|
||||
-> GalaxyRepository.GetLastDeployTimeAsync (cheap, every tick)
|
||||
-> GalaxyRepository.GetHierarchyAsync (only on deploy change)
|
||||
-> GalaxyRepository.GetAttributesAsync (only on deploy change)
|
||||
-> GalaxyProtoMapper.MapObject (materialize DiscoverHierarchyReply once)
|
||||
-> GalaxyProtoMapper.MapObject (materialize GalaxyObject list once)
|
||||
-> DashboardGalaxySummary (precompute dashboard counts once)
|
||||
-> IGalaxyDeployNotifier.Publish (only on deploy change)
|
||||
```
|
||||
|
||||
@@ -189,8 +204,9 @@ Component breakdown:
|
||||
recursive CTEs and pick the most-derived attribute override per object.
|
||||
- `GalaxyHierarchyCache`
|
||||
(`src/MxGateway.Server/Galaxy/GalaxyHierarchyCache.cs`) holds the most
|
||||
recent immutable `GalaxyHierarchyCacheEntry` (rows + materialized proto
|
||||
reply + counts + status). All gRPC clients share the same entry.
|
||||
recent immutable `GalaxyHierarchyCacheEntry` (materialized objects +
|
||||
precomputed dashboard summary + counts + status). All gRPC clients share the
|
||||
same entry.
|
||||
- `GalaxyHierarchyRefreshService`
|
||||
(`src/MxGateway.Server/Galaxy/GalaxyHierarchyRefreshService.cs`) is a
|
||||
hosted `BackgroundService` that drives `RefreshAsync` on the configured
|
||||
@@ -220,6 +236,11 @@ Security`), but production deployments that use SQL authentication should set
|
||||
the override via environment variable rather than committing credentials to
|
||||
`appsettings.json`.
|
||||
|
||||
The dashboard parses this connection string and displays only non-secret
|
||||
fields: server, database, integrated security, encrypt, and trust-server-
|
||||
certificate. It never displays user id, password, access token, or arbitrary
|
||||
unparsed connection string text.
|
||||
|
||||
## Authorization
|
||||
|
||||
All four Galaxy RPCs (including `WatchDeployEvents`) require the
|
||||
|
||||
@@ -35,6 +35,8 @@ paths, timeouts, queue sizes, enum values, or protocol values are invalid.
|
||||
"DefaultCommandTimeoutSeconds": 30,
|
||||
"MaxSessions": 64,
|
||||
"MaxPendingCommandsPerSession": 128,
|
||||
"DefaultLeaseSeconds": 1800,
|
||||
"LeaseSweepIntervalSeconds": 30,
|
||||
"AllowMultipleEventSubscribers": false
|
||||
},
|
||||
"Events": {
|
||||
@@ -52,7 +54,8 @@ paths, timeouts, queue sizes, enum values, or protocol values are invalid.
|
||||
"ShowTagValues": false
|
||||
},
|
||||
"Protocol": {
|
||||
"WorkerProtocolVersion": 1
|
||||
"WorkerProtocolVersion": 1,
|
||||
"MaxGrpcMessageBytes": 16777216
|
||||
},
|
||||
"Galaxy": {
|
||||
"ConnectionString": "Server=localhost;Database=ZB;Integrated Security=True;TrustServerCertificate=True;Encrypt=False;",
|
||||
@@ -107,6 +110,8 @@ to avoid accidental large allocations from malformed or oversized frames.
|
||||
| `MxGateway:Sessions:DefaultCommandTimeoutSeconds` | `30` | Default timeout used while the gateway waits for a worker command reply when an open-session request does not provide a positive command timeout. |
|
||||
| `MxGateway:Sessions:MaxSessions` | `64` | Maximum number of concurrently open gateway sessions. Session opens reserve a slot atomically before worker creation. |
|
||||
| `MxGateway:Sessions:MaxPendingCommandsPerSession` | `128` | Maximum number of pending worker commands for one session. Excess commands fail fast instead of queueing indefinitely. |
|
||||
| `MxGateway:Sessions:DefaultLeaseSeconds` | `1800` | Initial session lease and refresh duration. Unary client activity extends the lease by this duration. |
|
||||
| `MxGateway:Sessions:LeaseSweepIntervalSeconds` | `30` | Hosted monitor interval for closing expired leases. Active event-stream subscribers keep a session from expiring while the stream remains attached. |
|
||||
| `MxGateway:Sessions:AllowMultipleEventSubscribers` | `false` | Controls whether multiple `StreamEvents` subscribers may attach to one session. `true` is rejected until event fan-out is implemented. |
|
||||
|
||||
All numeric session options must be greater than zero. The current event stream
|
||||
@@ -146,6 +151,7 @@ and `RecentSessionLimit` must be greater than or equal to zero.
|
||||
| Option | Default | Description |
|
||||
|--------|---------|-------------|
|
||||
| `MxGateway:Protocol:WorkerProtocolVersion` | `1` | Worker IPC protocol version expected by the gateway and worker. This must match `GatewayContractInfo.WorkerProtocolVersion`. |
|
||||
| `MxGateway:Protocol:MaxGrpcMessageBytes` | `16777216` | Public gRPC max send and receive message size in bytes. The same default is used by official clients. The validator allows values from `1024` through `268435456`. |
|
||||
|
||||
The protocol option is exposed for diagnostics and explicit deployment
|
||||
configuration, not for compatibility negotiation. A mismatch fails validation
|
||||
|
||||
@@ -31,6 +31,11 @@ A second gRPC service, `GalaxyRepositoryGrpcService`, is mapped alongside it. It
|
||||
|
||||
`MxAccessGatewayService` derives from the generated `MxAccessGateway.MxAccessGatewayBase` and implements every RPC declared in `mxaccess_gateway.proto`. The proto contract itself is documented in [Contracts](./Contracts.md); this section covers only what the server-side handler does on top of that contract.
|
||||
|
||||
Public gRPC send and receive message sizes are configured from
|
||||
`MxGateway:Protocol:MaxGrpcMessageBytes` (default 16 MiB). Official clients use
|
||||
the same default so paged Galaxy browse replies and larger MXAccess payloads
|
||||
fail consistently instead of depending on language-specific gRPC defaults.
|
||||
|
||||
### `OpenSession`
|
||||
|
||||
`OpenSession` validates the request, asks `ISessionManager` to open a session under the caller's identity, and returns a reply that advertises both protocol versions and the capabilities the gateway supports. Capability strings are static because the gateway has a fixed feature set per build; clients use them as a forward-compatibility hint rather than runtime negotiation.
|
||||
|
||||
+2
-2
@@ -178,9 +178,9 @@ The order — fault, deregister, dispose, release slot, record metric, log, reth
|
||||
|
||||
While `Ready`, callers reach the worker through `SessionManager.InvokeAsync` or `ReadEventsAsync`. Both delegate to `GatewaySession`, which checks the state under lock and updates `LastClientActivityAt` on every invocation. `GatewaySession` also exposes typed bulk helpers (`AddItemBulkAsync`, `SubscribeBulkAsync`, etc.) that wrap `WorkerCommand` round-trips and translate non-`Ok` `ProtocolStatus` replies into `SessionManagerException` with `SessionNotReady`.
|
||||
|
||||
Event streaming uses `AttachEventSubscriber` which returns a disposable lease. When `allowMultipleSubscribers` is false the second attach throws `EventSubscriberAlreadyActive`; this prevents two gRPC streams from racing on the same worker event channel.
|
||||
Event streaming uses `AttachEventSubscriber` which returns a disposable lease. When `allowMultipleSubscribers` is false the second attach throws `EventSubscriberAlreadyActive`; this prevents two gRPC streams from racing on the same worker event channel. Active event subscribers keep the session lease from expiring until the stream is disposed.
|
||||
|
||||
`ExtendLease` and `IsLeaseExpired` cooperate with `SessionManager.CloseExpiredLeasesAsync`, which iterates a registry snapshot and closes any session whose lease has expired with `LeaseExpiredReason`.
|
||||
Sessions open with `MxGateway:Sessions:DefaultLeaseSeconds` (default 1800) added to the open timestamp. Unary client activity refreshes the lease by the same duration. `ExtendLease` and `IsLeaseExpired` cooperate with `SessionManager.CloseExpiredLeasesAsync`, which iterates a registry snapshot and closes any session whose lease has expired with `LeaseExpiredReason`. `SessionLeaseMonitorHostedService` runs that sweep every `MxGateway:Sessions:LeaseSweepIntervalSeconds` seconds (default 30).
|
||||
|
||||
### Close
|
||||
|
||||
|
||||
Reference in New Issue
Block a user