Code-review 2026-05-20 sweep: re-review at 1cd51bb, resolve 72 findings across all 11 modules

Re-reviewed every module/client against the 10-category checklist
(REVIEW-PROCESS.md) at commit 1cd51bb, filed 72 new findings, and
fixed them in three priority waves (3 High, 17 Medium, 52 Low).

Highs
- Server-017: enumerate AcknowledgeAlarm / QueryActiveAlarms in
  GatewayGrpcScopeResolver so non-admin keys can use them; document
  the mapping in docs/Authorization.md; add interceptor tests.
- Client.Java-013: add the five missing bulk-method stubs to the
  CLI FakeSession so the test module compiles on a clean tree.
- Client.Rust-013: fix the clippy::doc_lazy_continuation regression
  in generated tonic code by reformatting the ReadBulkCommand proto
  comment and scoping a #![allow(...)] to the generated submodules.

Mediums (highlights)
- Server: unify GatewaySession state-lock discipline (-015) and
  make DisposeAsync race-safe against in-flight CloseAsync (-016);
  add constraint-enforcement test coverage for the bulk-plan path
  (-021).
- Worker: introduce StaRuntimeShutdownException so RunAlarmPollLoop
  can distinguish graceful shutdown from a real STA-affinity
  violation (-016); have the watchdog skip StaHung while
  CurrentCommandCorrelationId is non-empty so a legitimate slow
  ReadBulk no longer self-faults (-017).
- Tests: add per-method round-trip + cancellation coverage for the
  11 GatewaySession bulk methods (-013); replace the real TCP probe
  in GalaxyHierarchyCacheTests with an IGalaxyRepository fake
  (-016).
- IntegrationTests: drive the StreamEvents writer in the live Write
  test and assert OnWriteComplete (-012); add live tests for
  Unadvise/RemoveItem/Unregister ordering, WriteSecured, and
  abnormal worker exit (-014).
- Worker.Tests: replace MxAccessSession reflection with an internal
  CreateForTesting factory (-016); cover WorkerCancel and
  unexpected-body envelope branches (-017).
- Client.Java: cancel MxEventStream when close() races
  beforeStart() (-014); return a CancellingCompletableFuture that
  actually forwards cancellation through .thenApply chains (-015).
- Client.Python: drop the silent localhost-plaintext downgrade in
  the CLI; require explicit --plaintext (-013).
- Client.Rust: stop bench-read-bulk from polluting success-latency
  histograms with failed-call durations (-015); add coverage for
  the five MalformedReply paths, the bulk-write helpers, the
  Error::Unavailable mapping, and the unary-fault path (-016).
- Contracts: extend docs/Contracts.md with the bulk read/write
  command family (-009).

Lows (highlights)
- Server: cap GalaxyGlobMatcher.RegexCache; align
  WorkerAlarmRpcDispatcher missing-session handling; drop the
  duplicate dashboard @page routes; refresh IAlarmRpcDispatcher
  XML doc.
- Worker: surface SetXmlAlarmQuery COM failures; remove dead
  subscriptionExpression / ExecutingCommand arms; preserve
  factory-supplied runtime sessions; split MxAlarmSnapshot.cs into
  three files.
- Tests: dispose the WebApplication in seven test classes; rebuild
  FakeWorkerProcess.WaitForExitAsync against a real TaskCompletion
  source; switch the heartbeat-expires test to ManualTimeProvider;
  add InvariantCulture to the remaining DateTimeOffset.Parse sites;
  document GalaxyFilterInputSafetyTests in GatewayTesting.md.
- IntegrationTests: comment fixes, RecordingServerStreamWriter
  IDisposable, class-level [Trait], single-source ZB default
  connection string.
- Worker.Tests: replace silent-return gating with LiveMxAccessFact
  so absent env vars SKIP not pass; PascalCase rename of probe
  [Fact]s; deterministic deadline test; new frame-protocol error
  tests; ComputeTransitions diff-coverage; relocate dev-rig probes
  to Probes/.
- Contracts: add round-trip coverage and per-field redaction /
  Galaxy-identifier comments to the protos.
- Client.Dotnet: introduce clients/dotnet/Directory.Build.props so
  TreatWarningsAsErrors / analysers apply; document
  DiscoverHierarchyOptions and IMxGatewayCliClient; require typed
  bulk-read handles in CLI; surface AcknowledgeAlarm transport
  faults through Translate().
- Client.Go: kill dead code in alarms_test / fakeGalaxyServer /
  runWriteBulkVariant; document the six new subcommands in
  writeUsage; drain galaxy-watch events on limit; switch io.EOF
  comparisons to errors.Is.
- Client.Java: shared shutdown helpers + new shutdownTimeout
  option; regex-based credential redaction; Long.toUnsignedString
  for uint64 sequence; doc fixes.
- Client.Python: combine duplicate imports; add coverage for
  _percentile / bench-read-bulk / MAX_AGGREGATE_EVENTS /
  _api_key_from_env; populate pyproject metadata and ship py.typed.
- Client.Rust: expose next_correlation_id() so CLI ping/close
  stop hard-coding correlation IDs; resync RustClientDesign.md
  with the current Session / Error surface and CLI subcommand set.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
Joseph Doherty
2026-05-20 09:46:47 -04:00
parent 1cd51bbda3
commit a0203503a7
122 changed files with 8723 additions and 757 deletions
+23 -12
View File
@@ -389,25 +389,30 @@ func runReadBulk(ctx context.Context, args []string, stdout, stderr io.Writer) e
}
func runWriteBulk(ctx context.Context, args []string, stdout, stderr io.Writer) error {
return runWriteBulkVariant(ctx, args, stdout, stderr, "write-bulk", false, false)
return runWriteBulkVariant(ctx, args, stdout, stderr, "write-bulk", false)
}
func runWrite2Bulk(ctx context.Context, args []string, stdout, stderr io.Writer) error {
return runWriteBulkVariant(ctx, args, stdout, stderr, "write2-bulk", true, false)
return runWriteBulkVariant(ctx, args, stdout, stderr, "write2-bulk", true)
}
func runWriteSecuredBulk(ctx context.Context, args []string, stdout, stderr io.Writer) error {
return runWriteBulkVariant(ctx, args, stdout, stderr, "write-secured-bulk", false, true)
return runWriteBulkVariant(ctx, args, stdout, stderr, "write-secured-bulk", false)
}
func runWriteSecured2Bulk(ctx context.Context, args []string, stdout, stderr io.Writer) error {
return runWriteBulkVariant(ctx, args, stdout, stderr, "write-secured2-bulk", true, true)
return runWriteBulkVariant(ctx, args, stdout, stderr, "write-secured2-bulk", true)
}
// runWriteBulkVariant shares the flag-parsing + entry-build skeleton across
// the four bulk-write families. withTimestamp adds a --timestamp-value flag;
// secured switches from --user-id to --current-user-id / --verifier-user-id.
func runWriteBulkVariant(ctx context.Context, args []string, stdout, stderr io.Writer, command string, withTimestamp bool, secured bool) error {
// the four bulk-write families. command selects which of the four routes
// runs; withTimestamp adds a --timestamp-value flag for the Write2 / Secured2
// variants. Secured-only flags (--current-user-id / --verifier-user-id) are
// only registered for the secured variants and the non-secured -user-id flag
// is only registered for Write/Write2, so a wrong-variant flag becomes a
// clean "flag provided but not defined" error instead of silently no-op'ing.
func runWriteBulkVariant(ctx context.Context, args []string, stdout, stderr io.Writer, command string, withTimestamp bool) error {
secured := command == "write-secured-bulk" || command == "write-secured2-bulk"
flags := flag.NewFlagSet(command, flag.ContinueOnError)
flags.SetOutput(stderr)
common := bindCommonFlags(flags)
@@ -417,9 +422,13 @@ func runWriteBulkVariant(ctx context.Context, args []string, stdout, stderr io.W
itemHandles := flags.String("item-handles", "", "comma-separated item handles")
valueType := flags.String("type", "string", "value type: bool, int32, int64, float, double, string")
values := flags.String("values", "", "comma-separated values (one per item handle)")
userID := flags.Int("user-id", 0, "MXAccess user id (Write/Write2 variants)")
currentUserID := flags.Int("current-user-id", 0, "MXAccess current user id (Secured variants)")
verifierUserID := flags.Int("verifier-user-id", 0, "MXAccess verifier user id (Secured variants)")
var userID, currentUserID, verifierUserID *int
if secured {
currentUserID = flags.Int("current-user-id", 0, "MXAccess current user id (Secured variants)")
verifierUserID = flags.Int("verifier-user-id", 0, "MXAccess verifier user id (Secured variants)")
} else {
userID = flags.Int("user-id", 0, "MXAccess user id (Write/Write2 variants)")
}
timestampValue := flags.String("timestamp-value", "", "RFC 3339 timestamp shared across all entries (Write2/WriteSecured2 variants)")
if err := flags.Parse(args); err != nil {
@@ -507,7 +516,6 @@ func runWriteBulkVariant(ctx context.Context, args []string, stdout, stderr io.W
default:
return fmt.Errorf("unsupported bulk write command %q", command)
}
_ = secured // currently only used for routing above; reserved for future per-variant validation
return writeWriteBulkOutput(stdout, *jsonOutput, command, options, results, err)
}
@@ -1061,7 +1069,7 @@ type protojsonMessage interface {
}
func writeUsage(writer io.Writer) {
fmt.Fprintln(writer, "usage: mxgw-go <version|open-session|close-session|register|add-item|advise|subscribe-bulk|unsubscribe-bulk|write|stream-events|smoke|galaxy-test-connection|galaxy-last-deploy|galaxy-discover|galaxy-watch>")
fmt.Fprintln(writer, "usage: mxgw-go <version|open-session|close-session|register|add-item|advise|subscribe-bulk|unsubscribe-bulk|read-bulk|write-bulk|write2-bulk|write-secured-bulk|write-secured2-bulk|bench-read-bulk|write|stream-events|smoke|galaxy-test-connection|galaxy-last-deploy|galaxy-discover|galaxy-watch>")
}
func dialGalaxyForCommand(ctx context.Context, common *commonOptions) (*mxgateway.GalaxyClient, commonOptions, error) {
@@ -1245,6 +1253,9 @@ func runGalaxyWatch(ctx context.Context, args []string, stdout, stderr io.Writer
count++
if *limit > 0 && count >= *limit {
cancelStream()
// Allow goroutine to drain.
for range events {
}
return nil
}
case streamErr, ok := <-errs: