Files
mxaccessgw/clients/dotnet/ZB.MOM.WW.MxGateway.Client/MxGatewayClientRetryPolicy.cs
T
Joseph Doherty 397d3c5c4f rename: apply ZB.MOM.WW prefix to all client SDKs + fix pre-existing alarm-RPC breaks
Rename across every client surface using each language's idiomatic convention:

  * .NET   clients/dotnet/MxGateway.Client[.Cli|.Tests]/
             -> clients/dotnet/ZB.MOM.WW.MxGateway.Client[.Cli|.Tests]/
             namespaces -> ZB.MOM.WW.MxGateway.Client[.Cli|.Tests]
             contracts ProjectReference repointed to ZB.MOM.WW.MxGateway.Contracts
             sln migrated to slnx (dotnet sln migrate)
  * Python src/mxgateway -> src/zb_mom_ww_mxgateway
             src/mxgateway_cli -> src/zb_mom_ww_mxgateway_cli
             distribution: mxaccess-gateway-client -> zb-mom-ww-mxaccess-gateway-client
  * Rust   crate: mxgateway-client -> zb-mom-ww-mxgateway-client
             build.rs proto path repointed
  * Java   subprojects: mxgateway-{client,cli} -> zb-mom-ww-mxgateway-{client,cli}
             packages com.dohertylan.mxgateway -> com.zb.mom.ww.mxgateway
             group   com.dohertylan.mxgateway -> com.zb.mom.ww.mxgateway
             rootProject mxaccessgw-java -> zb-mom-ww-mxaccessgw-java
  * Go     generate-proto.ps1 proto path repointed; module path and
             package mxgateway kept (Go convention).
  * proto-inputs.json: generatedOutputs.python updated to new package path.
  * scripts/run-client-e2e-tests.ps1: Java CLI install path + gradle task
             updated to zb-mom-ww-mxgateway-cli.

CLI binary names (mxgw, mxgw-py, mxgw-go, mxgateway-cli) and wire-level
identifiers (MXGATEWAY_* env vars, the mxgw_<id>_<secret> API key
prefix, protobuf package names like mxaccess_gateway.v1, all MXAccess
references) intentionally NOT renamed.

Fix pre-existing alarms-over-gateway breaks unblocked by the rename:

  * mxaccess_gateway.proto: add missing public message QueryActiveAlarmsRequest
    {session_id, client_correlation_id, alarm_filter_prefix} and missing
    rpc QueryActiveAlarms(QueryActiveAlarmsRequest) returns
    (stream ActiveAlarmSnapshot). All four typed clients referenced
    these but they were absent from the proto.
  * MxAccessGatewayService.QueryActiveAlarms: implement the new RPC on
    the server, streaming from IGatewayAlarmService.CurrentAlarms with
    optional alarm_filter_prefix filter.
  * clients/dotnet/.../DiscoverHierarchyOptions.cs: add the hand-written
    .NET POCO that wraps DiscoverHierarchyRequest (referenced by
    GalaxyRepositoryClient.DiscoverHierarchyAsync but never authored).
  * Drop retired session_id field references from
    AcknowledgeAlarmRequest/AcknowledgeAlarmReply test fixtures across
    .NET, Rust, Go, and Python clients.
  * Rust integration test: add the missing stream_alarms impl on the
    fake MxAccessGateway server (the trait gained the method, fake
    didn't).
  * Rust CLI test: bump expected gatewayProtocolVersion 2 -> 3.

Regenerated artifacts updated in this commit:
  * src/ZB.MOM.WW.MxGateway.Contracts/Generated/{MxaccessGateway,MxaccessGatewayGrpc}.cs
  * clients/python/src/zb_mom_ww_mxgateway/generated/*_pb2{,_grpc}.py
  * clients/go/internal/generated/*.pb.go
(C# regenerated by Grpc.Tools on contracts build; Python and Go via
their generate-proto.ps1 scripts; Rust regenerates from .proto via
tonic-build at compile time so no checked-in artefact.)

Verification: 472 server tests, 275 worker tests (9 dev-rig skipped),
18 integration tests (live MxAccess + LDAP + Galaxy), 57 .NET client
tests, 32 Rust workspace tests, 39 Python tests, all Go packages, and
gradle build for Java all pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-23 19:09:34 -04:00

69 lines
2.6 KiB
C#

using Grpc.Core;
using Microsoft.Extensions.Logging;
using ZB.MOM.WW.MxGateway.Contracts.Proto;
using Polly;
using Polly.Retry;
namespace ZB.MOM.WW.MxGateway.Client;
/// <summary>Factory and helpers for exponential-backoff retry policies on transient gRPC failures.</summary>
internal static class MxGatewayClientRetryPolicy
{
/// <summary>Creates a Polly ResiliencePipeline that retries transient gRPC failures with exponential backoff.</summary>
/// <param name="options">Retry configuration (max attempts, delay bounds, jitter).</param>
/// <param name="logger">Optional logger for retry diagnostics.</param>
public static ResiliencePipeline Create(
MxGatewayClientRetryOptions options,
ILogger? logger)
{
ArgumentNullException.ThrowIfNull(options);
options.Validate();
return new ResiliencePipelineBuilder()
.AddRetry(new RetryStrategyOptions
{
MaxRetryAttempts = Math.Max(0, options.MaxAttempts - 1),
BackoffType = DelayBackoffType.Exponential,
UseJitter = options.UseJitter,
Delay = options.Delay,
MaxDelay = options.MaxDelay,
ShouldHandle = new PredicateBuilder().Handle<Exception>(IsTransientGrpcFailure),
OnRetry = args =>
{
logger?.LogDebug(
args.Outcome.Exception,
"Retrying MXAccess Gateway client call after transient gRPC failure. Attempt {Attempt}.",
args.AttemptNumber + 1);
return default;
},
})
.Build();
}
/// <summary>Returns whether a command kind is eligible for automatic retry on transient failures.</summary>
/// <param name="kind">The command kind to check.</param>
public static bool IsRetryableCommand(MxCommandKind kind)
{
return kind is MxCommandKind.Ping
or MxCommandKind.GetSessionState
or MxCommandKind.GetWorkerInfo;
}
private static bool IsTransientGrpcFailure(Exception exception)
{
return exception switch
{
RpcException rpcException => IsTransientStatus(rpcException.StatusCode),
MxGatewayException { InnerException: RpcException rpcException } => IsTransientStatus(rpcException.StatusCode),
_ => false,
};
}
private static bool IsTransientStatus(StatusCode statusCode)
{
return statusCode is StatusCode.Unavailable
or StatusCode.DeadlineExceeded
or StatusCode.ResourceExhausted;
}
}