fix(management-service): resolve ManagementService-005,008,010,011 — supervision strategy, configured command timeout, remove stale ResolveRoles path; ManagementService-012 deferred

This commit is contained in:
Joseph Doherty
2026-05-16 22:24:03 -04:00
parent 858fe24add
commit dab0056d1b
6 changed files with 200 additions and 32 deletions

View File

@@ -8,7 +8,7 @@
| Last reviewed | 2026-05-16 | | Last reviewed | 2026-05-16 |
| Reviewer | claude-agent | | Reviewer | claude-agent |
| Commit reviewed | `9c60592` | | Commit reviewed | `9c60592` |
| Open findings | 5 | | Open findings | 0 (1 Deferred — see ManagementService-012) |
## Summary ## Summary
@@ -211,7 +211,7 @@ mapping tests confirm behaviour is preserved.
|--|--| |--|--|
| Severity | Low | | Severity | Low |
| Category | Akka.NET conventions | | Category | Akka.NET conventions |
| Status | Open | | Status | Resolved |
| Location | `src/ScadaLink.ManagementService/ManagementActor.cs:33` | | Location | `src/ScadaLink.ManagementService/ManagementActor.cs:33` |
**Description** **Description**
@@ -230,7 +230,15 @@ Resume-based strategy, consistent with other central coordinator actors.
**Resolution** **Resolution**
_Unresolved._ Resolved 2026-05-16 (commit pending). Confirmed: `ManagementActor` declared no
`SupervisorStrategy`. Added a `public static SupervisorStrategy CreateSupervisorStrategy()`
factory returning an unbounded `OneForOneStrategy` with a `Directive.Resume` decider, and a
`protected override SupervisorStrategy()` that delegates to it — matching the Resume-based
convention of `CentralCommunicationActor`/`SiteCommunicationActor`. The actor spawns no
children today, so this is a forward-looking correctness fix. Regression tests:
`CreateSupervisorStrategy_ReturnsOneForOneStrategy`,
`CreateSupervisorStrategy_ResumesOnArbitraryException`,
`CreateSupervisorStrategy_ResumesIndefinitely` (new `ManagementActorSupervisionTests.cs`).
### ManagementService-006 — JsonDocument instances never disposed in the HTTP endpoint ### ManagementService-006 — JsonDocument instances never disposed in the HTTP endpoint
@@ -312,7 +320,7 @@ Regression tests: `SerializeResult_WithCyclicGraph_DoesNotThrow`,
|--|--| |--|--|
| Severity | Low | | Severity | Low |
| Category | Correctness & logic bugs | | Category | Correctness & logic bugs |
| Status | Open | | Status | Resolved |
| Location | `src/ScadaLink.ManagementService/ManagementActor.cs:285` | | Location | `src/ScadaLink.ManagementService/ManagementActor.cs:285` |
**Description** **Description**
@@ -329,7 +337,15 @@ Resolve `RoleMapper` via `sp.GetRequiredService<RoleMapper>()` like every other
**Resolution** **Resolution**
_Unresolved._ Resolved 2026-05-16 (commit pending). Confirmed: `HandleResolveRoles` did
`new RoleMapper(sp.GetRequiredService<ISecurityRepository>())`, bypassing the
`AddScoped<RoleMapper>()` DI registration. The hand-built `RoleMapper` lived only inside
`HandleResolveRoles`, which is itself the dead-code dispatch path removed under finding 011
(the two-step ResolveRoles flow is retired). Resolving 011 by deleting the
`ResolveRolesCommand` dispatch case and `HandleResolveRoles` handler also removes the only
manually-constructed `RoleMapper` in the module, so the DI-bypass no longer exists. No
remaining `new RoleMapper` in `src/ScadaLink.ManagementService`. Regression covered by
`ResolveRolesCommand_IsNoLongerDispatched_ReturnsManagementError`.
### ManagementService-009 — Audit logging applied inconsistently across mutating handlers ### ManagementService-009 — Audit logging applied inconsistently across mutating handlers
@@ -385,7 +401,7 @@ covers the explicit-audit path.
|--|--| |--|--|
| Severity | Low | | Severity | Low |
| Category | Design-document adherence | | Category | Design-document adherence |
| Status | Open | | Status | Resolved |
| Location | `src/ScadaLink.ManagementService/ManagementServiceOptions.cs:5`; `src/ScadaLink.ManagementService/ManagementEndpoints.cs:16` | | Location | `src/ScadaLink.ManagementService/ManagementServiceOptions.cs:5`; `src/ScadaLink.ManagementService/ManagementEndpoints.cs:16` |
**Description** **Description**
@@ -405,7 +421,15 @@ configuration cannot be set with no effect.
**Resolution** **Resolution**
_Unresolved._ Resolved 2026-05-16 (commit pending). Confirmed: `ManagementEndpoints` hard-coded
`AskTimeout = TimeSpan.FromSeconds(30)` and never read `ManagementServiceOptions.CommandTimeout`.
`HandleRequest` now resolves `IOptions<ManagementServiceOptions>` from `context.RequestServices`
and computes the Ask timeout via a new `ManagementEndpoints.ResolveAskTimeout` helper, which
returns the configured `CommandTimeout` when strictly positive and otherwise falls back to the
30s default (guarding against a misconfigured zero/negative value that would fail every call).
Regression tests: `ResolveAskTimeout_UsesConfiguredCommandTimeout`,
`ResolveAskTimeout_WithNullOptions_FallsBackToDefault`,
`ResolveAskTimeout_WithNonPositiveTimeout_FallsBackToDefault`.
### ManagementService-011 — ResolveRolesCommand dispatch path is stale dead code ### ManagementService-011 — ResolveRolesCommand dispatch path is stale dead code
@@ -413,7 +437,7 @@ _Unresolved._
|--|--| |--|--|
| Severity | Low | | Severity | Low |
| Category | Correctness & logic bugs | | Category | Correctness & logic bugs |
| Status | Open | | Status | Resolved |
| Location | `src/ScadaLink.ManagementService/ManagementActor.cs:273`, `:283` | | Location | `src/ScadaLink.ManagementService/ManagementActor.cs:273`, `:283` |
**Description** **Description**
@@ -435,7 +459,19 @@ data unauthenticated is intended.
**Resolution** **Resolution**
_Unresolved._ Resolved 2026-05-16 (commit pending). Confirmed dead path: a repository-wide search found no
`ResolveRolesCommand` sender outside `ManagementActor` itself — the CLI and HTTP endpoint
perform LDAP auth + role resolution inline. Removed the `ResolveRolesCommand` dispatch case
and the `HandleResolveRoles` handler from `ManagementActor`; a stray ClusterClient sender now
falls through to the `NotSupportedException` default and gets a uniform `ManagementError`
(closing the unauthenticated role-mapping enumeration surface, since `GetRequiredRole`
returned null for it). A code comment at the former dispatch site documents the intentional
omission. Note: the `ResolveRolesCommand` *record* itself lives in
`src/ScadaLink.Commons/Messages/Management/SecurityCommands.cs` and was left in place — that
file is outside this module's permitted edit scope; deleting the orphan record should be done
as a Commons-module follow-up. With the handler removed it is now an inert,
registry-only type with no behaviour. Regression test:
`ResolveRolesCommand_IsNoLongerDispatched_ReturnsManagementError`.
### ManagementService-012 — ManagementEnvelope carries a loosely-typed object payload ### ManagementService-012 — ManagementEnvelope carries a loosely-typed object payload
@@ -443,7 +479,7 @@ _Unresolved._
|--|--| |--|--|
| Severity | Low | | Severity | Low |
| Category | Akka.NET conventions | | Category | Akka.NET conventions |
| Status | Open | | Status | Deferred |
| Location | `src/ScadaLink.Commons/Messages/Management/ManagementEnvelope.cs:7`; `src/ScadaLink.ManagementService/ManagementActor.cs:132` | | Location | `src/ScadaLink.Commons/Messages/Management/ManagementEnvelope.cs:7`; `src/ScadaLink.ManagementService/ManagementActor.cs:132` |
**Description** **Description**
@@ -463,7 +499,16 @@ flag unhandled cases, and keeps `ManagementCommandRegistry`'s reflection scan pr
**Resolution** **Resolution**
_Unresolved._ Deferred 2026-05-16. Finding verified as genuine: `ManagementEnvelope.Command` is typed
`object` and the recommended `IManagementCommand` marker-interface fix is sound. However, the
fix cannot be implemented within the `ManagementService` module: both `ManagementEnvelope` and
all ~50 `*Command` records live in `src/ScadaLink.Commons/Messages/Management/` (17 files),
which is outside this work item's permitted edit scope (`src/ScadaLink.ManagementService/**`,
its tests, and this findings file only). Adding the marker interface, retyping the envelope,
and having `ManagementCommandRegistry` constrain its reflection scan to `IManagementCommand`
implementers is a cohesive Commons-module change and must be done there — also so the Commons
message-contract additive-only evolution rules are respected. Deferred to a Commons-module
work item; no `ManagementService`-local change is appropriate.
### ManagementService-013 — No tests for site-scope enforcement, the HTTP endpoint, or DebugStreamHub ### ManagementService-013 — No tests for site-scope enforcement, the HTTP endpoint, or DebugStreamHub

View File

@@ -43,6 +43,22 @@ public class ManagementActor : ReceiveActor
Receive<ManagementEnvelope>(HandleEnvelope); Receive<ManagementEnvelope>(HandleEnvelope);
} }
/// <summary>
/// Builds the supervision strategy for <see cref="ManagementActor"/>. Per the project's
/// Akka.NET conventions, coordinator-style actors use a Resume-based strategy so a faulted
/// child preserves its state rather than restarting. <see cref="ManagementActor"/> spawns
/// no children today, but declaring the strategy explicitly matches the convention and
/// makes the contract correct ahead of any future worker actors (finding
/// ManagementService-005).
/// </summary>
public static SupervisorStrategy CreateSupervisorStrategy() =>
new OneForOneStrategy(
maxNrOfRetries: -1,
withinTimeRange: System.Threading.Timeout.InfiniteTimeSpan,
decider: Decider.From(_ => Directive.Resume));
protected override SupervisorStrategy SupervisorStrategy() => CreateSupervisorStrategy();
/// <summary> /// <summary>
/// Serializer settings for command results. <see cref="ReferenceHandler.IgnoreCycles"/> /// Serializer settings for command results. <see cref="ReferenceHandler.IgnoreCycles"/>
/// keeps EF-backed entity graphs with bidirectional navigation properties from throwing; /// keeps EF-backed entity graphs with bidirectional navigation properties from throwing;
@@ -304,29 +320,16 @@ public class ManagementActor : ReceiveActor
DiscardParkedMessageCommand cmd => await HandleDiscardParkedMessage(sp, cmd, user), DiscardParkedMessageCommand cmd => await HandleDiscardParkedMessage(sp, cmd, user),
DebugSnapshotCommand cmd => await HandleDebugSnapshot(sp, cmd, user), DebugSnapshotCommand cmd => await HandleDebugSnapshot(sp, cmd, user),
// Role resolution (for CLI LDAP auth) // NOTE: ResolveRolesCommand is intentionally NOT dispatched. The two-step
ResolveRolesCommand cmd => await HandleResolveRoles(sp, cmd), // "ResolveRoles + command" flow is retired — the HTTP endpoint performs LDAP
// auth and role resolution itself before sending a single envelope. Leaving a
// handler would expose role-mapping data to any raw ClusterClient sender with
// no role requirement; the command now falls through to the default below
// (finding ManagementService-011).
_ => throw new NotSupportedException($"Unknown command type: {command.GetType().Name}") _ => throw new NotSupportedException($"Unknown command type: {command.GetType().Name}")
}; };
} }
// ========================================================================
// Role resolution
// ========================================================================
private static async Task<object?> HandleResolveRoles(IServiceProvider sp, ResolveRolesCommand cmd)
{
var roleMapper = new RoleMapper(sp.GetRequiredService<ISecurityRepository>());
var result = await roleMapper.MapGroupsToRolesAsync(cmd.LdapGroups);
return new
{
Roles = result.Roles,
PermittedSiteIds = result.PermittedSiteIds,
IsSystemWideDeployment = result.IsSystemWideDeployment
};
}
// ======================================================================== // ========================================================================
// Site-scope enforcement // Site-scope enforcement
// ======================================================================== // ========================================================================

View File

@@ -6,6 +6,7 @@ using Microsoft.AspNetCore.Http;
using Microsoft.AspNetCore.Routing; using Microsoft.AspNetCore.Routing;
using Microsoft.Extensions.DependencyInjection; using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.Logging; using Microsoft.Extensions.Logging;
using Microsoft.Extensions.Options;
using ScadaLink.Commons.Messages.Management; using ScadaLink.Commons.Messages.Management;
using ScadaLink.Security; using ScadaLink.Security;
@@ -13,7 +14,20 @@ namespace ScadaLink.ManagementService;
public static class ManagementEndpoints public static class ManagementEndpoints
{ {
private static readonly TimeSpan AskTimeout = TimeSpan.FromSeconds(30); private static readonly TimeSpan DefaultAskTimeout = TimeSpan.FromSeconds(30);
/// <summary>
/// Resolves the ManagementActor Ask timeout from configuration
/// (finding ManagementService-010). Falls back to <see cref="DefaultAskTimeout"/>
/// when options are absent or the configured value is not strictly positive — a
/// zero/negative timeout would make every management call fail immediately.
/// </summary>
public static TimeSpan ResolveAskTimeout(ManagementServiceOptions? options)
{
if (options is { CommandTimeout: { Ticks: > 0 } configured })
return configured;
return DefaultAskTimeout;
}
public static IEndpointRouteBuilder MapManagementAPI(this IEndpointRouteBuilder endpoints) public static IEndpointRouteBuilder MapManagementAPI(this IEndpointRouteBuilder endpoints)
{ {
@@ -101,10 +115,13 @@ public static class ManagementEndpoints
var correlationId = Guid.NewGuid().ToString("N"); var correlationId = Guid.NewGuid().ToString("N");
var envelope = new ManagementEnvelope(authenticatedUser, command, correlationId); var envelope = new ManagementEnvelope(authenticatedUser, command, correlationId);
var askTimeout = ResolveAskTimeout(
context.RequestServices.GetService<IOptions<ManagementServiceOptions>>()?.Value);
object response; object response;
try try
{ {
response = await holder.ActorRef.Ask(envelope, AskTimeout); response = await holder.ActorRef.Ask(envelope, askTimeout);
} }
catch (Exception ex) catch (Exception ex)
{ {

View File

@@ -0,0 +1,39 @@
using Akka.Actor;
using ScadaLink.ManagementService;
namespace ScadaLink.ManagementService.Tests;
/// <summary>
/// Tests for the <see cref="ManagementActor"/> supervision strategy
/// (finding ManagementService-005). The project convention is that long-lived
/// coordinator-style actors declare an explicit Resume-based strategy.
/// </summary>
public class ManagementActorSupervisionTests
{
[Fact]
public void CreateSupervisorStrategy_ReturnsOneForOneStrategy()
{
var strategy = ManagementActor.CreateSupervisorStrategy();
Assert.IsType<OneForOneStrategy>(strategy);
}
[Fact]
public void CreateSupervisorStrategy_ResumesOnArbitraryException()
{
var strategy = (OneForOneStrategy)ManagementActor.CreateSupervisorStrategy();
var directive = strategy.Decider.Decide(new InvalidOperationException("boom"));
Assert.Equal(Directive.Resume, directive);
}
[Fact]
public void CreateSupervisorStrategy_ResumesIndefinitely()
{
var strategy = (OneForOneStrategy)ManagementActor.CreateSupervisorStrategy();
// Coordinator actors should not give up: unbounded retries.
Assert.Equal(-1, strategy.MaxNumberOfRetries);
}
}

View File

@@ -715,6 +715,33 @@ public class ManagementActorTests : TestKit, IDisposable
// continuation rather than escaping or being silently dropped. // continuation rather than escaping or being silently dropped.
// ======================================================================== // ========================================================================
// ========================================================================
// ResolveRolesCommand dead-code removal (finding ManagementService-011 / -008)
//
// The two-step ResolveRoles + command flow is retired: the HTTP endpoint does
// LDAP auth and role resolution itself. The actor must no longer dispatch
// ResolveRolesCommand — a stray ClusterClient sender hitting it gets a uniform
// ManagementError rather than an unauthenticated role-mapping enumeration.
// ========================================================================
[Fact]
public void ResolveRolesCommand_IsNoLongerDispatched_ReturnsManagementError()
{
var secRepo = Substitute.For<ISecurityRepository>();
_services.AddScoped(_ => secRepo);
var actor = CreateActor();
var envelope = Envelope(new ResolveRolesCommand(new[] { "cn=admins" }));
actor.Tell(envelope);
var response = ExpectMsg<ManagementError>(TimeSpan.FromSeconds(5));
Assert.Equal(envelope.CorrelationId, response.CorrelationId);
Assert.Equal("COMMAND_FAILED", response.ErrorCode);
// No role-mapping data is enumerated/leaked back to the caller.
secRepo.DidNotReceiveWithAnyArgs().GetAllMappingsAsync(default);
}
[Fact] [Fact]
public void UnknownCommandType_FaultMappedToManagementError() public void UnknownCommandType_FaultMappedToManagementError()
{ {

View File

@@ -62,4 +62,41 @@ public class ManagementEndpointsTests
Assert.False(result.Success); Assert.False(result.Success);
Assert.Equal("BAD_REQUEST", result.ErrorCode); Assert.Equal("BAD_REQUEST", result.ErrorCode);
} }
// ========================================================================
// Command-timeout configuration (finding ManagementService-010)
//
// ManagementServiceOptions.CommandTimeout must actually drive the Ask
// timeout instead of a hard-coded 30s constant.
// ========================================================================
[Fact]
public void ResolveAskTimeout_UsesConfiguredCommandTimeout()
{
var options = new ManagementServiceOptions { CommandTimeout = TimeSpan.FromSeconds(75) };
var timeout = ManagementEndpoints.ResolveAskTimeout(options);
Assert.Equal(TimeSpan.FromSeconds(75), timeout);
}
[Fact]
public void ResolveAskTimeout_WithNullOptions_FallsBackToDefault()
{
var timeout = ManagementEndpoints.ResolveAskTimeout(null);
Assert.Equal(TimeSpan.FromSeconds(30), timeout);
}
[Fact]
public void ResolveAskTimeout_WithNonPositiveTimeout_FallsBackToDefault()
{
// A misconfigured zero/negative timeout would make every Ask fail
// immediately; fall back to the safe default instead.
var options = new ManagementServiceOptions { CommandTimeout = TimeSpan.Zero };
var timeout = ManagementEndpoints.ResolveAskTimeout(options);
Assert.Equal(TimeSpan.FromSeconds(30), timeout);
}
} }