Compare commits

9 Commits

Author SHA1 Message Date
Joseph Doherty
f24f969a85 Phase 2 PR 12 — richer historian quality mapping. Replace MxAccessGalaxyBackend's inline, category-only MapHistorianQualityToOpcUa helper (192+→Good, 64-191→Uncertain, 0-63→Bad) with a new public HistorianQualityMapper.Map utility that preserves specific OPC DA subcodes: BadNotConnected(8)→0x808A0000u instead of generic Bad(0x80000000u), UncertainSubNormal(88)→0x40950000u instead of generic Uncertain, Good_LocalOverride(216)→0x00D80000u instead of generic Good, etc.
Mirrors v1 QualityMapper.MapToOpcUaStatusCode byte-for-byte without pulling in OPC UA types: the function returns uint32 literals that are the canonical OPC UA StatusCode wire encoding, surfaced directly as DataValueSnapshot.StatusCode on the Proxy side with no additional translation. Unknown subcodes fall back to the family category (255→Good, 150→Uncertain, 50→Bad), so a future SDK change that adds a quality code we don't map yet still gets a sensible bucket. GalaxyDataValue wire shape is unchanged (StatusCode stays uint); this is a pure fidelity upgrade on the Host side.
Downstream callers (Admin UI status dashboard, OPC UA clients receiving historian samples) can now distinguish, for example, a transport outage (BadNotConnected) from a sensor fault (BadSensorFailure) from a warm-up delay (BadWaitingForInitialData) without a second round-trip or dashboard heuristic.
Tests: 21 new (HistorianQualityMapperTests): a theory with 15 rows covering every specific mapping from the v1 QualityMapper table, plus 6 fallback tests verifying that unknown subcodes in each family (Good/Uncertain/Bad) collapse to the family default. Galaxy.Host.Tests Unit suite 56/0 (21 new + 35 existing). Galaxy.Host builds clean (0/0). Branches off v2.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 07:11:02 -04:00
d2ebb91cb1 Merge pull request 'Phase 2 PR 9 — thread IsAlarm discovery flag end-to-end' (#8) from phase-2-pr9-alarms into v2 2026-04-18 06:59:25 -04:00
90ce0af375 Merge pull request 'Phase 2 PR 8 — gateway-level host-status push from MxAccessGalaxyBackend' (#7) from phase-2-pr8-alarms-hoststatus into v2 2026-04-18 06:59:04 -04:00
e250356e2a Merge pull request 'Phase 2 PR 7 — wire IHistoryProvider.ReadProcessedAsync end-to-end' (#6) from phase-2-pr7-history-processed into v2 2026-04-18 06:59:02 -04:00
067ad78e06 Merge pull request 'Phase 2 PR 6 — close PR 4 monitor-loop low findings (probe leak + replay signal)' (#5) from phase-2-pr6-monitor-findings into v2 2026-04-18 06:57:57 -04:00
6cfa8d326d Merge pull request 'Phase 2 PR 4 — close 4 open MXAccess findings (push frames + reconnect + write-await + read-cancel)' (#3) from phase-2-pr4-findings into v2 2026-04-18 06:57:21 -04:00
Joseph Doherty
70a5d06b37 Phase 2 PR 9 — thread IsAlarm discovery flag end-to-end. GalaxyRepository.GetAttributesAsync has always emitted is_alarm alongside is_historized (CASE WHEN EXISTS with the primitive_definition join on primitive_name='AlarmExtension' per v1's Extended Attributes SQL lifted byte-for-byte into the PR 5 repository port), and GalaxyAttributeRow.IsAlarm has been populated since the port, but the flag was silently dropped at the MapAttribute helper in both MxAccessGalaxyBackend and DbBackedGalaxyBackend because GalaxyAttributeInfo on the IPC side had no field to carry it — every deployed alarm attribute arrived at the Proxy with no signal that it was alarm-bearing. This PR wires the flag through the three translation boundaries: GalaxyAttributeInfo gains [Key(6)] public bool IsAlarm { get; set; } at the end of the message to preserve wire-compat with pre-PR9 payloads that omit the key (MessagePack treats missing keys as default, so a newer Proxy talking to an older Host simply gets IsAlarm=false for every attribute); both backend MapAttribute helpers copy row.IsAlarm into the IPC shape; DriverAttributeInfo in Core.Abstractions gains a new IsAlarm parameter with default value false so the positional record signature change doesn't force every non-Galaxy driver call site to flow a flag they don't produce (the existing generic node-manager and future Modbus/etc. drivers keep compiling without modification); GalaxyProxyDriver.DiscoverAsync passes attr.IsAlarm through to the DriverAttributeInfo positional constructor. 
This is the discovery-side foundation — the generic node-manager can now enrich alarm-bearing variables with OPC UA AlarmConditionState during address-space build (the existing v1 LmxNodeManager pattern that subscribes to <tag>.InAlarm + .Priority + .DescAttrName + .Acked and merges them into a ConditionState) but this PR deliberately stops at discovery: the full alarm subsystem (subscription management for the 4 alarm-status attributes, state-machine tracking for Active/Unacknowledged/Confirmed/Inactive transitions, OPC UA Part 9 alarm event emission, and the write-to-AckMsg ack path) is a follow-up PR 10+ because it touches the node-manager's address-space build path — orthogonal to the IPC flow this PR covers. Tests — AlarmDiscoveryTests (new, 3 cases): GalaxyAttributeInfo_IsAlarm_round_trips_true_through_MessagePack serializes an IsAlarm=true instance and asserts the decoded flag is true + IsHistorized is true + AttributeName survives unchanged; GalaxyAttributeInfo_IsAlarm_round_trips_false_through_MessagePack covers the default path; Pre_PR9_payload_without_IsAlarm_key_deserializes_with_default_false is the wire-compat regression guard — serializes a stand-in PrePR9Shape class with only keys 0..5 (identical layout to the pre-PR9 GalaxyAttributeInfo) and asserts the newer GalaxyAttributeInfo deserializer produces IsAlarm=false without throwing, so a rolling upgrade where the Proxy ships first can talk to an old Host during the window before the Host upgrades without a MessagePack "missing key" exception. Full solution build: 0 errors, 38 warnings (existing). Galaxy.Host.Tests Unit suite: 27 pass / 0 fail (3 new alarm-discovery + 9 PR5 historian + 15 pre-existing). This PR branches off phase-2-pr5-historian because GalaxyProxyDriver's constructor signature + GalaxyHierarchyRow's IsAlarm init-only property are both ancestor state that the simpler branch bases (phase-2-pr4-findings, master) don't yet include.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 06:28:01 -04:00
Joseph Doherty
30ece6e22c Phase 2 PR 8 — wire gateway-level host-status push from MxAccessGalaxyBackend. PR 4 built the IPC infrastructure for OnHostStatusChanged (MessageKind.RuntimeStatusChange frame + ConnectionSink forwarding through FrameWriter) but no backend actually raised the event; the #pragma warning disable CS0067 around MxAccessGalaxyBackend.OnHostStatusChanged declared the event for interface symmetry while acknowledging the wire-up was Phase 2 follow-up. This PR closes the gateway-level signal: MxAccessClient.ConnectionStateChanged (already raised on false→true Register and true→false Unregister transitions, including the reconnect path in MonitorLoopAsync) now drives OnHostStatusChanged with a synthetic HostConnectivityStatus tagged HostName=MxAccessClient.ClientName, RuntimeStatus="Running" on reconnect + "Stopped" on disconnect, LastObservedUtcUnixMs set to the transition moment. The Admin UI's existing IHostConnectivityProbe subscriber on GalaxyProxyDriver (HostStatusChangedEventArgs) already handles the full translation — OnHostConnectivityUpdate parses "Running"/"Stopped"/"Faulted" into the Core.Abstractions HostState enum and fires OnHostStatusChanged downstream, so this single backend-side event wire-up produces an end-to-end signal with no further Proxy changes required. Per-platform and per-AppEngine ScanState probing (the 472 LOC GalaxyRuntimeProbeManager state machine in v1 that advises <Host>.ScanState on every deployed $WinPlatform + $AppEngine gobject, tracks Unknown → Running → Stopped transitions, handles the on-change-only delivery quirk of ScanState, and surfaces IsHostStopped(gobjectId) for the node manager's Read path to short-circuit on-demand reads against known-stopped runtimes) remains deferred to a follow-up PR — the gateway-level signal gives operators the top-level transport-health rung of the status ladder, which is what matters when the Galaxy COM proxy itself goes down (vs a specific platform going down). 
MxAccessClient.ClientName property exposes the previously-private _clientName field so the backend can tag its pushes with a stable gateway identity — operators configure this via OTOPCUA_GALAXY_CLIENT_NAME env var (default "OtOpcUa-Galaxy.Host" per Program.cs). MxAccessGalaxyBackend constructor subscribes the new _onConnectionStateChanged field before returning + Dispose unsubscribes it via _mx.ConnectionStateChanged -= _onConnectionStateChanged to prevent the backend's own dispose from leaving a dangling handler on the MxAccessClient (same shape as MxAccessClient.SubscriptionReplayFailed PR 6 dispose discipline). #pragma warning disable CS0067 removed from around OnHostStatusChanged since the event is now raised; the directive is narrowed to cover only OnAlarmEvent which stays unraised pending the alarm subsystem port (PR 9 candidate). Tests — HostStatusPushTests (new, 2 cases): ConnectionStateChanged_raises_OnHostStatusChanged_with_gateway_name fires mx.ConnectAsync → mx.DisconnectAsync and asserts two notifications in order with HostName="GatewayClient" (the clientName passed to MxAccessClient ctor), RuntimeStatus="Running" then "Stopped", LastObservedUtcUnixMs > 0; Dispose_unsubscribes_so_post_dispose_state_changes_do_not_fire_events asserts that after backend.Dispose() a subsequent mx.DisconnectAsync does not bump the count on a registered OnHostStatusChanged handler — guards against the subscription-leak regression where a lingering backend instance would accumulate cross-reconnect notifications for a dead writer. Host.Tests csproj gains a Reference to lib/ArchestrA.MxAccess.dll (identical to the reference PR 6 adds — conflict-free cherry-pick/merge since both PRs stage the same <Reference> node; git will collapse to one when either lands first). Full Galaxy.Host.Tests Unit suite: 26 pass / 0 fail (2 new host-status + 9 PR5 historian + 15 pre-existing PostMortemMmf/RecyclePolicy/StaPump/MemoryWatchdog/EndToEndIpc/Handshake). 
Galaxy.Host builds clean (0 errors, 0 warnings). Branch base — PR 8 is on phase-2-pr5-historian rather than phase-2-pr4-findings because the constructor path on MxAccessGalaxyBackend gained a new historian parameter in PR 5 and the Dispose implementation needs to coordinate the two unsubscribes; targeting the earlier base would leave a trivial conflict on Dispose.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 06:03:16 -04:00
Joseph Doherty
1c2bf74d38 Phase 2 PR 6 — close the 2 low findings carried forward from PR 4. Low finding #1 ($Heartbeat probe handle leak in MonitorLoopAsync): the probe calls _proxy.AddItem(connectionHandle, "$Heartbeat") on every monitor tick that observes the connection is past StaleThreshold, but previously discarded the returned item handle — so every probe (one per MonitorInterval, default 5s) leaked one item handle into the MXAccess proxy's internal handle table. Fix: capture the item handle, call RemoveItem(connectionHandle, probeHandle) in the InvokeAsync's finally block so it runs on the same pump turn as the AddItem, best-effort RemoveItem swallow so a dying proxy doesn't throw secondary exceptions out of the probe path. Probe ok becomes probeHandle > 0 so any AddItem that returns 0 (MXAccess's "could not create") counts as a failed probe, matching v1 behavior. Low finding #2 (subscription replay silently swallowed per-tag failures): after a reconnect, the replay loop iterates the pre-reconnect subscription snapshot and calls SubscribeOnPumpAsync for each; previously those failures went into a bare catch { /* skip */ } so an operator had no signal when specific tags failed to re-subscribe — the first indication downstream was a quality drop on OPC UA clients. Fix: new SubscriptionReplayFailedEventArgs (TagReference + Exception) + SubscriptionReplayFailed event on MxAccessClient that fires once per tag that fails to re-subscribe, Log.Warning per failure with the reconnect counter + tag reference, and a summary log line at the end of the replay loop ("{failed} of {total} failed" or "{total} re-subscribed cleanly"). Serilog using + ILogger Log = Serilog.Log.ForContext<MxAccessClient>() added. 
Tests — MxAccessClientMonitorLoopTests (new file, 2 cases): Heartbeat_probe_calls_RemoveItem_for_every_AddItem constructs a CountingProxy IMxProxy that tracks AddItem/RemoveItem pair counts scoped to the "$Heartbeat" address, runs the client with MonitorInterval=150ms + StaleThreshold=50ms for 700ms, asserts HeartbeatAddCount > 1, HeartbeatAddCount == HeartbeatRemoveCount, OutstandingHeartbeatHandles == 0 after dispose; SubscriptionReplayFailed_fires_for_each_tag_that_fails_to_replay uses a ReplayFailingProxy that throws on the next $Heartbeat probe (to trigger the reconnect path) and throws on the replay-time AddItem for specified tag names ("BadTag.A", "BadTag.B"), subscribes GoodTag.X + BadTag.A + BadTag.B before triggering probe failure, collects SubscriptionReplayFailed args into a ConcurrentBag, asserts exactly 2 events fired and both bad tags are represented — GoodTag.X replays cleanly so it does not fire. Host.Tests csproj gains a Reference to lib/ArchestrA.MxAccess.dll because IMxProxy's MxDataChangeHandler delegate signature mentions MXSTATUS_PROXY and the compiler resolves all delegate parameter types when a test class implements the interface, even if the test code never names the type. No regressions: full Galaxy.Host.Tests Unit suite 26 pass / 0 fail (2 new monitor-loop tests + 9 PR5 historian + 15 pre-existing PostMortemMmf/RecyclePolicy/StaPump/MemoryWatchdog/EndToEndIpc/Handshake). Galaxy.Host builds clean (0 errors, 0 warnings) — the new Serilog.Log.ForContext usage picks up the existing Serilog package ref that PR 4 pulled in for the monitor-loop infrastructure. Both findings were flagged as non-blocking for PR 4 merge and are now resolved alongside whichever merge order the reviewer picks; this PR branches off phase-2-pr4-findings so it can rebase cleanly if PR 4 lands first or be re-based onto master after PR 4 merges.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 02:06:15 -04:00
13 changed files with 583 additions and 20 deletions

View File

@@ -19,10 +19,17 @@ namespace ZB.MOM.WW.OtOpcUa.Core.Abstractions;
/// <param name="ArrayDim">Declared array length when <see cref="IsArray"/> is true; null otherwise.</param>
/// <param name="SecurityClass">Write-authorization tier for this attribute.</param>
/// <param name="IsHistorized">True when this attribute is expected to feed historian / HistoryRead.</param>
/// <param name="IsAlarm">
/// True when this attribute represents an alarm condition (Galaxy: has an
/// <c>AlarmExtension</c> primitive). The generic node-manager enriches the variable with an
/// OPC UA <c>AlarmConditionState</c> when true. Defaults to false so existing non-Galaxy
/// drivers aren't forced to flow a flag they don't produce.
/// </param>
public sealed record DriverAttributeInfo(
string FullName,
DriverDataType DriverDataType,
bool IsArray,
uint? ArrayDim,
SecurityClassification SecurityClass,
- bool IsHistorized);
+ bool IsHistorized,
+ bool IsAlarm = false);
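
The optional trailing parameter is what lets existing call sites keep compiling. A minimal sketch of that behavior, assuming a hypothetical three-parameter `AttributeInfo` record as a stand-in for `DriverAttributeInfo` (the real record has more parameters, and the tag names are made up):

```csharp
using System;

// Pre-existing call shape: no IsAlarm argument, so the default (false) applies.
var legacy = new AttributeInfo("Pump1.Speed", IsHistorized: false);

// Galaxy-style call shape: the flag is passed explicitly.
var alarm = new AttributeInfo("Tank1.Level.HiHi", IsHistorized: true, IsAlarm: true);

Console.WriteLine(legacy.IsAlarm); // False
Console.WriteLine(alarm.IsAlarm);  // True

// Hypothetical stand-in for DriverAttributeInfo (illustration only).
sealed record AttributeInfo(string FullName, bool IsHistorized, bool IsAlarm = false);
```

Because the new parameter is last and has a default, callers that construct the record positionally with the old argument list are unaffected.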

View File

@@ -147,6 +147,7 @@ public sealed class DbBackedGalaxyBackend(GalaxyRepository repository) : IGalaxy
ArrayDim = row.ArrayDimension is int d and > 0 ? (uint)d : null,
SecurityClassification = row.SecurityClassification,
IsHistorized = row.IsHistorized,
IsAlarm = row.IsAlarm,
};
/// <summary>

View File

@@ -0,0 +1,46 @@
namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend.Historian;
/// <summary>
/// Maps a raw OPC DA quality byte (as returned by Wonderware Historian's <c>OpcQuality</c>)
/// to an OPC UA <c>StatusCode</c> uint. Preserves specific codes (BadNotConnected,
/// UncertainSubNormal, etc.) instead of collapsing to Good/Uncertain/Bad categories.
/// Mirrors v1 <c>QualityMapper.MapToOpcUaStatusCode</c> without pulling in OPC UA types —
/// the returned value is the 32-bit OPC UA <c>StatusCode</c> wire encoding that the Proxy
/// surfaces directly as <c>DataValueSnapshot.StatusCode</c>.
/// </summary>
public static class HistorianQualityMapper
{
/// <summary>
/// Map an 8-bit OPC DA quality byte to the corresponding OPC UA StatusCode. The byte is
/// laid out as QQSSSSLL: the top two bits decide the category (Good &gt;= 192,
/// Uncertain 64-191, Bad 0-63) and the four substatus bits (bits 2-5) select the
/// specific code.
/// </summary>
public static uint Map(byte q) => q switch
{
// Good family (192+)
192 => 0x00000000u, // Good
216 => 0x00D80000u, // Good_LocalOverride (value kept from v1; the OPC UA specification defines GoodLocalOverride as 0x00960000)
// Uncertain family (64-191)
64 => 0x40000000u, // Uncertain
68 => 0x40900000u, // Uncertain_LastUsableValue
80 => 0x40930000u, // Uncertain_SensorNotAccurate
84 => 0x40940000u, // Uncertain_EngineeringUnitsExceeded
88 => 0x40950000u, // Uncertain_SubNormal
// Bad family (0-63)
0 => 0x80000000u, // Bad
4 => 0x80890000u, // Bad_ConfigurationError
8 => 0x808A0000u, // Bad_NotConnected
12 => 0x808B0000u, // Bad_DeviceFailure
16 => 0x808C0000u, // Bad_SensorFailure
20 => 0x80050000u, // Bad_CommunicationError
24 => 0x808D0000u, // Bad_OutOfService
32 => 0x80320000u, // Bad_WaitingForInitialData
// Unknown code — fall back to the category so callers still get a sensible bucket.
_ when q >= 192 => 0x00000000u,
_ when q >= 64 => 0x40000000u,
_ => 0x80000000u,
};
}
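
The literal keys in the switch above follow from the OPC DA quality byte layout `QQSSSSLL` (two quality bits, four substatus bits, two limit bits). A small sketch of that decomposition, using a hypothetical `Decode` helper that is not part of this PR:

```csharp
using System;

// Illustrative helper (assumption, not in the PR): split a quality byte laid out
// as QQSSSSLL into its family, substatus, and limit bit fields.
static (int Family, int Substatus, int Limit) Decode(byte q) =>
    ((q >> 6) & 0x3, (q >> 2) & 0xF, q & 0x3);

foreach (byte q in new byte[] { 8, 88, 216 })
{
    var (family, substatus, limit) = Decode(q);
    Console.WriteLine($"{q}: family={family} substatus={substatus} limit={limit}");
}
// 8:   family=0 (Bad)       substatus=2 limit=0
// 88:  family=1 (Uncertain) substatus=6 limit=0
// 216: family=3 (Good)      substatus=6 limit=0
```

Values with non-zero limit bits (for example 193) have no exact entry in the switch, so under this reading they would take the family fallback arm.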

View File

@@ -4,6 +4,7 @@ using System.Linq;
using System.Threading;
using System.Threading.Tasks;
using ArchestrA.MxAccess;
using Serilog;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Sta;
namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend.MxAccess;
@@ -18,6 +19,8 @@ namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend.MxAccess;
/// </summary>
public sealed class MxAccessClient : IDisposable
{
private static readonly ILogger Log = Serilog.Log.ForContext<MxAccessClient>();
private readonly StaPump _pump;
private readonly IMxProxy _proxy;
private readonly string _clientName;
@@ -40,6 +43,16 @@ public sealed class MxAccessClient : IDisposable
/// <summary>Fires whenever the connection transitions Connected ↔ Disconnected.</summary>
public event EventHandler<bool>? ConnectionStateChanged;
/// <summary>
/// Fires once per failed subscription replay after a reconnect. Carries the tag reference
/// and the exception so the backend can propagate the degradation signal (e.g. mark the
/// subscription bad on the Proxy side rather than silently losing its callback). Added for
/// PR 6 low finding #2 — the replay loop previously ate per-tag failures silently and an
/// operator would only find out that a specific subscription stopped updating through a
/// data-quality complaint from downstream.
/// </summary>
public event EventHandler<SubscriptionReplayFailedEventArgs>? SubscriptionReplayFailed;
public MxAccessClient(StaPump pump, IMxProxy proxy, string clientName, MxAccessClientOptions? options = null)
{
_pump = pump;
@@ -54,6 +67,13 @@ public sealed class MxAccessClient : IDisposable
public int SubscriptionCount => _subscriptions.Count;
public int ReconnectCount => _reconnectCount;
/// <summary>
/// Wonderware client identity used when registering with the LMXProxyServer. Surfaced so
/// <see cref="Backend.MxAccessGalaxyBackend"/> can tag its <c>OnHostStatusChanged</c> IPC
/// pushes with a stable gateway name per PR 8.
/// </summary>
public string ClientName => _clientName;
/// <summary>Connects on the STA thread. Idempotent. Starts the reconnect monitor on first call.</summary>
public async Task<int> ConnectAsync()
{
@@ -117,16 +137,29 @@ public sealed class MxAccessClient : IDisposable
if (idle <= _options.StaleThreshold) continue;
// Probe: try a no-op COM call. If the proxy is dead, the call will throw — that's
- // our reconnect signal.
+ // our reconnect signal. PR 6 low finding #1: AddItem allocates an MXAccess item
+ // handle; we must RemoveItem it on the same pump turn or the long-running monitor
+ // leaks one handle per probe cycle (one per MonitorInterval, indefinitely).
bool probeOk;
try
{
probeOk = await _pump.InvokeAsync(() =>
{
// AddItem on the connection handle is cheap and round-trips through COM.
// We use a sentinel "$Heartbeat" reference; if it fails the connection is gone.
- try { _proxy.AddItem(_connectionHandle, "$Heartbeat"); return true; }
+ int probeHandle = 0;
+ try
+ {
+ probeHandle = _proxy.AddItem(_connectionHandle, "$Heartbeat");
+ return probeHandle > 0;
+ }
catch { return false; }
+ finally
+ {
+ if (probeHandle > 0)
+ {
+ try { _proxy.RemoveItem(_connectionHandle, probeHandle); }
+ catch { /* proxy is dying; best-effort cleanup */ }
+ }
+ }
});
}
catch { probeOk = false; }
@@ -155,16 +188,33 @@ public sealed class MxAccessClient : IDisposable
_reconnectCount++;
ConnectionStateChanged?.Invoke(this, true);
- // Replay every subscription that was active before the disconnect.
+ // Replay every subscription that was active before the disconnect. PR 6 low
+ // finding #2: surface per-tag failures — log them and raise
+ // SubscriptionReplayFailed so the backend can propagate the degraded state
+ // (previously swallowed silently; downstream quality dropped without a signal).
var snapshot = _addressToHandle.Keys.ToArray();
_addressToHandle.Clear();
_handleToAddress.Clear();
var failed = 0;
foreach (var fullRef in snapshot)
{
try { await SubscribeOnPumpAsync(fullRef); }
- catch { /* skip — operator can re-subscribe */ }
+ catch (Exception subEx)
+ {
+ failed++;
+ Log.Warning(subEx,
+ "MXAccess subscription replay failed for {TagReference} after reconnect #{Reconnect}",
+ fullRef, _reconnectCount);
+ SubscriptionReplayFailed?.Invoke(this,
+ new SubscriptionReplayFailedEventArgs(fullRef, subEx));
+ }
}
if (failed > 0)
Log.Warning("Subscription replay completed — {Failed} of {Total} failed", failed, snapshot.Length);
else
Log.Information("Subscription replay completed — {Total} re-subscribed cleanly", snapshot.Length);
_lastObservedActivityUtc = DateTime.UtcNow;
}
catch

View File

@@ -0,0 +1,20 @@
using System;
namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend.MxAccess;
/// <summary>
/// Fired by <see cref="MxAccessClient.SubscriptionReplayFailed"/> when a previously-active
/// subscription fails to be restored after a reconnect. The backend should treat the tag as
/// unhealthy until the next successful resubscribe.
/// </summary>
public sealed class SubscriptionReplayFailedEventArgs : EventArgs
{
public SubscriptionReplayFailedEventArgs(string tagReference, Exception exception)
{
TagReference = tagReference;
Exception = exception;
}
public string TagReference { get; }
public Exception Exception { get; }
}
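
The doc comment above says the backend should treat a tag as unhealthy until the next successful resubscribe. One way a consumer might do that is sketched below. Everything here is an assumption for illustration: `ReplayFailedArgs` and `FakeClient` are stand-ins that only mirror the shape of the types in this diff, and `UnhealthyTagTracker` is a hypothetical consumer that does not exist in the PR.

```csharp
using System;
using System.Collections.Concurrent;

var client = new FakeClient();
var tracker = new UnhealthyTagTracker();
client.SubscriptionReplayFailed += tracker.OnReplayFailed;

// Simulate one tag failing to re-subscribe after a reconnect.
client.SimulateReplayFailure("BadTag.A", new InvalidOperationException("AddItem failed"));
Console.WriteLine(tracker.IsUnhealthy("BadTag.A")); // True

// A later successful resubscribe clears the unhealthy mark.
tracker.OnResubscribed("BadTag.A");
Console.WriteLine(tracker.IsUnhealthy("BadTag.A")); // False

// Stand-in with the same shape as SubscriptionReplayFailedEventArgs.
sealed class ReplayFailedArgs : EventArgs
{
    public ReplayFailedArgs(string tagReference, Exception exception)
    {
        TagReference = tagReference;
        Exception = exception;
    }
    public string TagReference { get; }
    public Exception Exception { get; }
}

// Stand-in publisher; the real event lives on MxAccessClient.
sealed class FakeClient
{
    public event EventHandler<ReplayFailedArgs>? SubscriptionReplayFailed;

    public void SimulateReplayFailure(string tag, Exception ex) =>
        SubscriptionReplayFailed?.Invoke(this, new ReplayFailedArgs(tag, ex));
}

// Hypothetical consumer: remembers which tags failed replay and why.
sealed class UnhealthyTagTracker
{
    private readonly ConcurrentDictionary<string, string> _unhealthy =
        new(StringComparer.OrdinalIgnoreCase);

    public void OnReplayFailed(object? sender, ReplayFailedArgs e) =>
        _unhealthy[e.TagReference] = e.Exception.Message;

    public void OnResubscribed(string tagReference) =>
        _unhealthy.TryRemove(tagReference, out _);

    public bool IsUnhealthy(string tagReference) =>
        _unhealthy.ContainsKey(tagReference);
}
```

The case-insensitive comparer is a guess modeled on the `OrdinalIgnoreCase` dictionary used for `_refToSubs` in `MxAccessGalaxyBackend`.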

View File

@@ -34,16 +34,34 @@ public sealed class MxAccessGalaxyBackend : IGalaxyBackend, IDisposable
_refToSubs = new(System.StringComparer.OrdinalIgnoreCase);
public event System.EventHandler<OnDataChangeNotification>? OnDataChange;
- #pragma warning disable CS0067 // event not yet raised — alarm + host-status wire-up in PR #4 follow-up
+ #pragma warning disable CS0067 // alarm wire-up deferred to PR 9
public event System.EventHandler<GalaxyAlarmEvent>? OnAlarmEvent;
- public event System.EventHandler<HostConnectivityStatus>? OnHostStatusChanged;
#pragma warning restore CS0067
+ public event System.EventHandler<HostConnectivityStatus>? OnHostStatusChanged;
private readonly System.EventHandler<bool> _onConnectionStateChanged;
public MxAccessGalaxyBackend(GalaxyRepository repository, MxAccessClient mx, IHistorianDataSource? historian = null)
{
_repository = repository;
_mx = mx;
_historian = historian;
// PR 8: gateway-level host-status push. When the MXAccess COM proxy transitions
// connected↔disconnected, raise OnHostStatusChanged with a synthetic host entry named
// after the Wonderware client identity so the Admin UI surfaces top-level transport
// health even before per-platform/per-engine probing lands (deferred to a later PR that
// ports v1's GalaxyRuntimeProbeManager with ScanState subscriptions).
_onConnectionStateChanged = (_, connected) =>
{
OnHostStatusChanged?.Invoke(this, new HostConnectivityStatus
{
HostName = _mx.ClientName,
RuntimeStatus = connected ? "Running" : "Stopped",
LastObservedUtcUnixMs = DateTimeOffset.UtcNow.ToUnixTimeMilliseconds(),
});
};
_mx.ConnectionStateChanged += _onConnectionStateChanged;
}
public async Task<OpenSessionResponse> OpenSessionAsync(OpenSessionRequest req, CancellationToken ct)
@@ -309,7 +327,11 @@ public sealed class MxAccessGalaxyBackend : IGalaxyBackend, IDisposable
public Task<RecycleStatusResponse> RecycleAsync(RecycleHostRequest req, CancellationToken ct)
=> Task.FromResult(new RecycleStatusResponse { Accepted = true, GraceSeconds = 15 });
- public void Dispose() => _historian?.Dispose();
+ public void Dispose()
+ {
+ _mx.ConnectionStateChanged -= _onConnectionStateChanged;
+ _historian?.Dispose();
+ }
private static GalaxyDataValue ToWire(string reference, Vtq vtq) => new()
{
@@ -333,19 +355,11 @@ public sealed class MxAccessGalaxyBackend : IGalaxyBackend, IDisposable
TagReference = reference,
ValueBytes = sample.Value is null ? null : MessagePackSerializer.Serialize(sample.Value),
ValueMessagePackType = 0,
- StatusCode = MapHistorianQualityToOpcUa(sample.Quality),
+ StatusCode = HistorianQualityMapper.Map(sample.Quality),
SourceTimestampUtcUnixMs = new DateTimeOffset(sample.TimestampUtc, TimeSpan.Zero).ToUnixTimeMilliseconds(),
ServerTimestampUtcUnixMs = DateTimeOffset.UtcNow.ToUnixTimeMilliseconds(),
};
- private static uint MapHistorianQualityToOpcUa(byte q)
- {
- // Category-only mapping — mirrors QualityMapper.MapToOpcUaStatusCode for the common ranges.
- // The Proxy may refine this when it decodes the wire frame.
- if (q >= 192) return 0x00000000u; // Good
- if (q >= 64) return 0x40000000u; // Uncertain
- return 0x80000000u; // Bad
- }
/// <summary>
/// Maps a <see cref="HistorianAggregateSample"/> (one aggregate bucket) to the IPC wire
@@ -370,6 +384,7 @@ public sealed class MxAccessGalaxyBackend : IGalaxyBackend, IDisposable
ArrayDim = row.ArrayDimension is int d and > 0 ? (uint)d : null,
SecurityClassification = row.SecurityClassification,
IsHistorized = row.IsHistorized,
IsAlarm = row.IsAlarm,
};
private static string MapCategory(int categoryId) => categoryId switch

View File

@@ -123,7 +123,8 @@ public sealed class GalaxyProxyDriver(GalaxyProxyOptions options)
IsArray: attr.IsArray,
ArrayDim: attr.ArrayDim,
SecurityClass: MapSecurity(attr.SecurityClassification),
- IsHistorized: attr.IsHistorized));
+ IsHistorized: attr.IsHistorized,
+ IsAlarm: attr.IsAlarm));
}
}
}

View File

@@ -30,6 +30,15 @@ public sealed class GalaxyAttributeInfo
[Key(3)] public uint? ArrayDim { get; set; }
[Key(4)] public int SecurityClassification { get; set; }
[Key(5)] public bool IsHistorized { get; set; }
/// <summary>
/// True when the attribute has an AlarmExtension primitive in the Galaxy repository
/// (<c>primitive_definition.primitive_name = 'AlarmExtension'</c>). The generic
/// node-manager uses this to enrich the variable's OPC UA node with an
/// <c>AlarmConditionState</c> during address-space build. Added in PR 9 as the
/// discovery-side foundation for the alarm event wire-up that follows in PR 10+.
/// </summary>
[Key(6)] public bool IsAlarm { get; set; }
}
[MessagePackObject]

View File

@@ -0,0 +1,84 @@
using System;
using MessagePack;
using Shouldly;
using Xunit;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Shared.Contracts;
namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Tests;
[Trait("Category", "Unit")]
public sealed class AlarmDiscoveryTests
{
/// <summary>
/// PR 9 — IsAlarm must survive the MessagePack round-trip at Key=6 position.
/// Regression guard: any reorder of keys in GalaxyAttributeInfo would silently corrupt
/// the flag in the wire payload since MessagePack encodes by key number, not field name.
/// </summary>
[Fact]
public void GalaxyAttributeInfo_IsAlarm_round_trips_true_through_MessagePack()
{
var input = new GalaxyAttributeInfo
{
AttributeName = "TankLevel",
MxDataType = 2,
IsArray = false,
ArrayDim = null,
SecurityClassification = 1,
IsHistorized = true,
IsAlarm = true,
};
var bytes = MessagePackSerializer.Serialize(input);
var decoded = MessagePackSerializer.Deserialize<GalaxyAttributeInfo>(bytes);
decoded.IsAlarm.ShouldBeTrue();
decoded.IsHistorized.ShouldBeTrue();
decoded.AttributeName.ShouldBe("TankLevel");
}
[Fact]
public void GalaxyAttributeInfo_IsAlarm_round_trips_false_through_MessagePack()
{
var input = new GalaxyAttributeInfo { AttributeName = "ColorRgb", IsAlarm = false };
var bytes = MessagePackSerializer.Serialize(input);
var decoded = MessagePackSerializer.Deserialize<GalaxyAttributeInfo>(bytes);
decoded.IsAlarm.ShouldBeFalse();
}
/// <summary>
/// Wire-compat guard: payloads serialized before PR 9 (which omit Key=6) must still
/// deserialize cleanly — MessagePack treats missing keys as default. This lets a newer
/// Proxy talk to an older Host during a rolling upgrade without a crash.
/// </summary>
[Fact]
public void Pre_PR9_payload_without_IsAlarm_key_deserializes_with_default_false()
{
// Build a 6-field payload (keys 0..5) matching the pre-PR9 shape by serializing a
// stand-in class with the same key layout but no Key=6.
var pre = new PrePR9Shape
{
AttributeName = "Legacy",
MxDataType = 1,
IsArray = false,
ArrayDim = null,
SecurityClassification = 0,
IsHistorized = false,
};
var bytes = MessagePackSerializer.Serialize(pre);
var decoded = MessagePackSerializer.Deserialize<GalaxyAttributeInfo>(bytes);
decoded.AttributeName.ShouldBe("Legacy");
decoded.IsAlarm.ShouldBeFalse();
}
[MessagePackObject]
public sealed class PrePR9Shape
{
[Key(0)] public string AttributeName { get; set; } = string.Empty;
[Key(1)] public int MxDataType { get; set; }
[Key(2)] public bool IsArray { get; set; }
[Key(3)] public uint? ArrayDim { get; set; }
[Key(4)] public int SecurityClassification { get; set; }
[Key(5)] public bool IsHistorized { get; set; }
}
}

View File

@@ -0,0 +1,61 @@
using Shouldly;
using Xunit;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend.Historian;
namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Tests;
[Trait("Category", "Unit")]
public sealed class HistorianQualityMapperTests
{
/// <summary>
/// Rich mapping preserves specific OPC DA subcodes through the historian ToWire path.
/// Before PR 12 the category-only fallback collapsed e.g. BadNotConnected(8) to
/// Bad(0x80000000) so downstream OPC UA clients could not distinguish transport issues
/// from sensor issues. After PR 12 every known subcode round-trips to its canonical
/// uint32 StatusCode and Proxy translation stays byte-for-byte with v1 QualityMapper.
/// </summary>
[Theory]
[InlineData((byte)192, 0x00000000u)] // Good
[InlineData((byte)216, 0x00D80000u)] // Good_LocalOverride
[InlineData((byte)64, 0x40000000u)] // Uncertain
[InlineData((byte)68, 0x40900000u)] // Uncertain_LastUsableValue
[InlineData((byte)80, 0x40930000u)] // Uncertain_SensorNotAccurate
[InlineData((byte)84, 0x40940000u)] // Uncertain_EngineeringUnitsExceeded
[InlineData((byte)88, 0x40950000u)] // Uncertain_SubNormal
[InlineData((byte)0, 0x80000000u)] // Bad
[InlineData((byte)4, 0x80890000u)] // Bad_ConfigurationError
[InlineData((byte)8, 0x808A0000u)] // Bad_NotConnected
[InlineData((byte)12, 0x808B0000u)] // Bad_DeviceFailure
[InlineData((byte)16, 0x808C0000u)] // Bad_SensorFailure
[InlineData((byte)20, 0x80050000u)] // Bad_CommunicationError
[InlineData((byte)24, 0x808D0000u)] // Bad_OutOfService
[InlineData((byte)32, 0x80320000u)] // Bad_WaitingForInitialData
public void Maps_specific_OPC_DA_codes_to_canonical_StatusCode(byte quality, uint expected)
{
HistorianQualityMapper.Map(quality).ShouldBe(expected);
}
[Theory]
[InlineData((byte)200)] // Good — unknown subcode in Good family
[InlineData((byte)255)] // Good — unknown
public void Unknown_good_family_codes_fall_back_to_plain_Good(byte q)
{
HistorianQualityMapper.Map(q).ShouldBe(0x00000000u);
}
[Theory]
[InlineData((byte)100)] // Uncertain — unknown subcode
[InlineData((byte)150)] // Uncertain — unknown
public void Unknown_uncertain_family_codes_fall_back_to_plain_Uncertain(byte q)
{
HistorianQualityMapper.Map(q).ShouldBe(0x40000000u);
}
[Theory]
[InlineData((byte)1)] // Bad — unknown subcode
[InlineData((byte)50)] // Bad — unknown
public void Unknown_bad_family_codes_fall_back_to_plain_Bad(byte q)
{
HistorianQualityMapper.Map(q).ShouldBe(0x80000000u);
}
}
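
The theory rows and fallback tests above fully describe the mapper's observable behavior, so a mapper that satisfies them can be sketched directly from the table. The sketch below is only an illustration: `QualityMapperSketch` is a hypothetical name, the family boundaries (192+ Good, 64-191 Uncertain, 0-63 Bad) are taken from the commit message, and the real `HistorianQualityMapper` may be structured differently.

```csharp
// Hypothetical stand-in for HistorianQualityMapper, reconstructed only from the test rows.
public static class QualityMapperSketch
{
    public static uint Map(byte quality)
    {
        switch (quality)
        {
            // Good family
            case 192: return 0x00000000u; // Good
            case 216: return 0x00D80000u; // Good_LocalOverride
            // Uncertain family
            case 64: return 0x40000000u;  // Uncertain
            case 68: return 0x40900000u;  // Uncertain_LastUsableValue
            case 80: return 0x40930000u;  // Uncertain_SensorNotAccurate
            case 84: return 0x40940000u;  // Uncertain_EngineeringUnitsExceeded
            case 88: return 0x40950000u;  // Uncertain_SubNormal
            // Bad family
            case 0: return 0x80000000u;   // Bad
            case 4: return 0x80890000u;   // Bad_ConfigurationError
            case 8: return 0x808A0000u;   // Bad_NotConnected
            case 12: return 0x808B0000u;  // Bad_DeviceFailure
            case 16: return 0x808C0000u;  // Bad_SensorFailure
            case 20: return 0x80050000u;  // Bad_CommunicationError
            case 24: return 0x808D0000u;  // Bad_OutOfService
            case 32: return 0x80320000u;  // Bad_WaitingForInitialData
        }

        // Unknown subcode: collapse to the family default by value range.
        if (quality >= 192) return 0x00000000u; // Good
        if (quality >= 64) return 0x40000000u;  // Uncertain
        return 0x80000000u;                     // Bad
    }
}
```

Under this sketch, `QualityMapperSketch.Map(8)` returns `0x808A0000u`, while an unmapped Good-family code such as `200` falls back to `0x00000000u`, matching the fallback tests.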

@@ -0,0 +1,91 @@
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;
using ArchestrA.MxAccess;
using Shouldly;
using Xunit;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend.Galaxy;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend.MxAccess;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Sta;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Shared.Contracts;
namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Tests;
[Trait("Category", "Unit")]
public sealed class HostStatusPushTests
{
/// <summary>
/// PR 8 — when MxAccessClient.ConnectionStateChanged fires false→true→false,
/// MxAccessGalaxyBackend raises OnHostStatusChanged once per transition with
/// HostName=ClientName, RuntimeStatus="Running"/"Stopped", and a timestamp.
/// This is the gateway-level signal; per-platform ScanState probes are deferred.
/// </summary>
[Fact]
public async Task ConnectionStateChanged_raises_OnHostStatusChanged_with_gateway_name()
{
using var pump = new StaPump("Test.Sta");
await pump.WaitForStartedAsync();
var proxy = new FakeProxy();
var mx = new MxAccessClient(pump, proxy, "GatewayClient", new MxAccessClientOptions { AutoReconnect = false });
using var backend = new MxAccessGalaxyBackend(
new GalaxyRepository(new GalaxyRepositoryOptions { ConnectionString = "Server=.;Database=ZB;Integrated Security=True;" }),
mx,
historian: null);
var notifications = new ConcurrentQueue<HostConnectivityStatus>();
backend.OnHostStatusChanged += (_, s) => notifications.Enqueue(s);
await mx.ConnectAsync();
await mx.DisconnectAsync();
notifications.Count.ShouldBe(2);
notifications.TryDequeue(out var first).ShouldBeTrue();
first!.HostName.ShouldBe("GatewayClient");
first.RuntimeStatus.ShouldBe("Running");
first.LastObservedUtcUnixMs.ShouldBeGreaterThan(0);
notifications.TryDequeue(out var second).ShouldBeTrue();
second!.HostName.ShouldBe("GatewayClient");
second.RuntimeStatus.ShouldBe("Stopped");
}
[Fact]
public async Task Dispose_unsubscribes_so_post_dispose_state_changes_do_not_fire_events()
{
using var pump = new StaPump("Test.Sta");
await pump.WaitForStartedAsync();
var proxy = new FakeProxy();
var mx = new MxAccessClient(pump, proxy, "GatewayClient", new MxAccessClientOptions { AutoReconnect = false });
var backend = new MxAccessGalaxyBackend(
new GalaxyRepository(new GalaxyRepositoryOptions { ConnectionString = "Server=.;Database=ZB;Integrated Security=True;" }),
mx,
historian: null);
var count = 0;
backend.OnHostStatusChanged += (_, _) => Interlocked.Increment(ref count);
await mx.ConnectAsync();
count.ShouldBe(1);
backend.Dispose();
await mx.DisconnectAsync();
count.ShouldBe(1); // no second notification after Dispose
}
private sealed class FakeProxy : IMxProxy
{
private int _next = 1;
public int Register(string _) => 42;
public void Unregister(int _) { }
public int AddItem(int _, string __) => Interlocked.Increment(ref _next);
public void RemoveItem(int _, int __) { }
public void AdviseSupervisory(int _, int __) { }
public void UnAdviseSupervisory(int _, int __) { }
public void Write(int _, int __, object ___, int ____) { }
public event MxDataChangeHandler? OnDataChange { add { } remove { } }
public event MxWriteCompleteHandler? OnWriteComplete { add { } remove { } }
}
}
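
The two tests above describe a small contract: each connection-state transition becomes one host-status notification, and disposing the backend stops further notifications. A minimal sketch of that wiring follows, assuming the client exposes a boolean state-change event. All type names ending in `Sketch` are hypothetical, and the real `MxAccessGalaxyBackend` and `HostConnectivityStatus` are not reproduced here.

```csharp
#nullable enable
using System;

// Hypothetical stand-in for the client's connection-state event.
public sealed class ClientSketch
{
    public event EventHandler<bool>? ConnectionStateChanged;
    public void Raise(bool connected) => ConnectionStateChanged?.Invoke(this, connected);
}

// Hypothetical stand-in carrying the three fields the test asserts on.
public sealed class HostStatusSketch
{
    public string HostName { get; set; } = "";
    public string RuntimeStatus { get; set; } = "";
    public long LastObservedUtcUnixMs { get; set; }
}

public sealed class BackendSketch : IDisposable
{
    private readonly ClientSketch _client;
    private readonly string _clientName;

    public event EventHandler<HostStatusSketch>? OnHostStatusChanged;

    public BackendSketch(ClientSketch client, string clientName)
    {
        _client = client;
        _clientName = clientName;
        _client.ConnectionStateChanged += HandleConnectionStateChanged;
    }

    // Translate each transition into exactly one gateway-level status notification.
    private void HandleConnectionStateChanged(object? sender, bool connected)
    {
        OnHostStatusChanged?.Invoke(this, new HostStatusSketch
        {
            HostName = _clientName,
            RuntimeStatus = connected ? "Running" : "Stopped",
            LastObservedUtcUnixMs = DateTimeOffset.UtcNow.ToUnixTimeMilliseconds(),
        });
    }

    // Unsubscribing here is what keeps post-dispose transitions from firing events.
    public void Dispose() => _client.ConnectionStateChanged -= HandleConnectionStateChanged;
}
```

Subscribing with a named method rather than a lambda is what makes the `-=` in `Dispose` effective; an inline lambda would have to be stored in a field to be removable.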

@@ -0,0 +1,173 @@
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;
using ArchestrA.MxAccess;
using Shouldly;
using Xunit;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend.MxAccess;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Sta;
namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Tests;
[Trait("Category", "Unit")]
public sealed class MxAccessClientMonitorLoopTests
{
/// <summary>
/// PR 6 low finding #1 — every $Heartbeat probe must RemoveItem the item handle it
/// allocated. Without that, the monitor leaks one handle per MonitorInterval tick,
/// which over a 24h uptime adds up to thousands of leaked MXAccess handles and can
/// eventually exhaust the runtime proxy's handle table.
/// </summary>
[Fact]
public async Task Heartbeat_probe_calls_RemoveItem_for_every_AddItem()
{
using var pump = new StaPump("Monitor.Sta");
await pump.WaitForStartedAsync();
var proxy = new CountingProxy();
var client = new MxAccessClient(pump, proxy, "probe-test", new MxAccessClientOptions
{
AutoReconnect = true,
MonitorInterval = TimeSpan.FromMilliseconds(150),
StaleThreshold = TimeSpan.FromMilliseconds(50),
});
await client.ConnectAsync();
// Wait past StaleThreshold, then let several monitor cycles fire.
await Task.Delay(700);
client.Dispose();
// One Heartbeat probe fires per monitor tick once the connection looks stale.
proxy.HeartbeatAddCount.ShouldBeGreaterThan(1);
// Every AddItem("$Heartbeat") must be matched by a RemoveItem on the same handle.
proxy.HeartbeatAddCount.ShouldBe(proxy.HeartbeatRemoveCount);
proxy.OutstandingHeartbeatHandles.ShouldBe(0);
}
/// <summary>
/// PR 6 low finding #2 — after reconnect, per-subscription replay failures must raise
/// SubscriptionReplayFailed so the backend can propagate the degradation, not get
/// silently eaten.
/// </summary>
[Fact]
public async Task SubscriptionReplayFailed_fires_for_each_tag_that_fails_to_replay()
{
using var pump = new StaPump("Replay.Sta");
await pump.WaitForStartedAsync();
var proxy = new ReplayFailingProxy(failOnReplayForTags: new[] { "BadTag.A", "BadTag.B" });
var client = new MxAccessClient(pump, proxy, "replay-test", new MxAccessClientOptions
{
AutoReconnect = true,
MonitorInterval = TimeSpan.FromMilliseconds(120),
StaleThreshold = TimeSpan.FromMilliseconds(50),
});
var failures = new ConcurrentBag<SubscriptionReplayFailedEventArgs>();
client.SubscriptionReplayFailed += (_, e) => failures.Add(e);
await client.ConnectAsync();
await client.SubscribeAsync("GoodTag.X", (_, _) => { });
await client.SubscribeAsync("BadTag.A", (_, _) => { });
await client.SubscribeAsync("BadTag.B", (_, _) => { });
proxy.TriggerProbeFailureOnNextCall();
// Wait for the monitor loop to probe → fail → reconnect → replay.
await Task.Delay(800);
client.Dispose();
failures.Count.ShouldBe(2);
var names = new HashSet<string>();
foreach (var f in failures) names.Add(f.TagReference);
names.ShouldContain("BadTag.A");
names.ShouldContain("BadTag.B");
}
// ----- test doubles -----
private sealed class CountingProxy : IMxProxy
{
private int _next = 1;
private readonly ConcurrentDictionary<int, string> _live = new();
public int HeartbeatAddCount;
public int HeartbeatRemoveCount;
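// Count only live $Heartbeat handles so subscribed tags cannot skew the leak check.
public int OutstandingHeartbeatHandles
{
get
{
var n = 0;
foreach (var kv in _live) if (kv.Value == "$Heartbeat") n++;
return n;
}
}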
public event MxDataChangeHandler? OnDataChange { add { } remove { } }
public event MxWriteCompleteHandler? OnWriteComplete { add { } remove { } }
public int Register(string _) => 42;
public void Unregister(int _) { }
public int AddItem(int _, string address)
{
var h = Interlocked.Increment(ref _next);
_live[h] = address;
if (address == "$Heartbeat") Interlocked.Increment(ref HeartbeatAddCount);
return h;
}
public void RemoveItem(int _, int itemHandle)
{
if (_live.TryRemove(itemHandle, out var addr) && addr == "$Heartbeat")
Interlocked.Increment(ref HeartbeatRemoveCount);
}
public void AdviseSupervisory(int _, int __) { }
public void UnAdviseSupervisory(int _, int __) { }
public void Write(int _, int __, object ___, int ____) { }
}
/// <summary>
/// Mock that lets us exercise the reconnect + replay path. TriggerProbeFailureOnNextCall
/// flips a one-shot flag so the very next AddItem("$Heartbeat") throws — that drives the
/// monitor loop into the reconnect-with-replay branch. During the replay, AddItem for the
/// tags listed in failOnReplayForTags throws so SubscriptionReplayFailed should fire once
/// per failing tag.
/// </summary>
private sealed class ReplayFailingProxy : IMxProxy
{
private int _next = 1;
private readonly HashSet<string> _failOnReplay;
private int _probeFailOnce;
private readonly ConcurrentDictionary<string, bool> _replayedOnce = new(StringComparer.OrdinalIgnoreCase);
public ReplayFailingProxy(IEnumerable<string> failOnReplayForTags)
{
_failOnReplay = new HashSet<string>(failOnReplayForTags, StringComparer.OrdinalIgnoreCase);
}
public void TriggerProbeFailureOnNextCall() => Interlocked.Exchange(ref _probeFailOnce, 1);
public event MxDataChangeHandler? OnDataChange { add { } remove { } }
public event MxWriteCompleteHandler? OnWriteComplete { add { } remove { } }
public int Register(string _) => 42;
public void Unregister(int _) { }
public int AddItem(int _, string address)
{
if (address == "$Heartbeat" && Interlocked.Exchange(ref _probeFailOnce, 0) == 1)
throw new InvalidOperationException("simulated probe failure");
// Fail only on the *replay* AddItem for listed tags — not the initial subscribe.
if (_failOnReplay.Contains(address) && _replayedOnce.ContainsKey(address))
throw new InvalidOperationException($"simulated replay failure for {address}");
if (_failOnReplay.Contains(address)) _replayedOnce[address] = true;
return Interlocked.Increment(ref _next);
}
public void RemoveItem(int _, int __) { }
public void AdviseSupervisory(int _, int __) { }
public void UnAdviseSupervisory(int _, int __) { }
public void Write(int _, int __, object ___, int ____) { }
}
}
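
The two findings exercised above come down to two small patterns: release the probe handle in a `finally`, and report each failed replay rather than swallowing it. The sketch below illustrates both against a trimmed-down proxy surface. `IItemProxySketch`, `InMemoryProxySketch`, and `MonitorLoopSketch` are hypothetical names, and the real monitor loop in `MxAccessClient` is not shown in this diff, so it may differ.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

// Hypothetical, trimmed-down proxy surface; the real IMxProxy has more members.
public interface IItemProxySketch
{
    int AddItem(int serverHandle, string address);
    void RemoveItem(int serverHandle, int itemHandle);
}

// Minimal in-memory proxy used only to exercise the sketch.
public sealed class InMemoryProxySketch : IItemProxySketch
{
    private int _next;
    public HashSet<int> Live { get; } = new HashSet<int>();
    public HashSet<string> FailFor { get; } = new HashSet<string>();

    public int AddItem(int serverHandle, string address)
    {
        if (FailFor.Contains(address))
            throw new InvalidOperationException("simulated failure for " + address);
        var handle = ++_next;
        Live.Add(handle);
        return handle;
    }

    public void RemoveItem(int serverHandle, int itemHandle) => Live.Remove(itemHandle);
}

public static class MonitorLoopSketch
{
    // Returns true when the probe item could be added; always releases the handle it allocated.
    public static bool ProbeHeartbeat(IItemProxySketch proxy, int serverHandle)
    {
        int handle;
        try
        {
            handle = proxy.AddItem(serverHandle, "$Heartbeat");
        }
        catch (Exception)
        {
            return false; // nothing was allocated, so there is nothing to release
        }

        try
        {
            // A real probe would advise or read the item here.
            return true;
        }
        finally
        {
            proxy.RemoveItem(serverHandle, handle);
        }
    }

    // Re-adds every tag after a reconnect and reports each failure through the callback.
    public static List<string> Replay(IItemProxySketch proxy, int serverHandle,
        IEnumerable<string> tags, Action<string, Exception> onReplayFailed)
    {
        var restored = new List<string>();
        foreach (var tag in tags)
        {
            try
            {
                proxy.AddItem(serverHandle, tag);
                restored.Add(tag);
            }
            catch (Exception ex)
            {
                onReplayFailed(tag, ex); // one report per failing tag; the loop continues
            }
        }
        return restored;
    }
}
```

Catching per tag inside the loop keeps one bad subscription from aborting the replay of the rest, which is the behavior the `SubscriptionReplayFailed` test expects: two failures reported while the good tag is still restored.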

@@ -24,6 +24,11 @@
<ItemGroup>
<ProjectReference Include="..\..\src\ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host\ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.csproj"/>
<Reference Include="System.ServiceProcess"/>
<!-- IMxProxy's delegate signatures mention ArchestrA.MxAccess.MXSTATUS_PROXY, so tests
implementing the interface must resolve that type at compile time. -->
<Reference Include="ArchestrA.MxAccess">
<HintPath>..\..\lib\ArchestrA.MxAccess.dll</HintPath>
</Reference>
</ItemGroup>
<ItemGroup>