probe: aaAlarmManagedClient receives no alarm data — full consumer chain verified
Sixth probe iteration with every consumer-side knob exhausted:

- Subscriptions tried (all rc=0): \Galaxy!, \Galaxy!*, \\Galaxy!,
  \Galaxy!TestArea, \\.\Galaxy!.
- Read channels polled at 500ms: GetStatistics, GetHighPriAlarm,
  SFCreateSnapshot + SFGetStatistics.
- Filters: priority 0..32767, qtSummary + qtHistory both tried,
  asAlarmActiveNow.
- AlarmRecord pre-init to FILETIME epoch to dodge marshaler bug on
  default(DateTime).

Result: every read API returns empty for the entire 60s window even
with TestMachine_001.TestAlarm001 firing every 10s and aaObjectViewer
confirming InAlarm transitions.

The aaAlarmManagedClient.AlarmClient is not the receive surface AVEVA's
alarm pipeline routes to in this Galaxy configuration. The consumer
chain is verified working end-to-end: Initialize + Register + Subscribe
all succeed, GetProviders finds the provider, the WM 0xC275 heartbeat
fires at 1Hz to AVEVA's internal hwnd. There is simply no alarm data
flowing through this consumer surface.

Next investigation is not consumer-side: either find the SDK
aaObjectViewer's alarm panel uses, or query the historian event storage
directly. If alarms only flow via the historian path on this customer's
Galaxy, the worker's PR A.5 architecture is a dead-end and A.2 needs a
different transport.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@@ -258,16 +258,82 @@ the script's writes. So the alarm extension is **evaluating**
 its condition, just not visibly producing transitions on the
 `aaAlarmManagedClient` consumer stream.
 
-This isolates the unknown to the producer-side path — whether
-the BoolAlarm extension's "publish to alarm manager" knob is on,
-whether the platform is in an alarm area that matches the
-consumer's subscription scope, or whether AVEVA has a separate
-"events" path the BoolAlarm uses by default that this consumer
-doesn't subscribe to. Resolving requires checking the BoolAlarm
-extension's config in System Platform IDE (alarm priority,
-category, "Active"/"Enabled" flags, alarm-vs-event mode) and
-checking whether `aaObjectViewer`'s Active Alarms panel sees the
-alarm fire.
+## Multi-channel + multi-subscription probe — sixth run, 2026-05-01
+
+Extended the probe to try every consumer-side approach in
+parallel:
+
+- **Subscription expressions** (sequential): `\Galaxy!`,
+  `\Galaxy!*`, `\\Galaxy!`, `\Galaxy!TestArea`, `\\.\Galaxy!`.
+  All Subscribe calls returned rc=0; the last one
+  (`\\.\Galaxy!`) is reflected in `GetProviders` (count=1).
+- **Read channels** polled at 500ms cadence: `GetStatistics`,
+  `GetHighPriAlarm`, `SFCreateSnapshot` + `SFGetStatistics`.
+- **Filter+sort**: priority 0..32767, `qtSummary`,
+  state=`asAlarmActiveNow`, sort=`sfReturnNewestFirst`.
+- **AlarmRecord init** (worked around `Not a valid Win32
+  FileTime` exception): all DateTime fields pre-set to FILETIME
+  epoch (1601-01-01 UTC) before the call, since
+  `default(DateTime)` is outside FILETIME range and trips the
+  interop marshaler.
+
+Result of the 60s run with `TestMachine_001.TestAlarm001` being
+flipped every 10s:
+
+```
+Subscribe('\Galaxy!') -> 0
+Subscribe('\Galaxy!*') -> 0
+Subscribe('\\Galaxy!') -> 0
+Subscribe('\Galaxy!TestArea') -> 0
+Subscribe('\\.\Galaxy!') -> 0
+GetProviders [after Subscribe-multi] -> count=1 list=[ 0 \\.\Galaxy!]
+GetStatistics #1: total=0 active=0 changes=1 codes=[7] positions=[] handles=[]
+GetHighPriAlarm #1: rc=0 { }
+SF channel #1: SFCreate=0 numAlarms=0 SFStats=0 unackRet=0 unackAlm=0 ackAlm=0 others=0 events=0 idxNewest=-1
+```
+
+**No further "(changed)" entries for the entire 60s window.**
+Every read API returned the same empty result on every poll.
+
+User confirms the alarm IS firing — `aaObjectViewer` sees
+`$Alarm.InAlarm` flip in lockstep with the script. Historian
+records exist (per user — needs verification by querying the
+historian directly).
+
+## Conclusion of consumer-side probing
+
+`aaAlarmManagedClient.AlarmClient` is **not** the receive
+surface AVEVA's alarm pipeline routes to in this Galaxy
+configuration. The consumer chain is verified end-to-end:
+
+- `InitializeConsumer` + `RegisterConsumer` + `Subscribe` all
+  succeed (rc=0).
+- `GetProviders` finds `\Galaxy!` once Initialize is called.
+- All read APIs (`GetStatistics`, `GetHighPriAlarm`,
+  `SFCreateSnapshot`/`SFGetStatistics`) return empty even with
+  every documented filter combination.
+- The consumer's hWnd receives zero AVEVA messages between
+  `WM_CREATE` and `WM_DESTROY`; AVEVA's traffic goes to its own
+  internal hwnd.
+
+The next investigation directions are not consumer-side:
+
+1. **Inspect `aaObjectViewer`'s alarm SDK** to see what library
+   it uses to read alarms. If different from
+   `aaAlarmManagedClient`, switch the worker over.
+2. **Query the historian directly** (`aahEventStorage` /
+   `aahEventSvc`) to confirm alarms are recorded — and use the
+   same path for v2 alarm capture.
+3. **Inspect AVEVA's alarm-routing config** for this Galaxy in
+   System Platform IDE — area assignments, alarm provider
+   bindings, "publish alarm events to" settings on the platform.
+
+For A.2 implementation: the `aaAlarmManagedClient` path the
+gateway-worker is currently architected around may be a
+dead-end on customer Galaxies configured this way. If the
+alarms truly only flow through the historian event-storage path,
+A.2 needs to consume from `aahEventStorage` instead — a
+fundamental architecture pivot.
 
 ### Implications for A.2 implementation
 
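The `AlarmRecord init` workaround described in the notes relies on the range of the Win32 FILETIME type, which counts 100 ns intervals from 1601-01-01 UTC and cannot encode earlier dates. The sketch below reproduces that range check with plain .NET calls only. `FileTimeRangeDemo` and its members are hypothetical names for illustration and are not part of the probe. Whether the AVEVA interop marshaler goes through this exact conversion is an assumption, although the exception text quoted in the notes appears to match what .NET raises for an out-of-range value.

```csharp
using System;

// Hypothetical helper for illustration; not part of the probe or any AVEVA API.
public static class FileTimeRangeDemo
{
    // FILETIME value 0 corresponds to 1601-01-01 00:00:00 UTC.
    public static readonly DateTime FileTimeEpoch =
        new DateTime(1601, 1, 1, 0, 0, 0, DateTimeKind.Utc);

    // Returns true when the value can be converted to a FILETIME without throwing.
    public static bool IsFileTimeRepresentable(DateTime value)
    {
        try
        {
            value.ToFileTimeUtc();
            return true;
        }
        catch (ArgumentOutOfRangeException)
        {
            // Raised for dates before 1601-01-01, including default(DateTime) (year 1).
            return false;
        }
    }
}
```

With this helper, `FileTimeRangeDemo.IsFileTimeRepresentable(default(DateTime))` is false, while `FileTimeRangeDemo.FileTimeEpoch.ToFileTimeUtc()` is 0, which is why pre-setting every `DateTime` field to the epoch gives the conversion a valid input.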
@@ -32,6 +32,17 @@ namespace MxGateway.Worker.Tests;
 public sealed class AlarmClientWmProbeTests : IDisposable
 {
     // Probe configuration. Override in the constructor below if needed.
+    // Try multiple subscription expressions sequentially (each Subscribe call
+    // adds to the consumer's scope). The "everything" form varies by AVEVA
+    // version — we shotgun common forms.
+    private static readonly string[] SubscriptionExpressions =
+    {
+        @"\Galaxy!",          // documented "all groups under Galaxy provider"
+        @"\Galaxy!*",         // wildcard variant
+        @"\\Galaxy!",         // double-backslash UNC-style
+        @"\Galaxy!TestArea",  // explicit area where TestMachine_001 lives
+        @"\\.\Galaxy!",       // local-host prefix
+    };
     private const string SubscriptionExpression = @"\Galaxy!";
     private static readonly TimeSpan PumpDuration = TimeSpan.FromSeconds(60);
     private static readonly TimeSpan PollInterval = TimeSpan.FromMilliseconds(500);
@@ -281,16 +292,28 @@ public sealed class AlarmClientWmProbeTests : IDisposable
         // literally mean "match alarms in state 'none'" (i.e., nothing),
         // since the eAlarmFilterState enum is 0/1/2/3 single-states not
         // flag bits. Try ActiveNow explicitly.
-        int subscribe = client.Subscribe(
-            szSubscription: SubscriptionExpression,
-            wFromPri: 0, wToPri: short.MaxValue,
-            QueryType: eQueryType.qtHistory,
-            SortFlags: eSortFlags.sfReturnNewestFirst,
-            FilterMask: eAlarmFilterState.asAlarmActiveNow,
-            FilterSpecification: eAlarmFilterState.asAlarmActiveNow);
-        Log($"Subscribe('{SubscriptionExpression}', qtHistory, state=ActiveNow, pri=[0..32767]) -> {subscribe}");
+        // Subscribe to every candidate expression — AVEVA accepts multiple
+        // overlapping subscriptions; whichever matches the producer wins.
+        foreach (string expr in SubscriptionExpressions)
+        {
+            try
+            {
+                int subscribe = client.Subscribe(
+                    szSubscription: expr,
+                    wFromPri: 0, wToPri: short.MaxValue,
+                    QueryType: eQueryType.qtSummary,
+                    SortFlags: eSortFlags.sfReturnNewestFirst,
+                    FilterMask: eAlarmFilterState.asAlarmActiveNow,
+                    FilterSpecification: eAlarmFilterState.asAlarmActiveNow);
+                Log($"Subscribe('{expr}') -> {subscribe}");
+            }
+            catch (Exception ex)
+            {
+                Log($"Subscribe('{expr}') threw: {ex.GetType().Name}: {ex.Message}");
+            }
+        }
 
-        LogProviders(client, "after Subscribe");
+        LogProviders(client, "after Subscribe-multi");
 
         // 3c. Pump for the configured duration. Log every message we see
         // (filtered light to avoid noise from WM_PAINT / WM_TIMER /
@@ -322,6 +345,7 @@ public sealed class AlarmClientWmProbeTests : IDisposable
             {
                 PollGetStatistics(client, ++pollCount);
                 LogProviders(client, $"poll #{pollCount}");
+                PollAllChannels(client, pollCount);
                 nextPoll = DateTime.UtcNow + PollInterval;
             }
             Thread.Sleep(10);
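The poll loop calls every read channel once per `PollInterval`, and the probe's polling helpers write a log line only when a channel's summary string differs from the previous poll, which is what makes the absence of further "(changed)" entries meaningful. A minimal sketch of that idea follows. It swaps the probe's per-channel `last...Summary` string fields for a dictionary; `ChangeOnlyLog` and `Report` are hypothetical names and not part of the commit.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration; the probe itself keeps one string field per channel.
public sealed class ChangeOnlyLog
{
    private readonly Dictionary<string, string> last = new Dictionary<string, string>();
    private readonly Action<string> sink;

    public ChangeOnlyLog(Action<string> sink)
    {
        this.sink = sink;
    }

    // Writes a line and returns true only when the summary differs from the
    // last one recorded for this channel (the first report always counts as a change).
    public bool Report(string channel, int seq, string summary)
    {
        if (last.TryGetValue(channel, out string previous) && previous == summary)
        {
            return false;
        }

        last[channel] = summary;
        sink($"{channel} #{seq}: {summary} (changed)");
        return true;
    }
}
```

A caller would construct it as `new ChangeOnlyLog(Log)` and call `Report("GetHighPriAlarm", seq, summary)` on each poll; a run in which nothing arrives then produces exactly one line per channel.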
@@ -349,6 +373,122 @@ public sealed class AlarmClientWmProbeTests : IDisposable
 
     private string lastStatsSummary = string.Empty;
     private string lastProvidersSummary = string.Empty;
+    private string lastHighPriSummary = string.Empty;
+    private string lastSfStatsSummary = string.Empty;
+
+    /// <summary>
+    /// Try every read API the AlarmClient exposes and log when its
+    /// output changes. AlarmClient has at least three distinct read
+    /// surfaces — GetStatistics (current-change array), GetHighPriAlarm
+    /// (single-record peek), and the SF (stored filter) family — and any
+    /// of them might be the populated one.
+    /// </summary>
+    private static AlarmRecord NewAlarmRecord()
+    {
+        // The interop's auto-marshal flips DateTime fields to FILETIME on
+        // the way IN as well as OUT. default(DateTime) (year 1) is outside
+        // FILETIME's representable range, so initialize all DateTime fields
+        // to the FILETIME epoch (1601-01-01 UTC) to satisfy the marshaler.
+        AlarmRecord rec = new AlarmRecord();
+        DateTime epoch = new DateTime(1601, 1, 1, 0, 0, 0, DateTimeKind.Utc);
+        foreach (var f in typeof(AlarmRecord).GetFields(
+            BindingFlags.Public | BindingFlags.Instance | BindingFlags.NonPublic))
+        {
+            if (f.FieldType == typeof(DateTime))
+            {
+                object boxed = rec;
+                f.SetValue(boxed, epoch);
+                rec = (AlarmRecord)boxed;
+            }
+        }
+        return rec;
+    }
+
+    private void PollAllChannels(AlarmClient client, int seq)
+    {
+        // Channel A: GetHighPriAlarm — direct peek of highest-priority alarm.
+        try
+        {
+            AlarmRecord rec = NewAlarmRecord();
+            int rc = client.GetHighPriAlarm(ref rec);
+            string desc = rc == 0 ? DescribeAlarmRecord(rec) : "<no record>";
+            string summary = $"rc={rc} {desc}";
+            if (summary != lastHighPriSummary)
+            {
+                Log($"GetHighPriAlarm #{seq}: {summary} (changed)");
+                lastHighPriSummary = summary;
+            }
+        }
+        catch (Exception ex)
+        {
+            string es = $"{ex.GetType().Name}: {ex.Message}";
+            if (es != lastHighPriSummary)
+            {
+                Log($"GetHighPriAlarm #{seq}: threw {es}");
+                lastHighPriSummary = es;
+            }
+        }
+
+        // Channel C: GetAlarmExtendedRec by index. Try indices 0..3 directly;
+        // populated alarms (if any) appear at low indices.
+        for (int idx = 0; idx <= 2; idx++)
+        {
+            try
+            {
+                AlarmRecord rec = NewAlarmRecord();
+                int rc = client.GetAlarmExtendedRec(idx, ref rec);
+                if (rc == 0)
+                {
+                    string desc = DescribeAlarmRecord(rec);
+                    Log($"GetAlarmExtendedRec(idx={idx}) #{seq}: rc=0 -> {desc}");
+                    break; // log first present record only
+                }
+            }
+            catch (Exception ex)
+            {
+                if (idx == 0)
+                {
+                    Log($"GetAlarmExtendedRec(idx=0) #{seq}: threw {ex.GetType().Name}: {ex.Message}");
+                }
+                break;
+            }
+        }
+
+        // Channel B: SF — snapshot + GetStatistics + iterate.
+        try
+        {
+            uint numAlarms = 0;
+            int sfCreate = client.SFCreateSnapshot(0, ref numAlarms);
+            int unackRet = 0, unackAlm = 0, ackAlm = 0, others = 0, events = 0, idxNewest = 0;
+            int sfStats = client.SFGetStatistics(
+                ref unackRet, ref unackAlm, ref ackAlm,
+                ref others, ref events, ref idxNewest);
+            string summary = $"SFCreate={sfCreate} numAlarms={numAlarms} " +
+                $"SFStats={sfStats} unackRet={unackRet} unackAlm={unackAlm} " +
+                $"ackAlm={ackAlm} others={others} events={events} idxNewest={idxNewest}";
+            if (summary != lastSfStatsSummary)
+            {
+                Log($"SF channel #{seq}: {summary} (changed)");
+                lastSfStatsSummary = summary;
+
+                // If non-zero, fetch the first record by index via the
+                // standard GetAlarmExtendedRec — after SFCreateSnapshot the
+                // indices reference the snapshot.
+                if (numAlarms > 0)
+                {
+                    AlarmRecord rec = new AlarmRecord();
+                    int recRc = client.GetAlarmExtendedRec(0, ref rec);
+                    Log($" GetAlarmExtendedRec(0) [post-snapshot] rc={recRc} -> {DescribeAlarmRecord(rec)}");
+                }
+            }
+            client.SFDeleteSnapshot();
+        }
+        catch (Exception ex)
+        {
+            Log($"SF channel #{seq}: threw {ex.GetType().Name}: {ex.Message}");
+        }
+    }
+
 
     private void LogProviders(AlarmClient client, string when)
     {
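`NewAlarmRecord` sets fields through reflection on a boxed copy and then unboxes the result. That pattern matters only if `AlarmRecord` is a value type, which the `ref` parameters and the box/unbox steps suggest but the diff does not confirm. The sketch below shows the same technique on a hypothetical `SampleRecord` struct, boxing once outside the loop instead of once per field; `SampleRecord`, `StructFieldInit`, and `WithDateTimeFields` are illustrative names only.

```csharp
using System;
using System.Reflection;

// Hypothetical stand-in for the interop record; assumes the real type is a struct.
public struct SampleRecord
{
    public DateTime Raised;
    public DateTime Acked;
    public int Priority;
}

public static class StructFieldInit
{
    // Returns a copy of the record with every DateTime field set to the given value.
    public static T WithDateTimeFields<T>(T record, DateTime value) where T : struct
    {
        // Box once: FieldInfo.SetValue mutates the boxed copy, not the original local.
        object boxed = record;
        foreach (FieldInfo field in typeof(T).GetFields(
            BindingFlags.Public | BindingFlags.NonPublic | BindingFlags.Instance))
        {
            if (field.FieldType == typeof(DateTime))
            {
                field.SetValue(boxed, value);
            }
        }

        // Unbox to get the mutated value back as a struct.
        return (T)boxed;
    }
}
```

Passing the struct directly to `SetValue` would box a temporary copy and discard the write, so the explicit `object boxed` local is what carries the change back to the caller.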