master
3 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
5e11b30507 |
[F56 resolved] subscribe paths now drive 0x33 DataUpdate frames
Root cause: `Session::subscribe` and `Session::subscribe_buffered_nmx`
were missing the `INmxService2::Connect` + `AddSubscriberEngine` RPC
pair that the .NET reference's `MxNativeSession.EnsurePublisherConnected`
(`cs:516-526`) issues before the first advise against a publishing
engine. Without those two RPCs, NmxSvc accepted the subscription
registration but the publishing engine never knew our engine was
subscribed — so it never dispatched DataUpdate frames back.
Diagnosis driven by wwtools/aalogcli reading
C:\ProgramData\ArchestrA\LogFiles. The user pointed at this tooling
which lit up the path.
Red herring: NmxSvc's `[Warning] NmxCallback->DataReceived ... failed
with error 0x{N}` log lines turned out to be normal log spam where N
is the bufferSize of the inbound call, not a real error code. The
.NET reference's own probe triggers identical entries while still
receiving DataUpdate frames successfully.
Fix:
- SessionInner::publisher_endpoints — per-session HashMap<(platform_id,
engine_id), ()> cache mirroring MxNativeSession._publisherEndpoints.
- Session::ensure_publisher_connected — issues Connect +
AddSubscriberEngine, once per publisher endpoint per session.
- Session::subscribe + subscribe_buffered_nmx — both call it before
the wire advise.
- subscribe_buffered_nmx — additionally issues AdviseSupervisory after
RegisterReference. The .NET reference's RegisterBufferedItemAsync
only calls RegisterReference, but on this AVEVA install
RegisterReference alone produces the registration result + heartbeat
callbacks without ever starting DataUpdate dispatch; AdviseSupervisory
unblocks the dispatch.
Live verification (`TestMachine_001.TestChangingInt`, a tag that
updates >1×/s):
cargo test -p mxaccess-compat --features live-windows-com \
--test plain_subscribe_live -- --ignored --nocapture
cargo test -p mxaccess-compat --features live-windows-com \
--test buffered_subscribe_live -- --ignored --nocapture
Both pass — `cmd=0x32` SubscriptionStatus + sequence of `cmd=0x33`
DataUpdate frames flow as expected. Tests assert on the raw
Session::callbacks() broadcast (not the typed Subscription::next
DataChange path) because the engine reports quality=Uncertain
value=null for this attribute on this Galaxy — the wire-level
subscription is what F56 was about, not the value content.
DcomCallbackSink reverted to S_OK return for both DataReceivedRaw
and StatusReceivedRaw (the bytes-processed / sentinel HRESULT
experiments during diagnosis turned out to be irrelevant — the
"failed with error 0xN" logs come from NmxSvc regardless of the
return value).
design/followups.md F49 + F56 + docs/M6-live-verification.md updated:
F56 resolved, F49 steps 1 + 4 + 5 pass live, steps 2 + 3 pending
(now executable on this fixture).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
df3457c54a |
[F56] subscribe / subscribe_buffered: split-form wire body + diagnose Galaxy fixture gap
Three real fixes + one architectural diagnosis:
1. Session::subscribe_buffered_nmx now sends the .NET-reference split
form on the wire:
item_definition = "<attr>.property(buffer)" (was: full reference)
item_context = "<object_tag_name>" (was: empty)
item_handle = SessionInner::next_item_handle.fetch_add(1)
(was: hardcoded 0)
Verified byte-identical against captures/082 + 094 by the existing
buffered_register_reference_parity unit tests. The
item_handle counter mirrors MxNativeCompatibilityServer's
_nextItemHandle++ at MxNativeSession.cs:613.
2. New live tests:
- tests/buffered_subscribe_live.rs (F49 step 1) — uses real Galaxy
metadata via SqlTagResolver + connect_nmx_auto, drives a
background writer at 500ms cadence to force value-changes,
drains DataChange events from Subscription.
- tests/plain_subscribe_live.rs — same harness over plain
Session::subscribe (NOT buffered), used to isolate whether
"no DataUpdate" is buffered-specific (it's not — both fail).
Both pull tracing-subscriber as a dev-dep so `RUST_LOG=trace`
surfaces dcom_sink + router activity.
3. mxaccess-galaxy/sql_resolver.rs: drop the inner-attribute
`#![cfg(feature = "galaxy-resolver")]` — the module-level cfg on
`pub mod sql_resolver` in lib.rs already handles this and Rust
1.85's clippy::duplicated_attributes lint flagged the duplicate
once mxaccess-compat dev-deps activated the feature.
4. F56 finding (diagnosis, NOT a bug fix): the engine on this Galaxy
install does not have an active value for TestChildObject.TestInt.
Confirmed by running the .NET reference's own probe:
dotnet run --project src/MxNativeClient.Probe -c Release \
-- --probe-session-subscribe --tag=TestChildObject.TestInt \
--subscribe-hold-seconds=10
...returns ONE 0x32 SubscriptionStatus (status=3 detail=3
quality=0x00C0 Uncertain value=null) and zero 0x33 DataUpdates —
matching the Rust port's symptom exactly. Not a Rust port bug,
not a wire-byte gap. F49 steps 1-3 need either an actively-
scanned tag or local Galaxy reconfiguration to scan
TestChildObject.TestInt.
Workspace tests + clippy clean under both feature configurations.
F56 entry in design/followups.md updated with the full diagnostic
chain so future-me / future-collaborators can pick it up without
re-tracing.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
af15fe7587 |
[F49 step 1 + F56] callback router: peel envelope before parsing subscription / 0x11 frames
The router used to call NmxSubscriptionMessage::parse_inner directly on the COM-stub-delivered body, but the wire bytes arrive wrapped in a ProcessDataReceived envelope (46-byte header + optional 4-byte length prefix); parse_inner expects post-envelope bytes. Result: every 0x33 DataUpdate that ever arrived was silently dropped. Mirrors the .NET reference's MxNativeSession.OnCallbackReceived flow at cs:582-606 — three sequential parse attempts: 1. NmxOperationStatusMessage::try_parse_process_data_received_body (already wired) 2. NmxReferenceRegistrationResultMessage::try_parse_... (NEW — was missing) 3. NmxSubscriptionMessage::try_parse_process_data_received_body (NEW — was wrong) Adds: - NmxSubscriptionMessage::try_parse_process_data_received_body — peels envelope via NmxObservedEnvelope::parse_process_data_received_body_flexible, then dispatches to existing parse_inner. - NmxReferenceRegistrationResultMessage::try_parse_process_data_received_body — same shape, for the 0x11 registration-result frame. - Router branch for 0x11 — currently traces the assigned item_handle and drops the frame (matches the .NET reference, which fires a ReferenceRegistrationReceived event with no consumer in the codebase). - Router fall-through trace! when neither path matches, so future unparseable bodies surface in RUST_LOG=trace instead of vanishing. - DcomCallbackSink::forward — trace! per inbound callback so RUST_LOG=mxaccess_callback=trace surfaces opnum + size. - crates/mxaccess-compat/tests/buffered_subscribe_live.rs — F49 step 1 live test that drives subscribe_buffered + a 500ms-cadence writer. Also pulls tracing-subscriber as a dev-dep so the test can dump router activity. Existing router_task_decodes_callback_invoked_into_broadcast unit test updated to wrap its synthetic 0x32 body in an envelope so the new parse path actually accepts it. Live result: F56 — the buffered round-trip *registers* successfully (RegisterReference returns HRESULT 0; engine sends one 0x11 RegistrationResult + one 51-byte op-status per write, perfectly clocked) but the engine never sends a 0x33 DataUpdate. Rust-port- specific gap vs the .NET reference's working buffered path; root cause is likely a field-level difference in the RegisterReference body or a missing post-RegisterReference step. Captured as F56 in design/followups.md, blocking F49 step 1; F56's DoD is the same live test reporting >=3 DataChange arrivals. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |