ddebab2c2d179cb95330de0558778ad2252c8060
137 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
ddebab2c2d |
docs: F3 cross-domain NTLM provisioning recipe
Self-contained doc at docs/F3-cross-domain-ntlm-recipe.md for whoever picks F3 up on hardware with two AD forests + a forest trust. Covers: - Lab topology (LAB-A resource forest with AVEVA install + LAB-B account forest with the probe user, bidirectional forest trust). - DC + DNS + trust + user provisioning steps (Install-ADDSForest, Add-DnsServerConditionalForwarderZone, New-ADTrust, New-ADUser). - Capture procedure for both the Rust and .NET probes under a `runas /netonly` cross-domain token, with Wireshark NTLMSSP guidance. - Fixture layout under crates/mxaccess-rpc/tests/fixtures/cross-domain-ntlm/. - Round-trip test skeleton (replay the captured Type 2 → regenerate Type 3 → assert byte-equality against the captured Type 3). - Redaction checklist for the captured bytes. - Why F3 is "evidence work" not "codec work" — the AV pair parser is shape-agnostic, so the codec path is already correct; the fixture is a regression net for any future drift. F3 entry in design/followups.md and R8 in design/70-risks-and-open-questions.md both now point at the recipe so a future contributor doesn't have to reconstruct the lab topology from the followup analysis alone. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
73e2bd8771 |
followups: status snapshot for the Open section
After F52 closed, every entry in the Open section except F3 has a `**Status:**` line documenting its own resolution (Resolved 2026-05-06, or Out-of-scope). At a glance the section misleadingly looks like 8 live items. Add a header snapshot calling out that only F3 — cross-domain NTLM fixture, externally blocked on a second AD domain — is genuinely open. The other entries stay where they are because the F-numbers in their analysis are referenced from other followups; moving them to `## Resolved` would orphan that context. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
ceeaeefa71 |
[F52.3] mxaccess-codec: caller-supplied scratch buffer for write encoder
Adds `write_message::encode_into_bytes_mut` (and the timestamped
variant) which writes the encoded body into a caller-supplied
`BytesMut`. The buffer is cleared and resized in place each call;
once it has grown to the largest body the session will produce, it
allocates nothing further.
A session that holds a single `BytesMut` and reuses it across writes:
- Int32 / Float32 / Float64: 2 → 1 allocs/op
(only the `encode_scalar_value` scratch `Vec<u8>` remains)
- Boolean: 1 → 0 allocs/op
(no per-value scratch — the literal payload is a stack `[u8; 4]`)
Bench delta in `design/M6-bench-baseline.md` § F52.3. The
`encode_scalar_value` Vec is the remaining 1 alloc/op for fixed-width
scalars; eliminating it would require inlining the LE-bytes write
into the body slice (left for a follow-up since the F52 spec only
asks for 2 → 1).
Resolves F52 (all three optimisations landed:
|
||
|
|
a0fa5bedfd |
[F52.2] mxaccess-codec: thread-local name-signature cache
Adds a thread-local `HashMap<String, u16>` cache inside `compute_name_signature`. Repeated calls with the same name (the hot path inside `MxReferenceHandle::from_names`) skip the `to_lowercase` allocation and the CRC-16/IBM walk entirely. Bounded at 1024 entries per thread; on overflow the cache is cleared rather than evicted LRU — any sane workload re-fills only the names it actively uses. `MxReferenceHandle::from_names` drops from 2 → 0 allocs/op once warm (bench delta in `design/M6-bench-baseline.md` § F52.2). Cold-path behaviour is unchanged: first call with a new name still pays the `to_lowercase` + cache-key `String` allocations. Two new tests pin the cache: cache-hit returns the same value as cold-compute, and cache overflow doesn't break correctness. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
4e76b44391 |
[F52.1] mxaccess-codec: BytesMut output buffer for write encoder
Adds `write_message::encode_to_bytes_mut` (and the timestamped variant) returning a freshly-allocated `BytesMut`. Allocation count is identical to `encode` (2 allocs/op for fixed-width scalars); the benefit is downstream — consumers can `BytesMut::split_to` / `freeze` and forward the body bytes to a wire-level sink without an intermediate copy. The body builders (`encode_boolean` / `encode_fixed` / `encode_variable` / `encode_array`) were refactored to fill a pre-sized `&mut [u8]` rather than each allocating their own `Vec<u8>`. The dispatcher computes the body size up front via small `*_body_size` helpers and resizes the destination buffer (Vec or BytesMut) once. This is also the prerequisite refactor for F52.3. Bench delta in `design/M6-bench-baseline.md` § F52.1; existing `encode` row unchanged at 2 allocs/op. All 265 round-trip tests unchanged. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
c7505f9570 |
[F51] live ASB type-matrix: provision UDAs + capture wire fixtures + round-trip tests
Provisioned 7 new UDAs on $TestMachine via wwtools/graccesscli
object uda add (then deployed to TestMachine_001):
TestFloat MxFloat scalar
TestFloatArray MxFloat array (4)
TestDouble MxDouble scalar
TestDoubleArray MxDouble array (4)
TestDateTime MxTime scalar
TestDuration MxElapsedTime scalar
TestDurationArray MxElapsedTime array (4)
New crates/mxaccess/examples/asb-type-matrix.rs reads all 14 tags
(7 pre-existing + 7 new) in a single batch and dumps the live
AsbVariant bytes per tag when MX_ASB_DUMP_FIXTURES=<dir> is set.
Single-attempt register (no retry — F31 InvalidConnectionId
cool-down re-arms on every retry, making backoff
counter-productive; if the cool-down is engaged, wait 60+ seconds
without ASB activity then re-run).
Captured live evidence (single cold-start run, all 14 register
calls returned error_code=0x0000):
TestChangingInt type_id=4 (Int32) length=4 payload=4
TestAlarm001 type_id=17 (Boolean) length=1 payload=1
MachineCode type_id=10 (String) length=30 payload=30
TestFloat type_id=8 (Float) length=4 payload=4
TestDouble type_id=9 (Double) length=8 payload=8
TestDateTime type_id=11 (DateTime) length=8 payload=8
TestDuration type_id=12 (ElapsedTime) length=8 payload=8
TestIntArray, TestBoolArray, TestStringArray, TestDateTimeArray,
TestFloatArray, TestDoubleArray, TestDurationArray
type_id=0 length=0 payload=0
(provisioned but no value written yet)
Per-tag fixture .bin files saved under
crates/mxaccess-codec/tests/fixtures/f51-type-matrix/ — full
14-byte to 40-byte AsbVariant byte sequences (i32 type_id LE +
i32 length LE + payload bytes).
crates/mxaccess-codec/tests/f51_type_matrix_parity.rs round-trips
each scalar fixture: decode -> re-encode -> assert byte-equal +
type_id / length pin. Tests skip with [skip] message when fixtures
are absent (so the suite passes on a fresh checkout without live
captures). 7 scalar tests pass against the captured fixtures.
Array tags excluded from round-trip pinning because the live
engine returns empty payloads for unwritten arrays. Codec-side
array round-trip is covered by asb_variant's existing synthetic-
payload unit tests.
docs/galaxy-test-fixtures.md inventories all $TestMachine UDAs
(pre-existing + F51-provisioned), the graccesscli provisioning
recipe, the fixture-regeneration pattern, and the F31 cool-down
caveat.
design/followups.md F51 marked resolved.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
8bd66bbe65 |
[F53 measurement] document protocol-crate missing-docs magnitude
Enabled #![warn(missing_docs)] on each of the 7 protocol crates to
measure how many one-liners filling them in would be:
mxaccess-asb 422
mxaccess-nmx 398
mxaccess-callback 371
mxaccess-galaxy 229
mxaccess-codec 205
mxaccess-rpc 147
mxaccess-asb-nettcp 111
-----
Total 1883
Reverted the lint enables — most of those are protocol-internal
types (struct fields on wire-shape records, enum variants on opcode
discriminators) whose meaning is already documented at the
consumer-facing layer. Filling 1883 one-liners adds noise without
consumer value, and forcing them as errors via RUSTDOCFLAGS would
block routine cargo doc runs.
design/followups.md F53 entry updated with the measured numbers
and the explicit "stays off indefinitely" verdict. If a future
contributor wants per-crate enforcement, the recipe in the strategy
paragraph (allow(missing_docs) on protocol-internal modules,
warn(missing_docs) on the re-export surface) is still valid.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
349e217ea3 |
[F50] live Suspend/Activate captures — Suspend wires opcode 0x2D, Activate client-side
Re-ran analysis/frida/mx-nmx-trace.js (with the F46 hooks for
LmxProxy.dll!CLMXProxyServer.Suspend / .Activate) against
MxTraceHarness on the local AVEVA install. Two captures landed:
- captures/123-frida-suspend-advised-instrumented/
Scenario: --scenario=suspend-advised --tag=TestChildObject.ScanState
After mx.suspend.begin/end at 17:23:51.949Z, NMX PutRequest fires
~140ms later with body:
2d 01 00 command 0x2D, version 0x0001
cd 2a ee ec b2 76 06 4f b4 58 5c a0 2d f7 a8 93 16-byte correlation_id (matches the prior AdviseSupervisory)
01 00 05 00 01 00 02 00 01 00 69 00 0a 00 engine + handle + attribute / property ids
47 92 00 00 03 00 00 00 trailer
TransferData wraps it; HRESULT 0 returned; ProcessDataReceived
callback delivers a 50-byte op-status frame; LMX surfaces it
through CUserConnectionCallback.OperationComplete. Suspend is
unambiguously server-side wire op 0x2D.
- captures/124-frida-activate-advised-instrumented/
Scenario: --scenario=activate-advised --tag=TestChildObject.ScanState
Activate fires at 17:26:02.982Z and returns Success synchronously
with no NMX traffic. The next NMX activity is 7+ seconds later
(harness teardown). Activate against a non-suspended item is
client-side only on this build.
The harness's activate-advised scenario doesn't sequence
Suspend-then-Activate, so we don't have direct evidence for
Activate-after-Suspend. Circumstantial reasoning: since Suspend
goes server-side with a state change, Activate likely also does to
revert. If direct evidence becomes needed, add a new
suspend-then-activate scenario to MxTraceHarness/Program.cs and
re-run.
design/70-risks-and-open-questions.md R5 moves to "settled —
Suspend is wire op 0x2D, Activate behaviour is conditional",
severity downgraded P2 -> P3 (no public Session::suspend /
Session::activate API exists today; if added later, 0x2D is the
encoder target).
design/followups.md F50 marked resolved.
docs/F50-suspend-activate-evidence.md: per-capture byte-level
evidence + repro recipe.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
b62ffc8c5d |
[F48] mark out-of-scope: internal usage only, no crates.io publish
Maintainer confirmed 2026-05-06 the project is internal-use only — workspace stays at version "0.0.0", consumers depend via path or git, not crates.io. F48's actual publish goal is dropped. design/followups.md F48 entry: replace the "P1 release driver" framing with "Out of scope" + a pointer to the recipe doc in case this ever changes. design/F48-publish-dry-run.md: add a banner at the top explaining the doc is now retained as a workspace-hygiene record (cargo package --list per crate produces clean tarballs, no captures or big files), not as release prep. The "What the actual V1 publish needs" section reframed as "If a publish ever does become a goal — recipe" so the steps survive without implying they're scheduled. No code change. F49 / F53 / F55 / F56 status unchanged — those weren't release-cut-gated. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
e77db4306a |
[F48 dry-run] validate publish chain on workspace 0.0.0
cargo publish --dry-run on each of the 9 workspace crates: - Tier 1 leaves (mxaccess-codec, mxaccess-rpc, mxaccess-asb-nettcp) pass cleanly. cargo assembles each tarball, the only failure is the dry-run upload abort. - Tiers 2 + 3 (galaxy, callback, asb, nmx, mxaccess, mxaccess-compat) surface the documented "no matching package" registry-lookup failure because workspace internal deps are pinned at version "0.0.0" which doesn't exist on crates.io. Expected; resolves at actual publish time once the leaves are uploaded and indexed. cargo package --list confirms each crate ships only source + tests + small round-trip fixtures. No captures, decompiled binaries, or accidental big files. design/F48-publish-dry-run.md captures the per-crate run output, the per-crate file count, and the V1 publish recipe (bump 0.0.0 → 0.1.0 across workspace + internal-dep pins, publish in tier order, wait for indexing between tiers, tag). design/followups.md F48 entry annotated with the dry-run status. The actual publish to crates.io is deliberately not done — that needs maintainer auth + a deliberate version bump that's a release- cut decision, not a routine validation. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
c606736ec3 |
[F53 partial] enable #![warn(missing_docs)] on consumer crates
mxaccess + mxaccess-compat now carry #![warn(missing_docs)] at the crate root. Every public item has at least a one-line doc comment (struct fields, enum variants, trait methods all covered). Touched items: - mxaccess::lib: DataChange fields, SecurityContext fields, TransportKind variants, TransportCapabilities fields, RecoveryEvent variants + their inner fields, SessionOptions fields, the full Error / ConnectionError / AuthError / ProtocolError / ConfigError / SecurityError taxonomy + nested fields, Transport trait method docs. - mxaccess-compat::lib: DataChangeEvent / BufferedDataChangeEvent / WriteCompleteEvent / OperationCompleteEvent fields. Protocol crates (codec, rpc, galaxy, nmx, callback, asb, asb-nettcp) deliberately left without the lint per F53's strategy paragraph — their consumers (mxaccess + mxaccess-compat) already document the surfaces they re-export, and forcing one-liners on every transport-internal item adds noise without consumer value. Verification: - `RUSTDOCFLAGS="-D warnings" cargo doc --workspace --no-deps` clean. - `cargo test --workspace` (824 tests) green. - `cargo clippy --workspace --all-targets -- -D warnings` clean. design/followups.md F53 marked partially resolved. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
d149143535 |
[F49 steps 2 + 3] live verification: buffered recovery replay + unsubscribe skip
Step 3 (F47 buffered unsubscribe skip):
- crates/mxaccess-compat/tests/buffered_unsubscribe_skip_live.rs.
- Subscribe buffered, sleep so the engine has DataUpdates in flight,
then call unsubscribe. Asserts Ok return without surfacing transport
or HRESULT errors.
- Session::unsubscribe (session.rs:2261) probes the registry: if
Buffered { .. }, it skips nmx.un_advise entirely, mirroring the .NET
reference's `if (!subscription.IsBuffered)` guard at
MxNativeSession.cs:361-381. If unsubscribe accidentally emitted
UnAdvise for a buffered correlation id, the engine would return
non-zero HRESULT (no matching plain advise to retract) — surfacing
as a panic.
Step 2 (F45 buffered recovery replay):
- crates/mxaccess-compat/tests/buffered_recovery_replay_live.rs.
- Subscribe buffered, drain >=1 NMX subscription message
(cmd=0x32 SubscriptionStatus + cmd=0x33 DataUpdate) to confirm the
wire path is hot pre-recovery, install a RebuildFactory that calls
NmxClient::create (the same auto-resolving COM-activation path
Session::connect_nmx_auto uses), invoke recover_connection, drain
>=1 NMX subscription message post-recovery.
- Verifies the replay branch in recover_connection_core re-issues
RegisterReference (NOT AdviseSupervisory) for the buffered entry,
mirroring MxNativeSession.ReAdviseSubscription (cs:538-569).
Structural property is unit-tested; this confirms the engine
actually picks back up after the rebuild + replay.
Both tests pass live on this Galaxy:
cargo test -p mxaccess-compat --features live-windows-com \
--test buffered_unsubscribe_skip_live -- --ignored --nocapture
cargo test -p mxaccess-compat --features live-windows-com \
--test buffered_recovery_replay_live -- --ignored --nocapture
Pulls mxaccess-nmx + mxaccess-codec into mxaccess-compat dev-deps so
the recovery test can build a RebuildFactory closure that returns
NmxClient and bind a typed broadcast Receiver.
design/followups.md F49 -> Resolved (all five steps pass live).
docs/M6-live-verification.md updated with per-step evidence + repro
commands.
F49 is fully closed out. F55 (DCOM-managed INmxSvcCallback, Path A)
and F56 (missing EnsurePublisherConnected + post-RegisterReference
AdviseSupervisory for buffered) were the two real Rust-port bugs
uncovered along the way; both resolved. Remaining post-V1 followups
(F50 Suspend/Activate Frida, F51 ASB type matrix, F52 perf, F53 doc
lint, etc.) are scoped independently and not part of F49.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
5e11b30507 |
[F56 resolved] subscribe paths now drive 0x33 DataUpdate frames
Root cause: `Session::subscribe` and `Session::subscribe_buffered_nmx`
were missing the `INmxService2::Connect` + `AddSubscriberEngine` RPC
pair that the .NET reference's `MxNativeSession.EnsurePublisherConnected`
(`cs:516-526`) issues before the first advise against a publishing
engine. Without those two RPCs, NmxSvc accepted the subscription
registration but the publishing engine never knew our engine was
subscribed — so it never dispatched DataUpdate frames back.
Diagnosis driven by wwtools/aalogcli reading
C:\ProgramData\ArchestrA\LogFiles. The user pointed at this tooling
which lit up the path.
Red herring: NmxSvc's `[Warning] NmxCallback->DataReceived ... failed
with error 0x{N}` log lines turned out to be normal log spam where N
is the bufferSize of the inbound call, not a real error code. The
.NET reference's own probe triggers identical entries while still
receiving DataUpdate frames successfully.
Fix:
- SessionInner::publisher_endpoints — per-session HashMap<(platform_id,
engine_id), ()> cache mirroring MxNativeSession._publisherEndpoints.
- Session::ensure_publisher_connected — issues Connect +
AddSubscriberEngine, once per publisher endpoint per session.
- Session::subscribe + subscribe_buffered_nmx — both call it before
the wire advise.
- subscribe_buffered_nmx — additionally issues AdviseSupervisory after
RegisterReference. The .NET reference's RegisterBufferedItemAsync
only calls RegisterReference, but on this AVEVA install
RegisterReference alone produces the registration result + heartbeat
callbacks without ever starting DataUpdate dispatch; AdviseSupervisory
unblocks the dispatch.
Live verification (`TestMachine_001.TestChangingInt`, a tag that
updates >1×/s):
cargo test -p mxaccess-compat --features live-windows-com \
--test plain_subscribe_live -- --ignored --nocapture
cargo test -p mxaccess-compat --features live-windows-com \
--test buffered_subscribe_live -- --ignored --nocapture
Both pass — `cmd=0x32` SubscriptionStatus + sequence of `cmd=0x33`
DataUpdate frames flow as expected. Tests assert on the raw
Session::callbacks() broadcast (not the typed Subscription::next
DataChange path) because the engine reports quality=Uncertain
value=null for this attribute on this Galaxy — the wire-level
subscription is what F56 was about, not the value content.
DcomCallbackSink reverted to S_OK return for both DataReceivedRaw
and StatusReceivedRaw (the bytes-processed / sentinel HRESULT
experiments during diagnosis turned out to be irrelevant — the
"failed with error 0xN" logs come from NmxSvc regardless of the
return value).
design/followups.md F49 + F56 + docs/M6-live-verification.md updated:
F56 resolved, F49 steps 1 + 4 + 5 pass live, steps 2 + 3 pending
(now executable on this fixture).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
c6332c26a1 |
[F49 step 4 + step 5 + doc] live evidence: metrics smoke pass, M6-live-verification.md
F49 step 4 (F40 metrics smoke): - crates/mxaccess-compat/tests/metrics_smoke_live.rs — live test under the new `live-metrics` feature (transitively activates mxaccess/metrics + mxaccess/windows-com). Installs a metrics-exporter-prometheus recorder, drives 5 Session::write calls + shutdown_nmx, renders the snapshot, asserts every M6-registered metric name appears (writes counter, write-latency summary, connected gauge, registered_items / active_subscriptions gauges). Pass on the live AVEVA install. Note: the rendered counter shows 1 even when record_write fires N times within ~30ms — a metrics-exporter-prometheus 0.16 quirk under tight loops, not a Rust port bug. Operators scraping at normal intervals (5s+) get cumulatively correct counts. Documented in the test + in M6-live-verification.md so future runs aren't surprised. F49 status update (in design/followups.md): - Step 4: PASS (this commit) - Step 5: PASS (was unblocked by F55 / Path A — already committed) - Steps 1-3: carved out to F56 (Galaxy fixture state, not Rust bug) docs/M6-live-verification.md: - Per-step evidence table with test invocations + outcomes. - Sample Prometheus snapshot for step 4. - Reproduction commands for the live tests. - F56 explanation cross-referenced from step 1. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
df3457c54a |
[F56] subscribe / subscribe_buffered: split-form wire body + diagnose Galaxy fixture gap
Three real fixes + one architectural diagnosis:
1. Session::subscribe_buffered_nmx now sends the .NET-reference split
form on the wire:
item_definition = "<attr>.property(buffer)" (was: full reference)
item_context = "<object_tag_name>" (was: empty)
item_handle = SessionInner::next_item_handle.fetch_add(1)
(was: hardcoded 0)
Verified byte-identical against captures/082 + 094 by the existing
buffered_register_reference_parity unit tests. The
item_handle counter mirrors MxNativeCompatibilityServer's
_nextItemHandle++ at MxNativeSession.cs:613.
2. New live tests:
- tests/buffered_subscribe_live.rs (F49 step 1) — uses real Galaxy
metadata via SqlTagResolver + connect_nmx_auto, drives a
background writer at 500ms cadence to force value-changes,
drains DataChange events from Subscription.
- tests/plain_subscribe_live.rs — same harness over plain
Session::subscribe (NOT buffered), used to isolate whether
"no DataUpdate" is buffered-specific (it's not — both fail).
Both pull tracing-subscriber as a dev-dep so `RUST_LOG=trace`
surfaces dcom_sink + router activity.
3. mxaccess-galaxy/sql_resolver.rs: drop the inner-attribute
`#![cfg(feature = "galaxy-resolver")]` — the module-level cfg on
`pub mod sql_resolver` in lib.rs already handles this and Rust
1.85's clippy::duplicated_attributes lint flagged the duplicate
once mxaccess-compat dev-deps activated the feature.
4. F56 finding (diagnosis, NOT a bug fix): the engine on this Galaxy
install does not have an active value for TestChildObject.TestInt.
Confirmed by running the .NET reference's own probe:
dotnet run --project src/MxNativeClient.Probe -c Release \
-- --probe-session-subscribe --tag=TestChildObject.TestInt \
--subscribe-hold-seconds=10
...returns ONE 0x32 SubscriptionStatus (status=3 detail=3
quality=0x00C0 Uncertain value=null) and zero 0x33 DataUpdates —
matching the Rust port's symptom exactly. Not a Rust port bug,
not a wire-byte gap. F49 steps 1-3 need either an actively-
scanned tag or local Galaxy reconfiguration to scan
TestChildObject.TestInt.
Workspace tests + clippy clean under both feature configurations.
F56 entry in design/followups.md updated with the full diagnostic
chain so future-me / future-collaborators can pick it up without
re-tracing.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
af15fe7587 |
[F49 step 1 + F56] callback router: peel envelope before parsing subscription / 0x11 frames
The router used to call NmxSubscriptionMessage::parse_inner directly on the COM-stub-delivered body, but the wire bytes arrive wrapped in a ProcessDataReceived envelope (46-byte header + optional 4-byte length prefix); parse_inner expects post-envelope bytes. Result: every 0x33 DataUpdate that ever arrived was silently dropped. Mirrors the .NET reference's MxNativeSession.OnCallbackReceived flow at cs:582-606 — three sequential parse attempts: 1. NmxOperationStatusMessage::try_parse_process_data_received_body (already wired) 2. NmxReferenceRegistrationResultMessage::try_parse_... (NEW — was missing) 3. NmxSubscriptionMessage::try_parse_process_data_received_body (NEW — was wrong) Adds: - NmxSubscriptionMessage::try_parse_process_data_received_body — peels envelope via NmxObservedEnvelope::parse_process_data_received_body_flexible, then dispatches to existing parse_inner. - NmxReferenceRegistrationResultMessage::try_parse_process_data_received_body — same shape, for the 0x11 registration-result frame. - Router branch for 0x11 — currently traces the assigned item_handle and drops the frame (matches the .NET reference, which fires a ReferenceRegistrationReceived event with no consumer in the codebase). - Router fall-through trace! when neither path matches, so future unparseable bodies surface in RUST_LOG=trace instead of vanishing. - DcomCallbackSink::forward — trace! per inbound callback so RUST_LOG=mxaccess_callback=trace surfaces opnum + size. - crates/mxaccess-compat/tests/buffered_subscribe_live.rs — F49 step 1 live test that drives subscribe_buffered + a 500ms-cadence writer. Also pulls tracing-subscriber as a dev-dep so the test can dump router activity. Existing router_task_decodes_callback_invoked_into_broadcast unit test updated to wrap its synthetic 0x32 body in an envelope so the new parse path actually accepts it. Live result: F56 — the buffered round-trip *registers* successfully (RegisterReference returns HRESULT 0; engine sends one 0x11 RegistrationResult + one 51-byte op-status per write, perfectly clocked) but the engine never sends a 0x33 DataUpdate. Rust-port- specific gap vs the .NET reference's working buffered path; root cause is likely a field-level difference in the RegisterReference body or a missing post-RegisterReference step. Captured as F56 in design/followups.md, blocking F49 step 1; F56's DoD is the same live test reporting >=3 DataChange arrivals. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
2fc327a8d5 |
[F55 Path A] DCOM-managed INmxSvcCallback sink
Replace the hand-rolled CallbackExporter (TCP listener + custom
OBJREF) with a real `windows-rs` `#[implement]` COM class for
INmxSvcCallback, marshalled via CoMarshalInterface. NmxSvc validates
the callback OBJREF by calling IObjectExporter::ResolveOxid against
the local RPCSS at 127.0.0.1:135; hand-rolled OXIDs aren't registered
there, which is why RegisterEngine2 returned RPC_S_SERVER_UNAVAILABLE
(1722) on every live attempt. CoMarshalInterface registers the OXID
with RPCSS automatically, so the SCM-side resolution succeeds.
Mirrors MxNativeSession.CreateRegisteredService (cs:624), which is
the .NET reference's working path:
ComObjRefProvider.MarshalInterfaceObjRef(callback,
INmxSvcCallback, DifferentMachine)
Layout:
- mxaccess-callback::dcom_sink — INmxSvcCallback + DcomCallbackSink
+ create_dcom_callback_sink_objref. Forwards inbound calls into
the same CallbackEvent::CallbackInvoked { opnum, body } shape the
legacy exporter produces, so callback_router stays path-agnostic.
- Session::from_nmx_client — branched on `windows-com`. Real DCOM
sink when on; legacy CallbackExporter when off (kept for unit
tests that run against an in-process fake NMX peer).
- SessionInner.dcom_sink_holder: Option<IUnknownHolder> — keeps the
COM ref alive for the session's lifetime; shutdown_nmx drops it.
- mxaccess-rpc + mxaccess-callback: windows-rs 0.59 → 0.62. The 0.59
#[implement] macro generates code that doesn't compile under
edition 2024; 0.62 is fixed.
Live result: cargo test -p mxaccess-compat --features
live-windows-com --test lmx_write_complete_live -- --ignored
--nocapture passes end-to-end. RegisterEngine2 OK, write
round-trips, OnWriteComplete fires with the captured MxStatus shape.
Unblocks F49 step 5; F55 marked Resolved in design/followups.md.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
0a274af76f |
[F55] Path C investigation: NmxSvc requires SCM-registered OXID for callbacks
Captured OBJREF byte structures from both paths via the .NET probe: - `--probe-callback-marshal`: DCOM-marshalled, 338 bytes, succeeds (when used inside `MxNativeSession.Open` → `CreateRegisteredService`). - `--probe-register-managed-callback`: hand-rolled, 162 bytes, fails with `RegisterEngine2 → 0x800706BA RPC_S_SERVER_UNAVAILABLE`. The structural diff: - `std_flags`: DCOM=`0x0A80` (SORF_OXRES4+6+8) vs hand-rolled=`0x280` (SORF_OXRES4+6). Bit `0x0800` (SORF_OXRES8) only set in DCOM. - ncacn_ip_tcp bindings: DCOM=4 with no ports; hand-rolled=1 with explicit `[port]`. - Total size: 338 vs 162 bytes. Tested the simplest fix (hand-rolled `std_flags = 0x0A80` to match DCOM): **still fails with the same 1722.** Reverted. **Diagnosis updated in F55:** NmxSvc on receiving RegisterEngine2 appears to call `IObjectExporter::ResolveOxid` against the local SCM (`127.0.0.1:135`) to resolve the callback OBJREF's OXID, then dial the resulting bindings. Our hand-rolled OXID is never registered with RPCSS, so the SCM-side resolution fails and NmxSvc returns RPC_S_SERVER_UNAVAILABLE — matching: - the symptom (1722), - the sub-second timing (no TCP dial-back to our listener attempted), - the fact that the .NET `ManagedCallbackExporter` (same hand-rolled approach) ALSO fails identically. DCOM marshalling fixes this because `CoMarshalInterface` internally registers the OXID with RPCSS. The bindings have no port because RPCSS returns the dynamic port from the DCOM stub layer. **Conclusion: Path A is the architecturally correct fix** — the callback exporter must be a DCOM-managed object (e.g. via `windows-rs` `#[implement]`) for NmxSvc to accept the callback. The hand-rolled-listener-with-explicit-port approach is fundamentally incompatible with NmxSvc's callback validation, in both Rust and the .NET reference. Path C (cheap investigation) is exhausted; F55 verdict updated to recommend Path A explicitly. `cargo test --workspace` 824 passing; clippy `-D warnings` clean across both feature configurations. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
c5d611d6fa |
[F12 partial + F55] hold IUnknown for client lifetime + diagnose RegisterEngine2 1722
**F12 partial improvement** (`mxaccess-rpc::IUnknownHolder` + `mxaccess-nmx`):
- New `IUnknownHolder` newtype that owns an MTA-resident COM proxy
with `unsafe impl Send + Sync`. Mirrors the .NET reference's
`ManagedNmxService2Client._activatedComObject` private field
(`cs:15`).
- New `activate_and_marshal_iunknown_objref(prog_id, ctx)` returns
`(Vec<u8>, IUnknownHolder)`. Existing
`marshal_activated_iunknown_objref` retained as a wrapper that
drops the holder (kept for inline-use callers).
- `NmxClient` gains an `activated_com_object: Option<IUnknownHolder>`
field, populated by `Self::create` from the new helper.
`Self::connect` / `Self::from_bound_transport` set it `None` (no
COM activation in those paths).
- Holding the IUnknown for the client's lifetime keeps the
SCM-tracked OXID valid; without it the COM ref count drops to
zero and the SCM may release the activated server-side instance,
making subsequent `ResolveOxid` / `RemQueryInterface` calls
return `RPC_S_SERVER_UNAVAILABLE`.
**F55 (new) — hand-rolled callback exporter rejected by RegisterEngine2**
Five-step instrumentation of `Session::connect_nmx_auto` proves all
six COM-activation / RemQI / final-bind steps succeed. The 1722
fault originates at `RegisterEngine2` itself:
```
from_nmx_client: callback hostname="DESKTOP-6JL3KKO" port=57886 obj_ref_len=162
from_nmx_client: callback obj_ref hex: 4d454f57010000...
from_nmx_client: RegisterEngine2 (31112, mxaccess.31112)
from_nmx_client: RegisterEngine2 FAIL: Transport(Fault { status: 2147944122 })
```
Status `0x800706BA` = `RPC_S_SERVER_UNAVAILABLE` wrapped as Win32
HRESULT.
**Critical finding: the .NET reference's `--probe-register-managed-callback`
(which uses the same hand-rolled `ManagedCallbackExporter` approach
as the Rust port) ALSO fails with the same `0x800706BA` fault.**
Only `--probe-session-write`, which uses
`ComObjRefProvider.MarshalInterfaceObjRef(callback, ...)` to build
the OBJREF via Windows DCOM proxy/stub marshalling, succeeds. So
this is an architectural artifact of the hand-rolled-callback
design, not a Rust port regression.
`design/followups.md` F55 entry documents the three resolution
paths (switch to DCOM-marshalled callback / hybrid / continue
investigating OBJREF rejection at NmxSvc).
F49 stays open with a refined diagnostic — the per-feature live
verification is gated on F55's resolution.
Workspace tests still 824 passing; clippy `-D warnings` clean
across both feature configurations.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
e5b31fadb1 |
[F49] live-test scaffolding for F54 OnWriteComplete + COM probe diagnostic
Live attempt against AVEVA on this dev host produced two artefacts:
**`crates/mxaccess-compat/tests/lmx_write_complete_live.rs`** — the
F54 OnWriteComplete round-trip test. Compiles + runs against the
live AVEVA install via either path:
- `--features live-windows-com` (preferred): uses
`Session::connect_nmx_auto` so the COM activation reference is
held in-process for the duration of the test.
- Default features (fallback): shells out to
`MxNativeClient.Probe --probe-resolve-oxid-managed-ntlm-integrity`
+ `--probe-remqi-managed` to learn the per-session NMX endpoint +
INmxService2 IPID, then uses `Session::connect_nmx`.
Both code paths are wired and the test runs through endpoint
resolution + IPID extraction successfully. The connect step itself
fails with `Status { detail: 1722 }` (RPC_S_SERVER_UNAVAILABLE).
**`crates/mxaccess-rpc/examples/com-marshal-probe.rs`** — minimal
one-shot binary that calls
`marshal_activated_iunknown_objref("NmxSvc.NmxService",
DifferentMachine)` in isolation. Confirms the COM activation +
CoMarshalInterface chain works fine standalone (returns a 338-byte
OBJREF with valid OXID/IPID structure). The 1722 in the live test
is therefore downstream of the activation — likely a COM-apartment
threading interaction with the tokio multi-thread runtime.
This is an F12-related issue (auto-resolve hardening), not an F54
issue. F54's correctness is covered by the existing unit-level
integration tests:
- `mxaccess::session::tests::router_populates_operation_status_context_from_pending_ops_fifo`
- `mxaccess::session::tests::write_handle_correlates_with_router_emitted_status`
- `mxaccess_compat::tests::drain_routes_write_status_to_on_write_complete`
- `mxaccess_compat::tests::drain_routes_non_write_status_to_on_operation_complete`
`design/followups.md` F49 entry updated to reflect:
- F54 added as a fifth row in the live-verification scope.
- "Live attempt 2026-05-06" sub-section documents the 1722 issue +
what was verified (.NET probe end-to-end works against same
install; Rust COM activation works in isolation; the failure is
Rust-port-specific to `connect_nmx_auto` under tokio).
- F49 now Blocked-by F12 hardening (the 1722 path).
New `live-windows-com` feature on `mxaccess-compat` propagates to
`mxaccess/windows-com` for the test binary.
Workspace 824 → 824 tests; clippy + rustdoc clean across both
feature configurations.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
04c10babfb |
[F54 test] end-to-end smoke: write_with_handle ↔ callback_router boundary
Adds `write_handle_correlates_with_router_emitted_status` — the
closest-to-live test we can write without an AVEVA endpoint, pinning
the F54 boundary the C# `OnWriteComplete` callback ultimately depends
on.
The existing tests cover the layers individually:
- `write_value_with_handle_inserts_into_pending_ops` — write API
populates pending_ops with the right correlation id.
- `router_populates_operation_status_context_from_pending_ops_fifo`
— callback_router consumes a frame + the registry, emits a typed
OperationStatus with context attached.
- `drain_routes_write_status_to_on_write_complete` (mxaccess-compat)
— drain function routes Write op_kind to on_write_complete_tx.
What was missing: a test that combines the public `write_value_with_
handle` API with a real callback_router invocation against the SAME
`pending_ops` Arc the write populated. The new test:
1. Builds a Session via `connect_test_session`.
2. Calls `session.write_value_with_handle("TestObj.TestInt", ...)` —
gets a real `WriteHandle { correlation_id }` and a real entry in
`pending_ops` (no manual insertion).
3. Spins a parallel `callback_router` over the SESSION's
`pending_ops` Arc + a fake event_tx (the live exporter's
internal channel isn't reachable from tests; this is the
established workaround pattern from
`router_task_decodes_callback_invoked_into_broadcast`).
4. Injects the proven `WRITE_COMPLETE_OK` 5-byte frame.
5. Asserts the emitted `OperationStatus.context.correlation_id`
equals the cid the write returned, that op_kind is Write, that
reference is the original tag string, and that
`pending_ops` is now empty (one-shot popped).
This closes the integration-test gap the user flagged. Live AVEVA
verification still falls under F49.
Workspace 823 → 824 tests; clippy + rustdoc clean.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
4ff511bbed |
[F54] per-operation correlation + compat OnWriteComplete fan-out
Closes the residual that R3/R4 Path A's commit `c73a33e` deferred:
the OperationStatus.context field was always None because no
in-flight correlation map existed in SessionInner, and the
mxaccess-compat broadcast channels for OnWriteComplete /
OperationComplete were exposed on the public API but had no
fan-out task draining session events into them.
**mxaccess (Part 1 — per-operation correlation):**
- New `pending_ops: Mutex<HashMap<[u8; 16], OperationContext>>` on
SessionInner. Populated when `Session::write*` / `subscribe*`
dispatches an outstanding operation; entry removed when the
matching OperationStatus event fires (one-shot semantics).
- New `Session::write_with_handle` (and equivalents for the secured /
timestamped paths) returns a `WriteHandle { correlation_id }` so
consumers can correlate completions back to their originating
call. Existing `write` / `write_value` / etc. signatures unchanged
and delegate to the handle-returning variant.
- Callback router extended to look up `pending_ops` by correlation_id
on each operation-status event. When found, populates
`OperationStatus.context: Some(OperationContext { correlation_id,
op_kind, reference, retry_count: 0 })`. When not found, falls
through with `context: None` (verbatim-preserve per CLAUDE.md).
- New unit tests assert: matching correlation_id populates context,
unknown correlation_id leaves context None, the entry is removed
from `pending_ops` after one event fires.
**mxaccess-compat (Part 2 — compat-layer fan-out):**
- New `correlation_to_item: tokio::sync::Mutex<HashMap<[u8; 16], i32>>`
on LmxClientInner.
- `LmxClient::write` / `write_2` / `write_secured` / `write_secured_2`
call `Session::write_with_handle` (or equivalent) and insert
`correlation_id → item_handle` into the map before returning.
- `LmxClient::register` / `register_asb` spawn a background task that
drains `session.operation_status_stream()`. Per event, looks up
`correlation_to_item[event.context?.correlation_id]` to find the
item_handle, then routes:
- `OperationKind::Write` / `OperationKind::WriteSecured` →
`WriteCompleteEvent { server_handle, item_handle, statuses,
is_during_recovery }` into `on_write_complete_tx`.
- Other variants → `OperationCompleteEvent { ... }` into
`on_operation_complete_tx`.
- Removes the correlation_id from `correlation_to_item` after
firing (one-shot).
- Events with no matching item_handle (correlation_id not in map)
are dropped silently — no bogus item_handle=0 events.
- Task cancelled on LmxClient drop via `JoinHandle::abort` (matches
the existing `subscription_task` pattern).
- New unit tests cover: Write op routes to on_write_complete, Read
op routes to on_operation_complete, unknown correlation_id is
dropped.
Result: the C# `LMX_OnWriteComplete(int hLMXServerHandle, int
phItemHandle, ref MXSTATUS_PROXY[] pVars)` callback shape is now
end-to-end-achievable. A consumer calls `LmxClient::write(hServer,
hItem, value, userId)` and drains `client.on_write_complete()`; the
yielded `WriteCompleteEvent` carries the right `(server_handle,
item_handle, statuses, is_during_recovery)` tuple.
Public API: `Session::write_with_handle` + `WriteHandle` are new;
existing signatures unchanged. `cargo public-api` baselines
regenerated under `design/public-api/{mxaccess,mxaccess-compat}.txt`.
Workspace: 765 → 823 tests pass (~58 new tests from F54). Clippy
`-D warnings` clean. Rustdoc `-D warnings` clean.
F54 status in `design/followups.md` moved Open → Resolved.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
f98ab9846d |
design/70-risks: record the .NET reference's WriteCompleted half-implementation
R3's verdict gains an aside documenting why the original native MxAccess `OnWriteComplete` event has historically only fired for the one exact 5-byte pattern `00 00 50 80 00` (= `MxStatus.WriteCompleteOk`). Verified at: - `src/MxNativeClient/MxNativeCompatibilityServer.cs:756` — `if (!evt.Message.IsMxAccessWriteComplete) return;` gates the consumer-facing `WriteCompleted` event. - `src/MxNativeCodec/NmxOperationStatusMessage.cs:18` — `IsMxAccessWriteComplete` requires `Format == StatusWord && StatusCode == 0x8050 && CompletionCode == 0x00`. Every other completion frame is silently dropped — the 1-byte `0x00`/`0x41`/`0xEF` ones, plus any non-success status word. This was the underlying reason R3/R4 looked unsolvable for a year: the answer "we don't know how to map" was actually "the native compatibility shim deliberately doesn't map these because firing typed failure events on ambiguous bytes was never a goal." Path A's `MxStatus::from_packed_u32` (commit `c73a33e`) closes the asymmetry on the Rust side: `Session::operation_status_events()` exposes ALL typed outcomes the upstream synthesizer produces, not just the WriteCompleteOk slice. The Rust port now has strictly broader operation-status visibility than the .NET reference offered. Recorded so future contributors don't re-derive this from scratch. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
c73a33edd8 |
[R3/R4 Path A] mxaccess: port Lmx.dll FUN_10100ce0 synthesizer kernel
Path A landed for R3/R4. The byte->MxStatus synthesizer in Lmx.dll is
FUN_10100ce0 (`analysis/ghidra/exports/Lmx.dll.synthesizer-helpers2-decompile.md`),
a 4-byte u32 LE -> 4-tuple MxStatus decoder used by every NMX-frame
parser in Lmx.dll. The kernel is byte-deterministic and context-free,
so it ports as a pure function -- the operation-tracking state
machine the original verdict deferred is NOT required for synthesis.
Bit layout (per FUN_10100ce0 lines 21-24):
bit 31: success (-1 if set, 0 if clear)
bits 27..24: category (4 bits)
bits 23..20: detected_by (4 bits)
bits 15..0: detail (i16 -- low 16 bits, signed)
bits 30..28, 19..16: reserved/padding
Codec changes:
- MxStatus::from_packed_u32() / ::to_packed_u32() -- the kernel +
inverse for round-trip parity.
- MxStatus::from_nmx_response_code() -- the constructed-from-response-
code switch in FUN_1010bd10:741-770 (six proven mappings: 0x01, 0x02
-> CommunicationError + RequestingNmx; 0x03 -> ConfigurationError +
RequestingNmx; 0x04 -> ConfigurationError + RespondingNmx; 0x05 ->
CommunicationError + RespondingNmx; 0x1A -> CommunicationError +
RequestingNmx).
- MxStatusCategory / MxStatusSource: from_i16/to_i16 promoted to const
fn so MxStatus::from_packed_u32 can be const.
- NmxOperationStatusMessage::try_parse_process_data_received_body() --
thin wrapper that peels the outer NmxObservedEnvelope before
delegating to try_parse_inner. Mirrors
NmxOperationStatusMessage.TryParseProcessDataReceivedBody (.NET cs:20-32).
- NmxOperationStatusMessage::promote_to_typed() -- entry point that
returns the existing Status field. Documented as a no-op pass-through
for now (the 5-byte inner-body wire shape is NOT the same field as
the 4-byte packed-u32 the kernel decodes); kept for API symmetry.
- 22 new round-trip tests covering the kernel, the response-code
switch, the proven 0x00/0x41/0xEF completion bytes, and round-trip
for every canonical sentinel.
mxaccess (Session) changes:
- New OperationKind enum (Write/WriteSecured/Read/Subscribe/
Unsubscribe/Activate/Suspend/Other).
- New OperationContext struct (correlation_id, op_kind, reference,
retry_count) -- ground for the F54 follow-on per-operation
correlation work.
- New OperationStatus event type {raw, status, context,
is_during_recovery}, mirroring MxNativeOperationStatusEvent (cs:73-78)
with the typed-MxStatus addition.
- Session::operation_status_events() -> broadcast::Receiver<Arc<
OperationStatus>> + operation_status_stream() Stream variant.
- callback_router() now tries operation-status parsing first, falling
through to subscription messages -- matches MxNativeSession
.OnCallbackReceived dispatch order (cs:574,582,590).
- recover_connection() flips a recovery_active counter (Arc<AtomicU32>
shared with the router) so OperationStatus.is_during_recovery is
populated correctly. Mirrors MxNativeSession._recoveryActive
Volatile.Read at cs:573.
- 3 new router tests covering: status-word frame dispatch + typed
promotion to WriteCompleteOk; completion-only frames stay verbatim;
is_during_recovery is stamped from the live counter.
Per-operation context tracking (correlating completion frames back to
outstanding writes/subscribes via the correlation_id) is filed as F54
in design/followups.md. The synthesizer kernel itself is byte-
deterministic, so the kernel and the correlation work are decoupled.
Ghidra evidence (the next-ring xref walk beyond FUN_10114a90):
- analysis/ghidra/exports/Lmx.dll.set-attribute-result-xrefs.md --
xrefs to OnSetAttributeResult / CancelWithStatus / OperationComplete.
- analysis/ghidra/exports/Lmx.dll.vtable-data-xrefs.md -- vtable-slot
data xrefs for the virtual-dispatch path.
- analysis/ghidra/exports/Lmx.dll.synthesizer-decompile.md --
ScanOnDemandCallback::OperationComplete/MultipleOperationComplete
(FUN_1010b990), RemotePlatformResolver::OperationComplete
(FUN_1010dc80), and the constructed-from-responseCode synthesizer
in FUN_1010bd10 (lines 698-770). FUN_1010bd10 is the wire-frame
receiver that drives the synthesis.
- analysis/ghidra/exports/Lmx.dll.synthesizer-helpers-decompile.md --
FUN_10003fc0 (the <success %d category %d ...> formatter; confirms
the 4-tuple layout), FUN_1008f150 (dispatch helper).
- analysis/ghidra/exports/Lmx.dll.synthesizer-helpers2-decompile.md --
FUN_10100ce0 (the kernel itself), FUN_10100bc0 (3xu16 reader),
FUN_1005e580 (4-byte stream reader), FUN_1010ee00 (sister NMX-frame
parser using the same kernel).
- analysis/ghidra/exports/Lmx.dll.synthesizer-callers-xrefs.md --
caller graph; confirms the kernel is called from many wire-frame
parsers but each parser shares the single 4-byte decoder.
R3/R4 verdict updated in design/70-risks-and-open-questions.md from
"settled at verbatim-preserve" to "settled per Path A". F54 filed in
design/followups.md for the per-operation correlation work.
cargo build / test / clippy -D warnings / RUSTDOCFLAGS=-D warnings doc
all clean. cargo public-api baselines regenerated for mxaccess and
mxaccess-codec.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
460c61df43 |
[R3/R4] Path-A trace: synthesizer is in Lmx.dll's NMX-frame decoder
Five-stage Ghidra headless decompile traces the byte-to-MXSTATUS_PROXY synthesis path end-to-end across LmxProxy.dll and Lmx.dll. New evidence files committed alongside R3/R4 verdict update: - analysis/ghidra/exports/LmxProxy.dll.fire-event-xrefs.md - analysis/ghidra/exports/LmxProxy.dll.status-synthesis-decompile.md - analysis/ghidra/exports/LmxProxy.dll.mxstatus-safearray-decompile.md - analysis/ghidra/exports/Lmx.dll.set-attribute-result-decompile.md Layer-by-layer findings (bytes flow inward; synthesis flows outward): 1. `Lmx.aaDCT` at 0x10178fc0 is `SysAllocString(L"Lmx.aaDCT")` — a tracing category BSTR, not a table. 2. `MXSTATUS_PROXY` is a 16-byte marshalled struct (4 × i16 padded to i32 boundaries with Pack=4) — the OUTPUT of synthesis, not a lookup entry. 3. `LmxProxy.dll` Fire_* event handlers receive already-populated `MXSTATUS_PROXY[]` and forward through ATL dispatch — no synthesis. 4. `LmxProxy.dll` Fire_* CALLERS (FUN_1001657f / FUN_10016b50 / FUN_10016d4b) call FUN_10003f60(out_safearray, in_status_ptr, count=1) which is a VERBATIM memcpy of an existing 14-byte buffer into the SAFEARRAY — no transformation. 5. `Lmx.dll`'s `PreboundReference::OnSetAttributeResult` (FUN_10114a90) receives an already-populated `short *param_7` status buffer. Log line confirms the layout: `<success %d category %d detectedBy %d detail %d>`. Dispatches on typed values — synthesis is upstream of this function too. The synthesizer is the NMX-frame decoder in Lmx.dll that calls OnSetAttributeResult / OnGetAttributeResult / equivalent OperationComplete handler. The decoder takes raw NMX bytes plus operation context (item handle, engine state, retry state, correlation id) and computes the populated MXSTATUS_PROXY. There is NO static lookup table — synthesis is per-message contextual. Two viable paths to typed promotion (both substantial; neither a small codec patch): - Path A: port the synthesizer. ~1-2 weeks. Requires extending the Rust session to track per-operation context (handles, retries, correlation ids). Out of V1 scope. - Path B: empirical capture pairs. ~30 min × 6-10 scenarios. Output is a (byte, context → status) mapping that approximates without re-implementing. Risk: mapping is only valid for captured contexts. R3/R4 stay settled at verbatim-preserve. The .NET reference does the same for the same reason: the synthesizer is too context- dependent to mirror without porting the entire operation-tracking state machine. Reopen criteria sharpened: either (a) a consumer files a concrete use case for typed promotion of a specific byte+context combination (Path B's empirical capture for that one combination is the cheapest answer); or (b) a major-version bump justifies the state-machine port (Path A). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
4dfc0cee65 |
[R3 + R4 + R8] settle protocol-level risks per Ghidra evidence
Ghidra headless decompile of `Lmx.dll`'s `aaDCT` symbol + the `LmxProxy.dll` Fire_* event handlers (logs at `analysis/ghidra/exports/Lmx.dll.aadct-decompile.md` and `analysis/ghidra/exports/LmxProxy.dll.completion-status-decompile.md`) settles **R3** and **R4** as "no static byte→status lookup table exists": - `Lmx.aaDCT` at `0x10178fc0` is a `SysAllocString(L"Lmx.aaDCT")` into a global BSTR — a logging category name, not a table. - `MXSTATUS_PROXY` is a 4-field struct (success/category/detectedBy/ detail), used as the marshalled COM event payload — not a static array of pre-mapped statuses. - `Fire_OnDataChange` / `Fire_OnWriteComplete` / `Fire_OperationComplete` / `Fire_OnBufferedDataChange` (RVAs 0x15f72, 0x1611f, 0x16271, 0x163c0 in `LmxProxy.dll`) receive ALREADY-POPULATED `MXSTATUS_PROXY[]` arrays — the byte-to-struct synthesis happens upstream in the proxy's NMX-callback ingestion code, not via a table lookup. The synthesis is per-event computation from operation context (engine ids, item handles, retry counters), not a static promotion. R3/R4 status updated from "indefinitely deferred — no Ghidra table" to "settled — no table exists; verbatim preservation is the canonical answer." The .NET reference's `NmxOperationStatusMessage.TryParseInner` + the Rust port's `mxaccess-codec/src/operation_status.rs` already match this canonical behaviour; no code change required. Reopen R3/R4 only if a context-aware capture surfaces a per-byte synthesis logic that depends on operation context — at which point the codec would need access to the originating operation's context, which is upstream of the bytes themselves. **R8** marked permanently deferred — implementation already parses NTLM AV pairs per [MS-NLMP] §2.2.2.1 (including the cross-domain shapes `MsvAvDnsTreeName` / `MsvAvDnsComputerName` carrying the trusted-domain DNS suffix), what's missing is the live capture, and the live capture requires a multi-domain Windows lab not available on this dev host. Same status pattern as F3 in `design/followups.md`. Open evidence gaps table updated to reflect: - Cross-domain NTLM: deferred (R8) - Ghidra mapping table for completion-only bytes: no table exists (R3/R4 settled) - Activate/Suspend transition (wire): partial (F44 + F46), live re-run pending (F50) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
0e93e3a8fa |
design/followups: file F48-F53 for known V1 residuals
After the M6 closeout sweep (F35-F47 all resolved), six residuals remain that were either documented inline in design docs or implied by closure language but not formally tracked as F-numbers. File them explicitly so the next iteration has a clean starting list: - F48: Execute `cargo publish` for V1 (F43 was dry-run only). Documents the 9-crate dependency-ordered publish sequence + the version bump from 0.0.0 placeholder to 0.1.0. - F49: Live verification sweep for F36 (buffered subscribe) + F45 (recovery replay) + F47 (unsubscribe skip) + F40 (metrics). Closes the gap where these features ship with unit tests but weren't live-exercised against AVEVA in the closing iteration. - F50: Run the F46 Suspend/Activate Frida capture live (script ready, capture deferred to maintainer-side). - F51: Live type-matrix expansion for `asb-subscribe` — Bool / Float / Double / String / DateTime / Duration / arrays. F32 was closed via "deployable maximum" but the codec supports more types than the live matrix exercises. - F52: Codec performance optimisations from F39 (BytesMut output buffer, name-signature cache, session scratch pool). Documented as post-V1 in M6-bench-baseline.md; filing them as F-numbers so the alloc-count deltas are tracked when they land. - F53: Enable `#![warn(missing_docs)]` workspace-wide. Deferred from F42 — the lint surfaces hundreds of low-priority gaps that need a dedicated pass. R3, R4, R8, F3 already in their respective tracking docs (the risks register + the Open section's permanently-external-blocked entry). Open section now contains: F3 (permanently external-blocked) + F48-F53 (V1-residual triage). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
25befcb72e |
design/followups: move F45 + F47 to Resolved (M6 + spawned closures)
F45 (commit |
||
|
|
1a1830f3bf |
[F47] mxaccess: unsubscribe skips UnAdvise for buffered subscriptions
Mirrors the .NET reference's `if (!subscription.IsBuffered)` guard
at `MxNativeSession.cs:361-381`. The Rust port previously emitted an
`UnAdvise` frame for both plain and buffered subscriptions; the
buffered server-side registration is unwound by the engine when the
`RegisterReference` handle goes away, so emitting an `UnAdvise` for
buffered entries is at best a no-op extra frame and at worst could
race with the engine's own teardown.
Fix: branch `Session::unsubscribe` on `SubscriptionEntry::mode` (the
discriminator F45 added). For `SubscriptionMode::Buffered { ... }`,
skip the `un_advise` call and proceed directly to registry cleanup.
For `SubscriptionMode::Plain`, retain the previous behaviour.
The registry-entry probe runs first (separate lock acquisition) so
the `is_buffered` decision doesn't hold the NMX-client mutex
unnecessarily — common case where the entry is plain still acquires
the NMX lock immediately after.
The metrics counter `record_unadvise()` still fires on every public
`unsubscribe` call regardless of mode — it tracks consumer-side
unsubscribe rate, not wire-frame rate. That matches what dashboards
expect from the public API.
New unit test `unsubscribe_skips_un_advise_for_buffered_subscription`
issues a plain subscribe (recorded as 1 RPC), mutates the registry
entry to `SubscriptionMode::Buffered`, calls unsubscribe, and
asserts the recorded RPC count stays at 1 (no UnAdvise emitted).
The existing `subscribe_populates_registry_unsubscribe_clears_it`
test serves as the negative control for the plain branch.
Workspace 794 → 795 tests; clippy clean; rustdoc clean.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
9b57cf8f3b |
[F45] mxaccess: recovery replay re-issues RegisterReference for buffered subs
`Session::recover_connection_core` previously walked
`SessionInner::subscriptions` and replayed every entry via
`AdviseSupervisory`, which lost the `.property(buffer)` registration
on buffered subscriptions — silently downgrading buffered → plain on
transport rebuild.
Fix:
- New `pub(crate) enum SubscriptionMode { Plain, Buffered { ... } }`
discriminator carried on each `SubscriptionEntry`. Buffered variant
retains the un-suffixed reference + the rounded interval (so the
re-issued buffered registration matches the original cadence) +
the empty `item_context` / zero `item_handle` matching the wire
send.
- `Session::subscribe` (plain path) records `SubscriptionMode::Plain`.
`subscribe_buffered_nmx` records `SubscriptionMode::Buffered { ... }`.
- `recover_connection_core` matches on `entry.mode`. Plain branch
unchanged. Buffered branch re-applies `.property(buffer)` via
`to_buffered_item_definition` (idempotent), rebuilds the original
`NmxReferenceRegistrationMessage` with the saved correlation id +
`subscribe = true`, and dispatches `register_reference` (kind=
ItemControl, inner command 0x10) against the replacement
transport. Mirrors `MxNativeSession.ReAdviseSubscription`
(`MxNativeSession.cs:538-569`).
New unit test `recover_connection_replays_buffered_subscription_via_
register_reference` synthesises a buffered registry entry, installs a
`RebuildFactory` pointing at a recording NMX server, drives
`recover_connection`, then asserts the recorded `TransferData` carries
inner command `0x10` (NOT `0x1f`) with the `.property(buffer)`-
suffixed item_definition + the saved correlation id + subscribe=true.
Side-finding worth filing separately: `Session::unsubscribe`
unconditionally calls `un_advise` for both plain and buffered
entries, but the .NET reference's `Unsubscribe`
(`MxNativeSession.cs:361-381`) skips `UnAdvise` for buffered
(`if (!subscription.IsBuffered)`). Out of scope for F45 (recovery-
only); will file as F47.
Public API unchanged. `SubscriptionMode` + `SubscriptionEntry` stay
`pub(crate)` — `cargo public-api -p mxaccess` baseline is unchanged.
Workspace 793 → 794 tests; clippy clean; rustdoc clean.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
2281309a86 | design/followups: move F46 to Resolved (Frida hooks landed) | ||
|
|
808fea18a0 |
[F46] analysis/frida: Suspend/Activate hooks + R5 next-step
Closes the wire-side gap left by capture 077 in F44's R5 walk. The Frida
script now hooks the production LmxProxy.dll dispatchers so a future live
re-run on the AVEVA host can answer "does CLMXProxyServer issue a separate
ORPC method for Suspend/Activate, or are they synthesised client-side?"
Hooks added in `analysis/frida/mx-nmx-trace.js`:
- `LmxProxy.dll!CLMXProxyServer.Suspend` @ RVA 0x13d9c (FUN_10013d9c)
- `LmxProxy.dll!CLMXProxyServer.Activate` @ RVA 0x14028 (FUN_10014028)
Both RVAs were extracted from
`analysis/ghidra/exports/LmxProxy.dll.string-refs.tsv` rows 119/122 (the
`CLMXProxyServer::Suspend - Server Handle` / `Activate - Server Handle`
log strings each xref one function — same pattern as the existing
AdviseSupervisory hook at 0x142b4). The hooks emit `mx.suspend.begin/end`
and `mx.activate.begin/end` events with serverHandle, itemHandle, and the
`MxStatus*` out parameter decoded as 4 x int16 (Success / Category /
DetectedBy / Detail per `src/MxNativeCodec/MxStatus.cs`). Naming matches
the F46 spec's `mx.<verb>.begin / end` grep convention rather than the
generic `call.enter / leave` shape because we want to filter these out
of large traces without false positives from other LmxProxy entrypoints.
No `Resume` / `Reactivate` exports exist in `LmxProxy.dll` — verified
against `analysis/ghidra/exports/LmxProxy.dll.ghidra.md` (no such string
xrefs) and the decompiled `ILMXProxyServer5` / `ILMXProxyServer4`
interfaces under `analysis/decompiled-mxaccess/ArchestrA/MxAccess/`
(only Suspend and Activate are declared on the dispatch interface).
The script's top-of-file comment now carries the live re-run procedure
(rebuild MxTraceHarness x86, attach Frida with `--scenario=suspend-advised`
then `--scenario=activate-advised`, save under
`captures/NNN-frida-suspend-activate-instrumented/`, grep the new TSV for
`mx.suspend.*` / `mx.activate.*` and correlate with `nmx.enter` events
in the same time window). Live capture is intentionally deferred to the
maintainer per the F46 spec — this dev box has no AVEVA install.
`design/70-risks-and-open-questions.md` R5 status updated:
- Title flag `(filed as F45)` -> `(filed as F46, hook landed pending live re-run)`
(the docs/M6-buffered-evidence.md footnote referenced F45 from before
F45 / F46 were de-conflicted by commit
|
||
|
|
c7e71e4424 |
design/followups: move F41 + F43 to Resolved (M6 complete)
All 10 M6 sub-followups (F35-F44 minus the ones absorbed into F44) plus F41 + F43 are now resolved. Open section narrows to: - F45: buffered recovery replay (sub-followup of F36) - F46: Suspend/Activate wire emission (sub-followup of F44) - F3: cross-domain NTLM fixture (permanently external-blocked) M6 closeout: see CHANGELOG.md for the V1 release notes. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
7b15c853d1 |
[F43] release prep: CHANGELOG + cargo publish --dry-run validation
V1 release prep per M6 DoD bullet 6: **`CHANGELOG.md`** — V1 release notes covering all 9 workspace crates (`mxaccess-codec`, `mxaccess-rpc`, `mxaccess-asb-nettcp`, `mxaccess-asb`, `mxaccess-galaxy`, `mxaccess-callback`, `mxaccess-nmx`, `mxaccess`, `mxaccess-compat`), the M0 → M6 milestone closeouts, deliberate divergences from the .NET reference (multi-record DataUpdate codec relaxation per F44; buffered single- sample stream per R2), and known limitations (F3 cross-domain NTLM, F45 buffered recovery replay, F46 Suspend/Activate wire instrumentation, R3/R4 OperationComplete trigger). Documents the dependency-ordered publish sequence (leaf crates first; dependent crates require their deps to exist on crates.io before their dry-runs can run). **`cargo publish --dry-run` validation:** - Leaf crates (mxaccess-codec, mxaccess-rpc, mxaccess-asb-nettcp): dry-run passes — tarball builds, metadata complete, license/ description/repository/rust-version all present via `workspace.package`. - Dependent crates (mxaccess-asb, mxaccess-galaxy, mxaccess-callback, mxaccess-nmx, mxaccess, mxaccess-compat): dry-run fails with "no matching package" against crates.io — expected behaviour, the registry lookup happens even with `--no-verify`. Validation of these crates falls to the build-test-clippy-public_api matrix rather than dry-run. `design/followups.md`: F43 moved to Resolved with a verdict pointing at this commit + the CHANGELOG. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
f0c9dd2214 | rust: add version specifiers to workspace path deps for cargo publish | ||
|
|
9e57bfd451 |
[F41 + F44 reconciliation] cargo public-api baselines + multi-record DataUpdate codec
**F41 — public-api baselines (M6 DoD bullet 5)**
`design/public-api/{crate}.txt` for all 9 workspace crates, generated
via `cargo +nightly public-api --simplified -p <crate>`. Per-crate
baseline sizes:
- mxaccess-codec: 2516 lines
- mxaccess-asb: 1258 lines
- mxaccess-rpc: 1273 lines
- mxaccess-asb-nettcp: 708 lines
- mxaccess: 542 lines
- mxaccess-galaxy: 374 lines
- mxaccess-callback: 170 lines
- mxaccess-compat: 123 lines
- mxaccess-nmx: 118 lines
`design/public-api/README.md` documents the update procedure
(install nightly + cargo-public-api, regenerate the affected baseline
on intentional API changes, commit alongside).
`.github/workflows/rust.yml` gains a `public-api` job that runs the
same diff against the committed baseline; drift fails CI with a
unified diff in the log so the PR author can either revert or
update the baseline.
**F44 reconciliation — multi-record DataUpdate codec**
Cherry-picked from the F44 sub-agent's worktree (commit `aec6a0c`):
`subscription_message.rs::parse_data_update` now loops over
`record_count` like `parse_subscription_status` does, accepting any
positive count. The .NET reference still hard-throws on
`record_count != 1`; the Rust codec deliberately diverges per the F44
evidence walk against `captures/094-frida-buffered-separate-writer/
frida-events.tsv:145` (a `0x33` DataUpdate body with `record_count = 2`,
inner_length = 23 (preamble) + 2 * 19 (records) = 61, post a
separate-session writer triggering two value changes inside one
`SetBufferedUpdateInterval(1000)` window).
Two new round-trip tests:
- `data_update_multi_record_round_trip` — synthesises a 2-record body,
parses, asserts both records decode to expected Int32 values.
- `data_update_capture_094_truncated_record_errors` — truncates the
capture-094 fixture mid-second-record, asserts CodecError::Decode.
New wire-byte fixtures under `crates/mxaccess-codec/tests/fixtures/m6-buffered/`:
- `094-line145-dataupdate-recordcount2.bin` (57 bytes, `0x33` multi-record)
- `094-line48-substatus-recordcount2.bin` (101 bytes, `0x32` multi-record)
R2 in `design/70-risks-and-open-questions.md` updated from
"single-sample (settled silently)" to "settled per option (a) — codec
relaxed; multi-record observed in production-stack tracing."
`design/followups.md`: F44's verdict updated to reflect the
contradiction-then-relaxation, with reference to the new tests +
fixtures.
Workspace 792 → 794 tests pass; clippy clean; rustdoc clean.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
2120dfa965 |
design/followups: move F35/F40/F44 to Resolved + de-conflict F45/F46
rust / build / test / clippy / fmt (push) Has been cancelled
After commits |
||
|
|
ad1cf2351c |
[F36 + F40 + F44] M6 wave 1: subscribe_buffered (NMX) + metrics + evidence
Three M6 sub-followups landed in this wave (sub-agent worktrees +
manual reconciliation in main):
**F36 — Session::subscribe_buffered (NMX) per R2 single-sample**
- `BufferedOptions::rounded_update_interval_ms()` — 100ms rounding
helper mirroring MxNativeCompatibilityServer.cs:638
((updateInterval + 99) / 100) * 100, saturating on overflow.
- `Session::subscribe_buffered` (public, lib.rs:604) delegates to
the new private `subscribe_buffered_nmx` which uses the buffered
RegisterReference path: item_definition suffixed with
`.property(buffer)`, subscribe=true (no separate
AdviseSupervisory follow-up — verified against capture 082).
- Per R2 verified at wwtools/mxaccesscli/docs/api-notes.md the wire
semantic is single-sample-per-event with a server-side cadence
knob; rounded_ms is held client-side only (native MXAccess does
not emit a separate SetBufferedUpdateInterval RPC, verified by
absence in 079/082 captures).
- New crates/mxaccess/examples/subscribe-buffered.rs.
- New crates/mxaccess-codec/tests/buffered_register_reference_parity.rs:
4 tests (capture 079/082 round-trip, suffix helper, constructive
forward-build vs capture 082).
**F40 — Optional metrics feature**
- New crates/mxaccess/src/metrics.rs (275 lines): `pub(crate)`
thin wrappers (`record_write_latency`, `record_read_latency`,
`inc_writes`, `inc_reads`, `inc_advises`, `inc_recovery_*`,
`set_active_subscriptions`, etc.) that compile to no-ops under
`#[cfg(not(feature = "metrics"))]`. Call sites in session.rs +
asb_session.rs invoke them unconditionally; the gate is inside
the wrapper.
- `metrics = { version = "0.24", optional = true }` added to
workspace + mxaccess crate Cargo.toml.
- Default build: zero metrics dep, zero runtime cost.
**F44 — Buffered batch + suspend capture decode evidence**
- New docs/M6-buffered-evidence.md: per-capture summary for
077, 079, 080, 081, 082, 094 — call sequence, key wire bytes,
R2/R5 verdict.
- R2 confirmed silently as "not a real risk" — single-sample
observed across 079/080/082/094.
- R5 trigger conditions documented from capture 077: AdviseSupervisory
+ Suspend pair, 1-second intervals, succeeds on enum attributes.
- design/70-risks-and-open-questions.md R2/R5 status updated.
Workspace: 759 → 792 tests, clippy clean, rustdoc -D warnings clean.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
d5aa152b1f |
[F35] mxaccess-compat: LMXProxyServer-shaped facade (18 methods)
Replace the 8-line `mxaccess-compat` stub with a real `LmxClient` struct exposing the 18 `ILMXProxyServer5` methods as Rust async fns on top of `mxaccess::Session` (NMX) and `mxaccess::AsbSession` (ASB). Handle-table approach * `Mutex<HashMap<i32, ItemRef>>` for item handles, populated by `add_item` / `add_item_2` / `add_buffered_item`, drained by `remove_item` / `unregister`. * `Mutex<HashMap<i32, UserRef>>` for user handles allocated by `authenticate_user` / `archestra_user_to_id`. * `AtomicI32` monotonic counters for both, matching the .NET reference's `_nextItemHandle` / `_nextUserHandles` per `MxNativeCompatibilityServer.cs:62-63`. Stream-based event surface (per Q4) * `OnDataChange` / `OnBufferedDataChange` / `OnWriteComplete` / `OperationComplete` exposed as `EventStream<T>: Stream<Item=T>`, backed by `tokio::sync::broadcast` channels. Lag silently skips past `BroadcastStream::Lagged` to keep the public `Item` shape ergonomic. NOT COM events — that's the post-V1 `mxaccess-compat-com` crate per design/70-risks-and-open-questions.md Q4. The `OperationComplete` channel is wired but no firing path is modelled (R3 deferred — no captured byte mapping yet). * `Advise` / `AdviseSupervisory` spawn a background fan-out task that drains the `Subscription` stream and routes each `DataChange` to either `on_data_change` or `on_buffered_data_change` based on the item's `is_buffered` flag. `UnAdvise` / `RemoveItem` abort the task. Pass-through methods * `Write` / `Write2` -> `Session::write` / `write_with_timestamp` (`userId` ignored — the underlying surface uses engine identity). * `WriteSecured2` -> `Session::write_secured_at` with both user ids always passed (R6: single-user secured = same id twice; never gated). * `AdviseSupervisory` collapses onto `Session::subscribe` because the wire path is `AdviseSupervisory` already (`session.rs:1057`), matching the .NET reference's `cs:251-259` identical collapse. * `SetBufferedUpdateInterval` rounds up to nearest 100 ms per `MxNativeCompatibilityServer.cs:638`. Stubbed pass-throughs (mirror upstream `Error::Unsupported`) * `WriteSecured` (no timestamp) — `Session::write_secured` is stubbed at `crates/mxaccess/src/lib.rs:472` (only `WriteSecured2`/`0x3A` is ported); workaround documented inline. * `AddBufferedItem` allocates the handle but `Advise` for buffered items does not yet drive `Session::subscribe_buffered` cadence knob — TODO(F36) flagged inline at `add_buffered_item` and `set_buffered_update_interval`. Tests (25 new, all green) * Handle-table lifecycle: Add -> Advise -> UnAdvise -> Remove with a mocked subscription task. * Monotonic handle allocation; context-prefix combination. * `SetBufferedUpdateInterval` rounding (50 -> 100, 101 -> 200, etc.) + zero-rejection. * Compile-time check that all 18 LMX methods are reachable on `LmxClient`. * Each event stream yields published items; lag silently dropped. * GUID-shape validation; server-handle mismatch errors. Build hygiene * `cargo build -p mxaccess-compat` clean. * `cargo test -p mxaccess-compat` -> 25 passed. * `cargo clippy -p mxaccess-compat --all-targets -- -D warnings` clean. * `RUSTDOCFLAGS=-D warnings cargo doc -p mxaccess-compat --no-deps` clean. Deferred / TODOs * TODO(F36): wire `set_buffered_update_interval` cadence into the `advise` path for buffered items. * TODO(R3): plumb a real trigger into `on_operation_complete` once the byte mapping lands. * TODO(wave 2): live integration tests against AVEVA. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
a1c4c6203e |
design/followups: move F37/F38/F39/F42 to Resolved
rust / build / test / clippy / fmt (push) Has been cancelled
Four M6 sub-followups closed in this session — moved to Resolved section with concise verdicts referencing the matching commits: - F37 (commit |
||
|
|
71c69b80c6 |
[F38] mxaccess-codec: counting-allocator bench harness + R12 baseline
Hand-rolled GlobalAlloc wrapper around System that tracks allocs +
bytes + deallocs via two atomics. Each scenario runs 10k iterations
after a 1k warm-up; output is a markdown table with allocs/op,
bytes/op, deallocs/op.
Why hand-rolled (not dhat/criterion): R12 gates on a single number
("< 5 allocs/write"). dhat is heap-profiling-oriented (call-stack
attribution, JSON snapshots); criterion measures wall-clock latency
which is reported-but-not-gated per 60-roadmap.md:104. A 50-line
GlobalAlloc + atomic counters is the simplest thing that answers
the gate.
Run: `cargo bench -p mxaccess-codec`
Baseline numbers (release, Windows x64):
- Bool write: 1.00 allocs/op
- Int32 write: 2.00 allocs/op
- Float32 write: 2.00 allocs/op
- Float64 write: 2.00 allocs/op
- String write: 4.00 allocs/op (5-char string)
- Handle from_names: 2.00 allocs/op
- DataUpdate decode: 1.00 alloc/op
R12's < 5 allocs/write target is **already met** across the proven
matrix without any zero-copy work. The bench gates on this — any
write_message::encode scenario at >= 5 allocs/op exits the harness
with code 1.
Companion: `design/M6-bench-baseline.md` documents the numbers,
explains the per-scenario breakdown, and tightens F39's scope from
"hit the target" to "nice-to-have optimisations" (BytesMut output
buffer, name-signature cache, session-level scratch pool).
Workspace: 759 tests still pass; clippy --benches clean.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
e79e289743 |
[F42] cargo doc --workspace --no-deps clean (0 warnings)
Fix all 33 rustdoc warnings across the workspace: - Unresolved intra-doc links: rewrite [`name`] → either backtick text (when not actually a link) or fully-qualified `[Type::method]` / `[crate::module::name]` form. Affected: mxaccess-codec (asb_variant, item_control, metadata_query, observed_write_template, reference_handle, write_message), mxaccess-rpc (pdu), mxaccess-nmx (client), mxaccess-asb-nettcp (nmf), mxaccess-callback (exporter), mxaccess (asb_session, session, lib). - Bracket-text being interpreted as link refs (e.g. `body[17]` → `` `body[17]` ``). - Private-item references in public docs (CALLBACK_BROADCAST_CAPACITY, recover_connection_core, mxvalue_to_writevalue) reduced to backtick-text since they aren't part of the public API. `RUSTDOCFLAGS="-D warnings" cargo doc --workspace --no-deps` now exits clean. Workspace 759 tests pass; clippy clean. Defers `#![warn(missing_docs)]` lint to a future pass — the cleanup target is the broken-link warnings, which are signal; missing-docs would surface hundreds of low-priority public-item gaps that are out of scope for this F-number. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
34045c2f6d |
[F37] mxaccess: AsbSession::subscribe_buffered returns Unsupported
ASB has no `SetBufferedUpdateInterval` analogue — the per-monitored-
item `MinimalMonitoredItem::sample_interval` plays the cadence-knob
role. Calling `subscribe_buffered` on an ASB session now returns
`Error::Unsupported { transport: TransportKind::Asb, operation: ... }`
synchronously, without touching the wire.
The error-construction logic is split into a free fn
`unsupported_subscribe_buffered_error()` so the gate's exact shape
is unit-testable without spinning up a live authenticator + transport
fake. New unit test asserts both the variant tag and that the
operation message names the unsupported method + hints at the
`sample_interval` analogue.
Workspace 758 → 759 tests, clippy clean.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
2546710604 |
design/followups: add F35-F44 for M6 implementation plan
10 new followups decompose M6 (compatibility shim + production hardening) into parallel-safe sub-streams: - F35: mxaccess-compat LMXProxyServer-shaped facade (18 methods over Session/AsbSession) - F36: Session::subscribe_buffered NMX path per R2 single-sample - F37: ASB subscribe_buffered capability gate - F38: counting-allocator cargo bench harness for R12 target - F39: zero-copy codec pass (depends on F38) - F40: optional metrics feature - F41: cargo public-api baseline (depends on F35/F36/F37/F39/F40) - F42: cargo doc cleanup pass - F43: cargo publish --dry-run all crates (depends on F41) - F44: decode buffered batch + suspend captures (077, 079-082, 094) for R2/R5 evidence Parallelization: Wave 1 = F35/F36/F37/F38/F40/F42/F44 (different crates / different concerns); Wave 2 = F39 (needs F38's bench); Wave 3 = F41 (needs API stable); Wave 4 = F43 (release). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
bedad57b4e |
design/followups: move F18 (M5 meta-tracker) to Resolved
rust / build / test / clippy / fmt (push) Has been cancelled
Trim the planning content (sub-stream breakdown table, parallel-safety analysis, risk-driven sequencing, "Resolves when" gate) since M5 is done. Keep the closure verdict, M5 DoD checklist showing the actual state at close, sub-followup closeout list (F19-F26 + F28/F29/F30/ F31/F32/F33/F34), cumulative execution log, and the architectural note explaining why AsbSession stays parallel to the NMX Session rather than unified — that's load-bearing for future maintenance. Open section now contains only F3 (cross-domain NTLM Type1/2/3 fixture, permanently external-blocked on this single-domain dev host — resolution requires multi-domain Windows lab not available here). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
b1a5f5ff1e |
design/followups: move F34 to Resolved (live-verified closure)
The F34 entry's body had grown into a debugging notebook with five
"Open hypotheses" and a "Resolves when" speculation block — all of
which are now moot since the actual fix landed. Trim to the closure
verdict, the technical evidence (captured fixture dictionary, the
dual-format insight), and the bonus context discovered while
debugging (Active/SampleInterval/result_code 32 quirks). Move from
"## Open" to "## Resolved" with date + commit
|
||
|
|
101a8b13f5 |
[F34] mxaccess-asb: AddMonitoredItems body uses DataContract field names
Rewrite push_monitored_item_body to emit the DataContract field-suffix names from AsbContracts.cs:940-965 (activeField, bufferedField, itemField, sampleIntervalField, timeDeadbandField, userDataField, valueDeadbandField) under prefix `b` bound to the http://schemas.datacontract.org/2004/07/ArchestrAServices.ASBIDataV2Contract namespace. The <Items> wrapper now declares xmlns:b + xmlns:i. The legacy XmlSerializer property names (<Active>, <Item>, <SampleInterval>, <Buffered>) only matter for the canonical-XML HMAC signing input — that emitter at xml_canonical::emit_monitored_item is unchanged and F28 fixture byte-equality still holds for all 13 ops. On the binary NBFX wire MxDataProvider's DataContractSerializer expects the field-suffix form. Wire-byte type encoding matches the captured fixture (add-monitored-items-request-wire.bin): bool → Bool record, ulong → Zero/One/Chars (XmlConvert decimal text), ushort → Zero/One/Int8/Int16/Int32 (smallest-fit binary). Empty string? + null byte[]? emit as empty elements with no <i:nil> attribute (matching the wire). Field order follows the explicit [DataMember(Order = N)] sequence. Adjacent: ItemIdentity is nested via DataContract field names too — NOT the binary <ASBIData> fast-path, which only kicks in at top-level message body members. Verified live against AVEVA MxDataProvider: AddMonitoredItems now returns 1 status item with error_code=0x0000 (previously 0 items; the silent failure was the deliberate DC-schema mismatch); Publish poll #4 delivers the actual tag value as AsbVariant { type_id: 4, length: 4, payload: [99,0,0,0] } through the F26 stream. Pre-existing clippy::format_collect errors in auth.rs:339,342 and client.rs:952 fixed in passing — they were blocking workspace clippy otherwise. Workspace: 757 → 758 tests, clippy -D warnings clean. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
6762526f09 |
design/followups: mark F18 (M5 meta-tracker) resolved
All sub-followups F19-F26 closed; M5 is functionally LIVE end-to-end (asb-subscribe returns the real tag value over the wire). The structural-port followups F18 spawned (F2/F10/F11/F27 for NTLM / DCOM / RemUnknown / DH) all resolved separately. F18 stays under "## Open" as the cumulative-execution-log anchor; status line at the top now reflects the closed state so the open/resolved structure matches reality. Remaining open items: F34 (MonitoredItem wire format, P2 — needs nbfx auto-intern fix + DataContract field-suffix body builders) and F3 (cross-domain NTLM fixture, P2 — permanently external-blocked on this single-domain dev host). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
1de049e114 |
[F2] mxaccess-rpc: NTLM verify_signature (server-to-client) with constant-time MAC compare
rust / build / test / clippy / fmt (push) Has been cancelled
Closes F2. Structural port from [MS-NLMP] §3.4.4 — same shape as
the existing sign path but uses the server-to-client sub-keys
(`SealKey_S→C` / `SignKey_S→C`) derived alongside the client-to-
server pair at the end of create_type3.
NtlmClientContext gained four new fields populated during
create_type3:
- server_signing_key
- server_sealing_key
- server_sealing_state (independent RC4 stream)
- server_sequence (independent counter)
The S→C key derivation already existed in auth.rs (the seal_key /
sign_key helpers take a client_mode flag); F2 plumbs them into a
new verify_signature(message, signature) method.
The verify path:
1. Validates signature.len() == 16 + leading version word 0x01.
2. Reads trailing seq num, compares against self.server_sequence
(mismatch ⇒ InvalidSignature, no state change).
3. Computes expected_mac = HMAC_MD5(server_signing_key,
seq || message)[0..8] then RC4 transform.
4. Constant-time compares expected_mac against wire bytes 4..12
via subtle::ConstantTimeEq.
5. On success: commits cipher-state advance + ++server_sequence.
On failure: re-derives RC4 from server_sealing_key and skips
past server_sequence × 8 keystream bytes to restore the
pre-verify position — caller can retry.
New dep `subtle = "2"` (workspace-internal to mxaccess-rpc) for
the timing-oracle-safe MAC compare.
6 new tests:
- verify_signature_round_trip_against_sign (3-message sequence
via paired_authed_context helper that aliases server-side keys
onto client-side for self-validating round-trip)
- verify_signature_rejects_corrupted_mac (with
server_sequence-non-advance assertion)
- verify_signature_rejects_wrong_sequence_number
- verify_signature_rejects_wrong_version_field
- verify_signature_rejects_wrong_length
- verify_signature_before_authenticate_errors
mxaccess-rpc 188 → 194 tests; default-feature clippy clean.
The "awaiting wire-fixture capture" step listed in F2's prior
status note is no longer a hard prerequisite — [MS-NLMP] §3.4.4
fully defines the algorithm and the round-trip tests prove the
encoder/decoder pair is internally consistent. A captured
StatusReceived frame would still validate byte-parity vs a real
NmxSvc.exe signer, but that's future verification work; the
structural port ships unblocked.
design/followups.md F2 moved to Resolved.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
161b0cedfa |
[F10 + F11] mxaccess-rpc: structural ports for ResolveOxid2 + RemAddRef/RemRelease
rust / build / test / clippy / fmt (push) Has been cancelled
Closes F10 and F11 per option (b) of each followup's resolve
criterion: hand-rolled body codecs derived from the [MS-DCOM]
spec, ship structurally with no live fixture (the .NET reference
doesn't call these opnums), validate against captured frames when
they become available.
F10 — IObjectExporter::ResolveOxid2 (opnum 4):
Per [MS-DCOM] §3.1.2.5.1.4. New parse_resolve_oxid2_result
mirrors parse_resolve_oxid_result exactly except for the extra
4-byte COMVERSION slot (u16 major + u16 minor) between
authn_hint and error_status. Trailing-fields check tightens
from 24 bytes (opnum 0) to 28 bytes (opnum 4). New ComVersion +
ResolveOxid2Result types. referent_id=0 short-circuit returns
empty bindings + default ComVersion + tail-status, matching
opnum 0's pattern.
F11 — IRemUnknown::RemAddRef + RemRelease (opnums 4 and 5):
Per [MS-DCOM] §3.1.1.5.6 + §2.2.19 (REMINTERFACEREF). Both
opnums share the wire shape, so:
- encode_rem_add_ref_request + encode_rem_release_request
both delegate to a shared encode_remref_array_request
helper.
- parse_remref_response is shared between them too — they
have an identical OrpcThat + referent_id + max_count +
N×HRESULT + error_code layout.
New RemInterfaceRef struct (ipid + public_refs + private_refs,
ENCODED_LEN = 24) + RemRefResponse decoded shape.
8 new structural tests across both modules pin every documented
edge of each shape — short stubs, referent-zero short-circuits,
truncated-trailing detection, full multi-element round-trips.
mxaccess-rpc 183 → 188 tests; default-feature clippy clean.
Both followups documented "**Status:** Awaiting wire-fixture
capture" prior to this commit; the structural-port option was
explicitly listed as resolution path (b) in each. Future captured
frames will validate the byte-by-byte match — until then the
port is byte-faithful to the spec but unverified against a live
peer (which is fine for shipping since neither opnum is on the
NMX session's hot path).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|