04c10babfb2ccc6d07a1fcc71d190aa9a6bf522a
84 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
4ff511bbed |
[F54] per-operation correlation + compat OnWriteComplete fan-out
Closes the residual that R3/R4 Path A's commit `c73a33e` deferred:
the OperationStatus.context field was always None because no
in-flight correlation map existed in SessionInner, and the
mxaccess-compat broadcast channels for OnWriteComplete /
OperationComplete were exposed on the public API but had no
fan-out task draining session events into them.
**mxaccess (Part 1 — per-operation correlation):**
- New `pending_ops: Mutex<HashMap<[u8; 16], OperationContext>>` on
SessionInner. Populated when `Session::write*` / `subscribe*`
dispatches an outstanding operation; entry removed when the
matching OperationStatus event fires (one-shot semantics).
- New `Session::write_with_handle` (and equivalents for the secured /
timestamped paths) returns a `WriteHandle { correlation_id }` so
consumers can correlate completions back to their originating
call. Existing `write` / `write_value` / etc. signatures unchanged
and delegate to the handle-returning variant.
- Callback router extended to look up `pending_ops` by correlation_id
on each operation-status event. When found, populates
`OperationStatus.context: Some(OperationContext { correlation_id,
op_kind, reference, retry_count: 0 })`. When not found, falls
through with `context: None` (verbatim-preserve per CLAUDE.md).
- New unit tests assert: matching correlation_id populates context,
unknown correlation_id leaves context None, the entry is removed
from `pending_ops` after one event fires.
**mxaccess-compat (Part 2 — compat-layer fan-out):**
- New `correlation_to_item: tokio::sync::Mutex<HashMap<[u8; 16], i32>>`
on LmxClientInner.
- `LmxClient::write` / `write_2` / `write_secured` / `write_secured_2`
call `Session::write_with_handle` (or equivalent) and insert
`correlation_id → item_handle` into the map before returning.
- `LmxClient::register` / `register_asb` spawn a background task that
drains `session.operation_status_stream()`. Per event, looks up
`correlation_to_item[event.context?.correlation_id]` to find the
item_handle, then routes:
- `OperationKind::Write` / `OperationKind::WriteSecured` →
`WriteCompleteEvent { server_handle, item_handle, statuses,
is_during_recovery }` into `on_write_complete_tx`.
- Other variants → `OperationCompleteEvent { ... }` into
`on_operation_complete_tx`.
- Removes the correlation_id from `correlation_to_item` after
firing (one-shot).
- Events with no matching item_handle (correlation_id not in map)
are dropped silently — no bogus item_handle=0 events.
- Task cancelled on LmxClient drop via `JoinHandle::abort` (matches
the existing `subscription_task` pattern).
- New unit tests cover: Write op routes to on_write_complete, Read
op routes to on_operation_complete, unknown correlation_id is
dropped.
Result: the C# `LMX_OnWriteComplete(int hLMXServerHandle, int
phItemHandle, ref MXSTATUS_PROXY[] pVars)` callback shape is now
end-to-end-achievable. A consumer calls `LmxClient::write(hServer,
hItem, value, userId)` and drains `client.on_write_complete()`; the
yielded `WriteCompleteEvent` carries the right `(server_handle,
item_handle, statuses, is_during_recovery)` tuple.
Public API: `Session::write_with_handle` + `WriteHandle` are new;
existing signatures unchanged. `cargo public-api` baselines
regenerated under `design/public-api/{mxaccess,mxaccess-compat}.txt`.
Workspace: 765 → 823 tests pass (~58 new tests from F54). Clippy
`-D warnings` clean. Rustdoc `-D warnings` clean.
F54 status in `design/followups.md` moved Open → Resolved.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
f98ab9846d |
design/70-risks: record the .NET reference's WriteCompleted half-implementation
R3's verdict gains an aside documenting why the original native MxAccess `OnWriteComplete` event has historically only fired for the one exact 5-byte pattern `00 00 50 80 00` (= `MxStatus.WriteCompleteOk`). Verified at: - `src/MxNativeClient/MxNativeCompatibilityServer.cs:756` — `if (!evt.Message.IsMxAccessWriteComplete) return;` gates the consumer-facing `WriteCompleted` event. - `src/MxNativeCodec/NmxOperationStatusMessage.cs:18` — `IsMxAccessWriteComplete` requires `Format == StatusWord && StatusCode == 0x8050 && CompletionCode == 0x00`. Every other completion frame is silently dropped — the 1-byte `0x00`/`0x41`/`0xEF` ones, plus any non-success status word. This was the underlying reason R3/R4 looked unsolvable for a year: the answer "we don't know how to map" was actually "the native compatibility shim deliberately doesn't map these because firing typed failure events on ambiguous bytes was never a goal." Path A's `MxStatus::from_packed_u32` (commit `c73a33e`) closes the asymmetry on the Rust side: `Session::operation_status_events()` exposes ALL typed outcomes the upstream synthesizer produces, not just the WriteCompleteOk slice. The Rust port now has strictly broader operation-status visibility than the .NET reference offered. Recorded so future contributors don't re-derive this from scratch. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
c73a33edd8 |
[R3/R4 Path A] mxaccess: port Lmx.dll FUN_10100ce0 synthesizer kernel
Path A landed for R3/R4. The byte->MxStatus synthesizer in Lmx.dll is
FUN_10100ce0 (`analysis/ghidra/exports/Lmx.dll.synthesizer-helpers2-decompile.md`),
a 4-byte u32 LE -> 4-tuple MxStatus decoder used by every NMX-frame
parser in Lmx.dll. The kernel is byte-deterministic and context-free,
so it ports as a pure function -- the operation-tracking state
machine the original verdict deferred is NOT required for synthesis.
Bit layout (per FUN_10100ce0 lines 21-24):
bit 31: success (-1 if set, 0 if clear)
bits 27..24: category (4 bits)
bits 23..20: detected_by (4 bits)
bits 15..0: detail (i16 -- low 16 bits, signed)
bits 30..28, 19..16: reserved/padding
Codec changes:
- MxStatus::from_packed_u32() / ::to_packed_u32() -- the kernel +
inverse for round-trip parity.
- MxStatus::from_nmx_response_code() -- the constructed-from-response-
code switch in FUN_1010bd10:741-770 (six proven mappings: 0x01, 0x02
-> CommunicationError + RequestingNmx; 0x03 -> ConfigurationError +
RequestingNmx; 0x04 -> ConfigurationError + RespondingNmx; 0x05 ->
CommunicationError + RespondingNmx; 0x1A -> CommunicationError +
RequestingNmx).
- MxStatusCategory / MxStatusSource: from_i16/to_i16 promoted to const
fn so MxStatus::from_packed_u32 can be const.
- NmxOperationStatusMessage::try_parse_process_data_received_body() --
thin wrapper that peels the outer NmxObservedEnvelope before
delegating to try_parse_inner. Mirrors
NmxOperationStatusMessage.TryParseProcessDataReceivedBody (.NET cs:20-32).
- NmxOperationStatusMessage::promote_to_typed() -- entry point that
returns the existing Status field. Documented as a no-op pass-through
for now (the 5-byte inner-body wire shape is NOT the same field as
the 4-byte packed-u32 the kernel decodes); kept for API symmetry.
- 22 new round-trip tests covering the kernel, the response-code
switch, the proven 0x00/0x41/0xEF completion bytes, and round-trip
for every canonical sentinel.
mxaccess (Session) changes:
- New OperationKind enum (Write/WriteSecured/Read/Subscribe/
Unsubscribe/Activate/Suspend/Other).
- New OperationContext struct (correlation_id, op_kind, reference,
retry_count) -- ground for the F54 follow-on per-operation
correlation work.
- New OperationStatus event type {raw, status, context,
is_during_recovery}, mirroring MxNativeOperationStatusEvent (cs:73-78)
with the typed-MxStatus addition.
- Session::operation_status_events() -> broadcast::Receiver<Arc<
OperationStatus>> + operation_status_stream() Stream variant.
- callback_router() now tries operation-status parsing first, falling
through to subscription messages -- matches MxNativeSession
.OnCallbackReceived dispatch order (cs:574,582,590).
- recover_connection() flips a recovery_active counter (Arc<AtomicU32>
shared with the router) so OperationStatus.is_during_recovery is
populated correctly. Mirrors MxNativeSession._recoveryActive
Volatile.Read at cs:573.
- 3 new router tests covering: status-word frame dispatch + typed
promotion to WriteCompleteOk; completion-only frames stay verbatim;
is_during_recovery is stamped from the live counter.
Per-operation context tracking (correlating completion frames back to
outstanding writes/subscribes via the correlation_id) is filed as F54
in design/followups.md. The synthesizer kernel itself is byte-
deterministic, so the kernel and the correlation work are decoupled.
Ghidra evidence (the next-ring xref walk beyond FUN_10114a90):
- analysis/ghidra/exports/Lmx.dll.set-attribute-result-xrefs.md --
xrefs to OnSetAttributeResult / CancelWithStatus / OperationComplete.
- analysis/ghidra/exports/Lmx.dll.vtable-data-xrefs.md -- vtable-slot
data xrefs for the virtual-dispatch path.
- analysis/ghidra/exports/Lmx.dll.synthesizer-decompile.md --
ScanOnDemandCallback::OperationComplete/MultipleOperationComplete
(FUN_1010b990), RemotePlatformResolver::OperationComplete
(FUN_1010dc80), and the constructed-from-responseCode synthesizer
in FUN_1010bd10 (lines 698-770). FUN_1010bd10 is the wire-frame
receiver that drives the synthesis.
- analysis/ghidra/exports/Lmx.dll.synthesizer-helpers-decompile.md --
FUN_10003fc0 (the <success %d category %d ...> formatter; confirms
the 4-tuple layout), FUN_1008f150 (dispatch helper).
- analysis/ghidra/exports/Lmx.dll.synthesizer-helpers2-decompile.md --
FUN_10100ce0 (the kernel itself), FUN_10100bc0 (3xu16 reader),
FUN_1005e580 (4-byte stream reader), FUN_1010ee00 (sister NMX-frame
parser using the same kernel).
- analysis/ghidra/exports/Lmx.dll.synthesizer-callers-xrefs.md --
caller graph; confirms the kernel is called from many wire-frame
parsers but each parser shares the single 4-byte decoder.
R3/R4 verdict updated in design/70-risks-and-open-questions.md from
"settled at verbatim-preserve" to "settled per Path A". F54 filed in
design/followups.md for the per-operation correlation work.
cargo build / test / clippy -D warnings / RUSTDOCFLAGS=-D warnings doc
all clean. cargo public-api baselines regenerated for mxaccess and
mxaccess-codec.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
460c61df43 |
[R3/R4] Path-A trace: synthesizer is in Lmx.dll's NMX-frame decoder
Five-stage Ghidra headless decompile traces the byte-to-MXSTATUS_PROXY synthesis path end-to-end across LmxProxy.dll and Lmx.dll. New evidence files committed alongside R3/R4 verdict update: - analysis/ghidra/exports/LmxProxy.dll.fire-event-xrefs.md - analysis/ghidra/exports/LmxProxy.dll.status-synthesis-decompile.md - analysis/ghidra/exports/LmxProxy.dll.mxstatus-safearray-decompile.md - analysis/ghidra/exports/Lmx.dll.set-attribute-result-decompile.md Layer-by-layer findings (bytes flow inward; synthesis flows outward): 1. `Lmx.aaDCT` at 0x10178fc0 is `SysAllocString(L"Lmx.aaDCT")` — a tracing category BSTR, not a table. 2. `MXSTATUS_PROXY` is a 16-byte marshalled struct (4 × i16 padded to i32 boundaries with Pack=4) — the OUTPUT of synthesis, not a lookup entry. 3. `LmxProxy.dll` Fire_* event handlers receive already-populated `MXSTATUS_PROXY[]` and forward through ATL dispatch — no synthesis. 4. `LmxProxy.dll` Fire_* CALLERS (FUN_1001657f / FUN_10016b50 / FUN_10016d4b) call FUN_10003f60(out_safearray, in_status_ptr, count=1) which is a VERBATIM memcpy of an existing 14-byte buffer into the SAFEARRAY — no transformation. 5. `Lmx.dll`'s `PreboundReference::OnSetAttributeResult` (FUN_10114a90) receives an already-populated `short *param_7` status buffer. Log line confirms the layout: `<success %d category %d detectedBy %d detail %d>`. Dispatches on typed values — synthesis is upstream of this function too. The synthesizer is the NMX-frame decoder in Lmx.dll that calls OnSetAttributeResult / OnGetAttributeResult / equivalent OperationComplete handler. The decoder takes raw NMX bytes plus operation context (item handle, engine state, retry state, correlation id) and computes the populated MXSTATUS_PROXY. There is NO static lookup table — synthesis is per-message contextual. Two viable paths to typed promotion (both substantial; neither a small codec patch): - Path A: port the synthesizer. ~1-2 weeks. Requires extending the Rust session to track per-operation context (handles, retries, correlation ids). Out of V1 scope. - Path B: empirical capture pairs. ~30 min × 6-10 scenarios. Output is a (byte, context → status) mapping that approximates without re-implementing. Risk: mapping is only valid for captured contexts. R3/R4 stay settled at verbatim-preserve. The .NET reference does the same for the same reason: the synthesizer is too context- dependent to mirror without porting the entire operation-tracking state machine. Reopen criteria sharpened: either (a) a consumer files a concrete use case for typed promotion of a specific byte+context combination (Path B's empirical capture for that one combination is the cheapest answer); or (b) a major-version bump justifies the state-machine port (Path A). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
4dfc0cee65 |
[R3 + R4 + R8] settle protocol-level risks per Ghidra evidence
Ghidra headless decompile of `Lmx.dll`'s `aaDCT` symbol + the `LmxProxy.dll` Fire_* event handlers (logs at `analysis/ghidra/exports/Lmx.dll.aadct-decompile.md` and `analysis/ghidra/exports/LmxProxy.dll.completion-status-decompile.md`) settles **R3** and **R4** as "no static byte→status lookup table exists": - `Lmx.aaDCT` at `0x10178fc0` is a `SysAllocString(L"Lmx.aaDCT")` into a global BSTR — a logging category name, not a table. - `MXSTATUS_PROXY` is a 4-field struct (success/category/detectedBy/ detail), used as the marshalled COM event payload — not a static array of pre-mapped statuses. - `Fire_OnDataChange` / `Fire_OnWriteComplete` / `Fire_OperationComplete` / `Fire_OnBufferedDataChange` (RVAs 0x15f72, 0x1611f, 0x16271, 0x163c0 in `LmxProxy.dll`) receive ALREADY-POPULATED `MXSTATUS_PROXY[]` arrays — the byte-to-struct synthesis happens upstream in the proxy's NMX-callback ingestion code, not via a table lookup. The synthesis is per-event computation from operation context (engine ids, item handles, retry counters), not a static promotion. R3/R4 status updated from "indefinitely deferred — no Ghidra table" to "settled — no table exists; verbatim preservation is the canonical answer." The .NET reference's `NmxOperationStatusMessage.TryParseInner` + the Rust port's `mxaccess-codec/src/operation_status.rs` already match this canonical behaviour; no code change required. Reopen R3/R4 only if a context-aware capture surfaces a per-byte synthesis logic that depends on operation context — at which point the codec would need access to the originating operation's context, which is upstream of the bytes themselves. **R8** marked permanently deferred — implementation already parses NTLM AV pairs per [MS-NLMP] §2.2.2.1 (including the cross-domain shapes `MsvAvDnsTreeName` / `MsvAvDnsComputerName` carrying the trusted-domain DNS suffix), what's missing is the live capture, and the live capture requires a multi-domain Windows lab not available on this dev host. Same status pattern as F3 in `design/followups.md`. Open evidence gaps table updated to reflect: - Cross-domain NTLM: deferred (R8) - Ghidra mapping table for completion-only bytes: no table exists (R3/R4 settled) - Activate/Suspend transition (wire): partial (F44 + F46), live re-run pending (F50) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
0e93e3a8fa |
design/followups: file F48-F53 for known V1 residuals
After the M6 closeout sweep (F35-F47 all resolved), six residuals remain that were either documented inline in design docs or implied by closure language but not formally tracked as F-numbers. File them explicitly so the next iteration has a clean starting list: - F48: Execute `cargo publish` for V1 (F43 was dry-run only). Documents the 9-crate dependency-ordered publish sequence + the version bump from 0.0.0 placeholder to 0.1.0. - F49: Live verification sweep for F36 (buffered subscribe) + F45 (recovery replay) + F47 (unsubscribe skip) + F40 (metrics). Closes the gap where these features ship with unit tests but weren't live-exercised against AVEVA in the closing iteration. - F50: Run the F46 Suspend/Activate Frida capture live (script ready, capture deferred to maintainer-side). - F51: Live type-matrix expansion for `asb-subscribe` — Bool / Float / Double / String / DateTime / Duration / arrays. F32 was closed via "deployable maximum" but the codec supports more types than the live matrix exercises. - F52: Codec performance optimisations from F39 (BytesMut output buffer, name-signature cache, session scratch pool). Documented as post-V1 in M6-bench-baseline.md; filing them as F-numbers so the alloc-count deltas are tracked when they land. - F53: Enable `#![warn(missing_docs)]` workspace-wide. Deferred from F42 — the lint surfaces hundreds of low-priority gaps that need a dedicated pass. R3, R4, R8, F3 already in their respective tracking docs (the risks register + the Open section's permanently-external-blocked entry). Open section now contains: F3 (permanently external-blocked) + F48-F53 (V1-residual triage). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
25befcb72e |
design/followups: move F45 + F47 to Resolved (M6 + spawned closures)
F45 (commit |
||
|
|
2281309a86 | design/followups: move F46 to Resolved (Frida hooks landed) | ||
|
|
808fea18a0 |
[F46] analysis/frida: Suspend/Activate hooks + R5 next-step
Closes the wire-side gap left by capture 077 in F44's R5 walk. The Frida
script now hooks the production LmxProxy.dll dispatchers so a future live
re-run on the AVEVA host can answer "does CLMXProxyServer issue a separate
ORPC method for Suspend/Activate, or are they synthesised client-side?"
Hooks added in `analysis/frida/mx-nmx-trace.js`:
- `LmxProxy.dll!CLMXProxyServer.Suspend` @ RVA 0x13d9c (FUN_10013d9c)
- `LmxProxy.dll!CLMXProxyServer.Activate` @ RVA 0x14028 (FUN_10014028)
Both RVAs were extracted from
`analysis/ghidra/exports/LmxProxy.dll.string-refs.tsv` rows 119/122 (the
`CLMXProxyServer::Suspend - Server Handle` / `Activate - Server Handle`
log strings each xref one function — same pattern as the existing
AdviseSupervisory hook at 0x142b4). The hooks emit `mx.suspend.begin/end`
and `mx.activate.begin/end` events with serverHandle, itemHandle, and the
`MxStatus*` out parameter decoded as 4 x int16 (Success / Category /
DetectedBy / Detail per `src/MxNativeCodec/MxStatus.cs`). Naming matches
the F46 spec's `mx.<verb>.begin / end` grep convention rather than the
generic `call.enter / leave` shape because we want to filter these out
of large traces without false positives from other LmxProxy entrypoints.
No `Resume` / `Reactivate` exports exist in `LmxProxy.dll` — verified
against `analysis/ghidra/exports/LmxProxy.dll.ghidra.md` (no such string
xrefs) and the decompiled `ILMXProxyServer5` / `ILMXProxyServer4`
interfaces under `analysis/decompiled-mxaccess/ArchestrA/MxAccess/`
(only Suspend and Activate are declared on the dispatch interface).
The script's top-of-file comment now carries the live re-run procedure
(rebuild MxTraceHarness x86, attach Frida with `--scenario=suspend-advised`
then `--scenario=activate-advised`, save under
`captures/NNN-frida-suspend-activate-instrumented/`, grep the new TSV for
`mx.suspend.*` / `mx.activate.*` and correlate with `nmx.enter` events
in the same time window). Live capture is intentionally deferred to the
maintainer per the F46 spec — this dev box has no AVEVA install.
`design/70-risks-and-open-questions.md` R5 status updated:
- Title flag `(filed as F45)` -> `(filed as F46, hook landed pending live re-run)`
(the docs/M6-buffered-evidence.md footnote referenced F45 from before
F45 / F46 were de-conflicted by commit
|
||
|
|
c7e71e4424 |
design/followups: move F41 + F43 to Resolved (M6 complete)
All 10 M6 sub-followups (F35-F44 minus the ones absorbed into F44) plus F41 + F43 are now resolved. Open section narrows to: - F45: buffered recovery replay (sub-followup of F36) - F46: Suspend/Activate wire emission (sub-followup of F44) - F3: cross-domain NTLM fixture (permanently external-blocked) M6 closeout: see CHANGELOG.md for the V1 release notes. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
9e57bfd451 |
[F41 + F44 reconciliation] cargo public-api baselines + multi-record DataUpdate codec
**F41 — public-api baselines (M6 DoD bullet 5)**
`design/public-api/{crate}.txt` for all 9 workspace crates, generated
via `cargo +nightly public-api --simplified -p <crate>`. Per-crate
baseline sizes:
- mxaccess-codec: 2516 lines
- mxaccess-asb: 1258 lines
- mxaccess-rpc: 1273 lines
- mxaccess-asb-nettcp: 708 lines
- mxaccess: 542 lines
- mxaccess-galaxy: 374 lines
- mxaccess-callback: 170 lines
- mxaccess-compat: 123 lines
- mxaccess-nmx: 118 lines
`design/public-api/README.md` documents the update procedure
(install nightly + cargo-public-api, regenerate the affected baseline
on intentional API changes, commit alongside).
`.github/workflows/rust.yml` gains a `public-api` job that runs the
same diff against the committed baseline; drift fails CI with a
unified diff in the log so the PR author can either revert or
update the baseline.
**F44 reconciliation — multi-record DataUpdate codec**
Cherry-picked from the F44 sub-agent's worktree (commit `aec6a0c`):
`subscription_message.rs::parse_data_update` now loops over
`record_count` like `parse_subscription_status` does, accepting any
positive count. The .NET reference still hard-throws on
`record_count != 1`; the Rust codec deliberately diverges per the F44
evidence walk against `captures/094-frida-buffered-separate-writer/
frida-events.tsv:145` (a `0x33` DataUpdate body with `record_count = 2`,
inner_length = 23 (preamble) + 2 * 19 (records) = 61, post a
separate-session writer triggering two value changes inside one
`SetBufferedUpdateInterval(1000)` window).
Two new round-trip tests:
- `data_update_multi_record_round_trip` — synthesises a 2-record body,
parses, asserts both records decode to expected Int32 values.
- `data_update_capture_094_truncated_record_errors` — truncates the
capture-094 fixture mid-second-record, asserts CodecError::Decode.
New wire-byte fixtures under `crates/mxaccess-codec/tests/fixtures/m6-buffered/`:
- `094-line145-dataupdate-recordcount2.bin` (57 bytes, `0x33` multi-record)
- `094-line48-substatus-recordcount2.bin` (101 bytes, `0x32` multi-record)
R2 in `design/70-risks-and-open-questions.md` updated from
"single-sample (settled silently)" to "settled per option (a) — codec
relaxed; multi-record observed in production-stack tracing."
`design/followups.md`: F44's verdict updated to reflect the
contradiction-then-relaxation, with reference to the new tests +
fixtures.
Workspace 792 → 794 tests pass; clippy clean; rustdoc clean.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
2120dfa965 |
design/followups: move F35/F40/F44 to Resolved + de-conflict F45/F46
rust / build / test / clippy / fmt (push) Has been cancelled
After commits |
||
|
|
ad1cf2351c |
[F36 + F40 + F44] M6 wave 1: subscribe_buffered (NMX) + metrics + evidence
Three M6 sub-followups landed in this wave (sub-agent worktrees +
manual reconciliation in main):
**F36 — Session::subscribe_buffered (NMX) per R2 single-sample**
- `BufferedOptions::rounded_update_interval_ms()` — 100ms rounding
helper mirroring MxNativeCompatibilityServer.cs:638
((updateInterval + 99) / 100) * 100, saturating on overflow.
- `Session::subscribe_buffered` (public, lib.rs:604) delegates to
the new private `subscribe_buffered_nmx` which uses the buffered
RegisterReference path: item_definition suffixed with
`.property(buffer)`, subscribe=true (no separate
AdviseSupervisory follow-up — verified against capture 082).
- Per R2 verified at wwtools/mxaccesscli/docs/api-notes.md the wire
semantic is single-sample-per-event with a server-side cadence
knob; rounded_ms is held client-side only (native MXAccess does
not emit a separate SetBufferedUpdateInterval RPC, verified by
absence in 079/082 captures).
- New crates/mxaccess/examples/subscribe-buffered.rs.
- New crates/mxaccess-codec/tests/buffered_register_reference_parity.rs:
4 tests (capture 079/082 round-trip, suffix helper, constructive
forward-build vs capture 082).
**F40 — Optional metrics feature**
- New crates/mxaccess/src/metrics.rs (275 lines): `pub(crate)`
thin wrappers (`record_write_latency`, `record_read_latency`,
`inc_writes`, `inc_reads`, `inc_advises`, `inc_recovery_*`,
`set_active_subscriptions`, etc.) that compile to no-ops under
`#[cfg(not(feature = "metrics"))]`. Call sites in session.rs +
asb_session.rs invoke them unconditionally; the gate is inside
the wrapper.
- `metrics = { version = "0.24", optional = true }` added to
workspace + mxaccess crate Cargo.toml.
- Default build: zero metrics dep, zero runtime cost.
**F44 — Buffered batch + suspend capture decode evidence**
- New docs/M6-buffered-evidence.md: per-capture summary for
077, 079, 080, 081, 082, 094 — call sequence, key wire bytes,
R2/R5 verdict.
- R2 confirmed silently as "not a real risk" — single-sample
observed across 079/080/082/094.
- R5 trigger conditions documented from capture 077: AdviseSupervisory
+ Suspend pair, 1-second intervals, succeeds on enum attributes.
- design/70-risks-and-open-questions.md R2/R5 status updated.
Workspace: 759 → 792 tests, clippy clean, rustdoc -D warnings clean.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
a1c4c6203e |
design/followups: move F37/F38/F39/F42 to Resolved
rust / build / test / clippy / fmt (push) Has been cancelled
Four M6 sub-followups closed in this session — moved to Resolved section with concise verdicts referencing the matching commits: - F37 (commit |
||
|
|
71c69b80c6 |
[F38] mxaccess-codec: counting-allocator bench harness + R12 baseline
Hand-rolled GlobalAlloc wrapper around System that tracks allocs +
bytes + deallocs via two atomics. Each scenario runs 10k iterations
after a 1k warm-up; output is a markdown table with allocs/op,
bytes/op, deallocs/op.
Why hand-rolled (not dhat/criterion): R12 gates on a single number
("< 5 allocs/write"). dhat is heap-profiling-oriented (call-stack
attribution, JSON snapshots); criterion measures wall-clock latency
which is reported-but-not-gated per 60-roadmap.md:104. A 50-line
GlobalAlloc + atomic counters is the simplest thing that answers
the gate.
Run: `cargo bench -p mxaccess-codec`
Baseline numbers (release, Windows x64):
- Bool write: 1.00 allocs/op
- Int32 write: 2.00 allocs/op
- Float32 write: 2.00 allocs/op
- Float64 write: 2.00 allocs/op
- String write: 4.00 allocs/op (5-char string)
- Handle from_names: 2.00 allocs/op
- DataUpdate decode: 1.00 alloc/op
R12's < 5 allocs/write target is **already met** across the proven
matrix without any zero-copy work. The bench gates on this — any
write_message::encode scenario at >= 5 allocs/op exits the harness
with code 1.
Companion: `design/M6-bench-baseline.md` documents the numbers,
explains the per-scenario breakdown, and tightens F39's scope from
"hit the target" to "nice-to-have optimisations" (BytesMut output
buffer, name-signature cache, session-level scratch pool).
Workspace: 759 tests still pass; clippy --benches clean.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
2546710604 |
design/followups: add F35-F44 for M6 implementation plan
10 new followups decompose M6 (compatibility shim + production hardening) into parallel-safe sub-streams: - F35: mxaccess-compat LMXProxyServer-shaped facade (18 methods over Session/AsbSession) - F36: Session::subscribe_buffered NMX path per R2 single-sample - F37: ASB subscribe_buffered capability gate - F38: counting-allocator cargo bench harness for R12 target - F39: zero-copy codec pass (depends on F38) - F40: optional metrics feature - F41: cargo public-api baseline (depends on F35/F36/F37/F39/F40) - F42: cargo doc cleanup pass - F43: cargo publish --dry-run all crates (depends on F41) - F44: decode buffered batch + suspend captures (077, 079-082, 094) for R2/R5 evidence Parallelization: Wave 1 = F35/F36/F37/F38/F40/F42/F44 (different crates / different concerns); Wave 2 = F39 (needs F38's bench); Wave 3 = F41 (needs API stable); Wave 4 = F43 (release). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
bedad57b4e |
design/followups: move F18 (M5 meta-tracker) to Resolved
rust / build / test / clippy / fmt (push) Has been cancelled
Trim the planning content (sub-stream breakdown table, parallel-safety analysis, risk-driven sequencing, "Resolves when" gate) since M5 is done. Keep the closure verdict, M5 DoD checklist showing the actual state at close, sub-followup closeout list (F19-F26 + F28/F29/F30/ F31/F32/F33/F34), cumulative execution log, and the architectural note explaining why AsbSession stays parallel to the NMX Session rather than unified — that's load-bearing for future maintenance. Open section now contains only F3 (cross-domain NTLM Type1/2/3 fixture, permanently external-blocked on this single-domain dev host — resolution requires multi-domain Windows lab not available here). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
b1a5f5ff1e |
design/followups: move F34 to Resolved (live-verified closure)
The F34 entry's body had grown into a debugging notebook with five
"Open hypotheses" and a "Resolves when" speculation block — all of
which are now moot since the actual fix landed. Trim to the closure
verdict, the technical evidence (captured fixture dictionary, the
dual-format insight), and the bonus context discovered while
debugging (Active/SampleInterval/result_code 32 quirks). Move from
"## Open" to "## Resolved" with date + commit
|
||
|
|
101a8b13f5 |
[F34] mxaccess-asb: AddMonitoredItems body uses DataContract field names
Rewrite push_monitored_item_body to emit the DataContract field-suffix names from AsbContracts.cs:940-965 (activeField, bufferedField, itemField, sampleIntervalField, timeDeadbandField, userDataField, valueDeadbandField) under prefix `b` bound to the http://schemas.datacontract.org/2004/07/ArchestrAServices.ASBIDataV2Contract namespace. The <Items> wrapper now declares xmlns:b + xmlns:i. The legacy XmlSerializer property names (<Active>, <Item>, <SampleInterval>, <Buffered>) only matter for the canonical-XML HMAC signing input — that emitter at xml_canonical::emit_monitored_item is unchanged and F28 fixture byte-equality still holds for all 13 ops. On the binary NBFX wire MxDataProvider's DataContractSerializer expects the field-suffix form. Wire-byte type encoding matches the captured fixture (add-monitored-items-request-wire.bin): bool → Bool record, ulong → Zero/One/Chars (XmlConvert decimal text), ushort → Zero/One/Int8/Int16/Int32 (smallest-fit binary). Empty string? + null byte[]? emit as empty elements with no <i:nil> attribute (matching the wire). Field order follows the explicit [DataMember(Order = N)] sequence. Adjacent: ItemIdentity is nested via DataContract field names too — NOT the binary <ASBIData> fast-path, which only kicks in at top-level message body members. Verified live against AVEVA MxDataProvider: AddMonitoredItems now returns 1 status item with error_code=0x0000 (previously 0 items; the silent failure was the deliberate DC-schema mismatch); Publish poll #4 delivers the actual tag value as AsbVariant { type_id: 4, length: 4, payload: [99,0,0,0] } through the F26 stream. Pre-existing clippy::format_collect errors in auth.rs:339,342 and client.rs:952 fixed in passing — they were blocking workspace clippy otherwise. Workspace: 757 → 758 tests, clippy -D warnings clean. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
6762526f09 |
design/followups: mark F18 (M5 meta-tracker) resolved
All sub-followups F19-F26 closed; M5 is functionally LIVE end-to-end (asb-subscribe returns the real tag value over the wire). The structural-port followups F18 spawned (F2/F10/F11/F27 for NTLM / DCOM / RemUnknown / DH) all resolved separately. F18 stays under "## Open" as the cumulative-execution-log anchor; status line at the top now reflects the closed state so the open/resolved structure matches reality. Remaining open items: F34 (MonitoredItem wire format, P2 — needs nbfx auto-intern fix + DataContract field-suffix body builders) and F3 (cross-domain NTLM fixture, P2 — permanently external-blocked on this single-domain dev host). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
1de049e114 |
[F2] mxaccess-rpc: NTLM verify_signature (server-to-client) with constant-time MAC compare
rust / build / test / clippy / fmt (push) Has been cancelled
Closes F2. Structural port from [MS-NLMP] §3.4.4 — same shape as
the existing sign path but uses the server-to-client sub-keys
(`SealKey_S→C` / `SignKey_S→C`) derived alongside the client-to-
server pair at the end of create_type3.
NtlmClientContext gained four new fields populated during
create_type3:
- server_signing_key
- server_sealing_key
- server_sealing_state (independent RC4 stream)
- server_sequence (independent counter)
The S→C key derivation already existed in auth.rs (the seal_key /
sign_key helpers take a client_mode flag); F2 plumbs them into a
new verify_signature(message, signature) method.
The verify path:
1. Validates signature.len() == 16 + leading version word 0x01.
2. Reads trailing seq num, compares against self.server_sequence
(mismatch ⇒ InvalidSignature, no state change).
3. Computes expected_mac = HMAC_MD5(server_signing_key,
seq || message)[0..8] then RC4 transform.
4. Constant-time compares expected_mac against wire bytes 4..12
via subtle::ConstantTimeEq.
5. On success: commits cipher-state advance + ++server_sequence.
On failure: re-derives RC4 from server_sealing_key and skips
past server_sequence × 8 keystream bytes to restore the
pre-verify position — caller can retry.
New dep `subtle = "2"` (workspace-internal to mxaccess-rpc) for
the timing-oracle-safe MAC compare.
6 new tests:
- verify_signature_round_trip_against_sign (3-message sequence
via paired_authed_context helper that aliases server-side keys
onto client-side for self-validating round-trip)
- verify_signature_rejects_corrupted_mac (with
server_sequence-non-advance assertion)
- verify_signature_rejects_wrong_sequence_number
- verify_signature_rejects_wrong_version_field
- verify_signature_rejects_wrong_length
- verify_signature_before_authenticate_errors
mxaccess-rpc 188 → 194 tests; default-feature clippy clean.
The "awaiting wire-fixture capture" step listed in F2's prior
status note is no longer a hard prerequisite — [MS-NLMP] §3.4.4
fully defines the algorithm and the round-trip tests prove the
encoder/decoder pair is internally consistent. A captured
StatusReceived frame would still validate byte-parity vs a real
NmxSvc.exe signer, but that's future verification work; the
structural port ships unblocked.
design/followups.md F2 moved to Resolved.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
161b0cedfa |
[F10 + F11] mxaccess-rpc: structural ports for ResolveOxid2 + RemAddRef/RemRelease
rust / build / test / clippy / fmt (push) Has been cancelled
Closes F10 and F11 per option (b) of each followup's resolve
criterion: hand-rolled body codecs derived from the [MS-DCOM]
spec, ship structurally with no live fixture (the .NET reference
doesn't call these opnums), validate against captured frames when
they become available.
F10 — IObjectExporter::ResolveOxid2 (opnum 4):
Per [MS-DCOM] §3.1.2.5.1.4. New parse_resolve_oxid2_result
mirrors parse_resolve_oxid_result exactly except for the extra
4-byte COMVERSION slot (u16 major + u16 minor) between
authn_hint and error_status. Trailing-fields check tightens
from 24 bytes (opnum 0) to 28 bytes (opnum 4). New ComVersion +
ResolveOxid2Result types. referent_id=0 short-circuit returns
empty bindings + default ComVersion + tail-status, matching
opnum 0's pattern.
F11 — IRemUnknown::RemAddRef + RemRelease (opnums 4 and 5):
Per [MS-DCOM] §3.1.1.5.6 + §2.2.19 (REMINTERFACEREF). Both
opnums share the wire shape, so:
- encode_rem_add_ref_request + encode_rem_release_request
both delegate to a shared encode_remref_array_request
helper.
- parse_remref_response is shared between them too — they
have an identical OrpcThat + referent_id + max_count +
N×HRESULT + error_code layout.
New RemInterfaceRef struct (ipid + public_refs + private_refs,
ENCODED_LEN = 24) + RemRefResponse decoded shape.
8 new structural tests across both modules pin every documented
edge of each shape — short stubs, referent-zero short-circuits,
truncated-trailing detection, full multi-element round-trips.
mxaccess-rpc 183 → 188 tests; default-feature clippy clean.
Both followups documented "**Status:** Awaiting wire-fixture
capture" prior to this commit; the structural-port option was
explicitly listed as resolution path (b) in each. Future captured
frames will validate the byte-by-byte match — until then the
port is byte-faithful to the spec but unverified against a live
peer (which is fine for shipping since neither opnum is on the
NMX session's hot path).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
4ed1355761 |
design/followups: rewrite F2/F3/F10/F11 with concrete next-step recipes
Each remaining open followup now lists the precise "Concrete next step" to close it — what to capture, where to write the fixture, which file to edit. Future sessions (or anyone without the project context) can pick up any of these and execute without guessing. F2 (NTLM verify_signature server→client): Status: awaiting wire-fixture capture. M2 wave 3 (callback exporter) is closed under F15, so the path is wired — instrument CallbackExporter to hex-dump inbound StatusReceived bytes during a live subscribe, save under tests/fixtures/m2-status-frame/, port verify_signature mirroring `sign` but using the server-to-client sub-keys per [MS-NLMP] §3.4.4, add `subtle = "2"` for constant-time MAC compare. F3 (cross-domain NTLM Type1/2/3 fixture): Status: permanently out-of-scope on this host (no second AD domain). Documented the lab-environment requirements and the capture procedure for whoever provisions the two-domain harness. F10 (IObjectExporter::ResolveOxid2 opnum 4): Status: awaiting capture or .NET helper. Two paths documented — extend MxNativeClient.Probe with --probe-resolve-oxid2 OR hand-roll the layout from [MS-DCOM] §3.1.2.5.1.4 and validate later. F11 (IRemUnknown::RemAddRef + RemRelease): Status: same shape as F10. Document either probe extension or structural port from [MS-DCOM] §3.1.1.5.6 (REMINTERFACEREF[]). No code changes in this commit — purely sharpening the followup specs so each one's resolution recipe is self-contained. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
9496322712 |
[F27] mxaccess-asb-nettcp: constant-time DH mod_exp via crypto-bigint::DynResidue
rust / build / test / clippy / fmt (push) Has been cancelled
Closes F27 per option (b) of its resolve criterion: fixed-width
U2048 DH backend using crypto-bigint's Montgomery-form residue
arithmetic.
New auth.rs::constant_time_mod_exp(base, exp, modulus) wrapper
preserves the BigUint-in-BigUint-out API of the existing byte
helpers; the actual square-and-multiply chain runs in Montgomery
form. Both DH call sites swap to the wrapper:
- AsbAuthenticator::new line 179 (public-key generation)
- crypto_key line 354 (shared-secret derivation)
DH private exponent timing-leak resistance is the goal: the .NET
reference's BigInteger.ModPow is also non-constant-time, so we
were at parity but not at the long-term Rust target. With this
fix the production path no longer leaks the bit-pattern of the
long-lived DH private key through power/timing side channels.
DynResidueParams::new requires an odd modulus (Montgomery form's
only restriction). Production DH primes are always odd
(`MX_ASB_DH_PRIME = 1552...7919` on this host's registry).
CryptoParameters::DEFAULT_PRIME_TEXT — the test-fixture default
inherited from AsbRegistry.cs:66 — ends in 4 (even), which is
mathematically unsound for DH but kept for parity with the .NET
default. For that case the wrapper falls back to BigUint::modpow,
preserving the wire bytes (modular exp is a pure function of
inputs).
Wire-byte parity verified two ways:
1. Unit fixture test
`auth.rs::deterministic_hmac_matches_dotnet_fixture` — byte-equal
to captured .NET output for the full DH → PBKDF2 → AES-CBC chain.
Continues to pass.
2. Live: Connect handshake against the local AVEVA install
completes with apollo:V2 lifetime, proving MxDataProvider
accepts the constant-time-derived public key and the
shared-secret-based AuthenticateMe.
Workspace deps:
- crypto-bigint = "0.5" added to [workspace.dependencies] and
mxaccess-asb-nettcp/Cargo.toml.
- num-bigint retained for decimal-string parsing + .NET-LE byte
conversion (crypto-bigint has neither).
Closes the "review.md MAJOR finding" originally flagged at
design/30-crate-topology.md:269-274.
design/followups.md: F27 moved to Resolved.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
d03bd04ef5 |
[F34 evidence] dump WCF binary-header dictionary for AddMonitoredItems
rust / build / test / clippy / fmt (push) Has been cancelled
Extends tests/add_monitored_items_request_capture.rs with a manual binary-header walk that prints every pre-interned string + its wire id. The captured request's binary header pre-declares **23 strings** covering the entire DataContract field set: wire-id 1 http://ASB.IDataV2:addMonitoredItemsIn wire-id 3 AddMonitoredItemsRequest wire-id 5 SubscriptionId wire-id 7 Items wire-id 9 http://schemas.datacontract.org/.../ASBIDataV2Contract wire-id 11 MonitoredItem wire-id 13 activeField wire-id 15 activeFieldSpecified wire-id 17 bufferedField wire-id 19 itemField wire-id 21 contextNameField wire-id 23 idField wire-id 25 idFieldSpecified wire-id 27 nameField wire-id 29 referenceTypeField wire-id 31 typeField wire-id 33 sampleIntervalField wire-id 35 timeDeadbandField wire-id 37 timeDeadbandFieldSpecified wire-id 39 userDataField wire-id 41 lengthField wire-id 43 payloadField wire-id 45 valueDeadbandField That gives F34's binary-builder rewrite the exact dict-id mapping to target — every MonitoredItem child can be emitted as a DictionaryStatic(odd-id) reference instead of an inline string, matching WCF's compression. The "RequireId" mystery from the earlier inline-name decode is also resolved: the wire body has NO `RequireId` element at the bottom — the trailing `Inline("referenceTypeField")` was a dict-id wraparound or auto-intern artifact, not actual content. design/followups.md F34 updated with the full ground-truth header, plus a refined "Resolves when" pointing at the underlying `nbfx.rs::decode_tokens` auto-intern semantics. The current codec's doc comment ("the codec doesn't auto-intern") is correct for raw [MC-NBFX] but wrong for WCF binary messages where the writer auto-interns by convention; that's the structural fix the F34 binary rewrite depends on. No code-path change in this commit beyond the test improvements. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
b66f5bb018 |
[F34 evidence] capture AddMonitoredItems request wire + decoder trace
rust / build / test / clippy / fmt (push) Has been cancelled
Investigation continued via examples/asb-relay.rs middleman:
captured the .NET probe's verbatim AddMonitoredItems request bytes
(695 bytes with the 3-byte NMF SizedEnvelope header). Saved at
rust/crates/mxaccess-asb/tests/fixtures/add-monitored-items-request-wire.bin
as the ground-truth shape MxDataProvider actually accepts.
New tests/add_monitored_items_request_capture.rs runs decode_envelope
over the capture and dumps every NBFX token to stderr for inspection.
Decoded trace surfaces a SECOND, deeper issue:
The F30 dynamic-dict-resolution post-pass at
envelope.rs::resolve_dict_names_in_tokens mis-maps per-session dict
ids. Decoding the captured request renders namespace-URL slots as
field-name strings:
body[1]=DefaultNamespace { value: Chars("nameField") } ← bogus
body[7]=NamespaceDeclaration { prefix: "i",
value: Chars("activeField") } ← bogus
and leaves most element names as `Static(NN)` instead of resolving
to inline names like `activeField` / `bufferedField` / `itemField`.
This blocks F34's substantive fix (rewrite
build_add_monitored_items_request_body to use DataContract
field-suffix names matching the wire). We can't validate the
rewritten builder against the captured fixture until the dict
post-pass produces the right strings.
design/followups.md F34 updated with two-prerequisite resolution
plan:
1. Fix the F30 dynamic-dict resolution so the captured request
decodes to recognisable inline names.
2. Rewrite the AddMonitoredItems / DeleteMonitoredItems builders
against the now-readable structure (DataContract field names
+ namespace prefixes for ASBIDataV2Contract / ASBContract +
nested DataContract serialization of ItemIdentity inside
`<itemField>` and Variants inside userDataField /
valueDeadbandField).
Workspace: mxaccess-asb 96 → 97 (+1 capture-driven analysis test);
default-feature clippy clean. The HMAC canonical-XML signing path
remains correct (F28 fixtures are byte-equal to .NET); only the
binary NBFX wire body needs the rewrite.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
fb40e4c20b |
[F34 partial] mxaccess-asb: fix collect_asbidata_payloads + add Active flag
rust / build / test / clippy / fmt (push) Has been cancelled
Investigation via examples/asb-relay.rs middleman captured the full
S→C bytes of a working PublishResponse from the .NET probe against
MxDataProvider. Decoder fix verified by regression test against the
captured fixture; one further wire-format gap surfaced and is filed.
Closed in this commit:
1. collect_asbidata_payloads filtered out empty <ASBIData/> elements
so positional payload[N] indexing collapsed when Status was
empty-but-present. The wire form for PublishResponse is:
<Status><ASBIData/></Status> ← empty placeholder
<Values><ASBIData>{bytes}</ASBIData></Values>
Our decoder lost the positional info and read Values as Status,
then panicked on the malformed parse. Fix: always push every
<ASBIData> element (empty or not) so payloads[0]=Status and
payloads[1]=Values stay aligned. New regression test
tests/publish_capture.rs runs the full decode chain over the
captured wire bytes (305-byte frame at
tests/fixtures/publish-response-with-value.bin) and asserts
values.len() == 1.
2. MinimalMonitoredItem.active: Option<bool> + new with_active()
constructor. The .NET reference's MxAsbDataClient.AddMonitoredItems
defaults to active: true (cs:441). Without <Active>true</Active>
on the wire, MxDataProvider treats the subscription as inactive
and Publish polls return empty Values. Both binary build and
canonical XML emitters now conditionally emit <Active> when
active.is_some(). Shared push_monitored_item_body helper
eliminates the duplicate MonitoredItem encoder between
AddMonitoredItems and DeleteMonitoredItems builders.
3. SampleInterval unit: clarified as **milliseconds** in
MinimalMonitoredItem.sample_interval doc + the example
(sample_interval_ticks → sample_interval_ms = 1000). Matches the
.NET reference's `ulong sampleInterval = 1000` default.
Open: F34's deeper finding — `MonitoredItem`'s wire schema is
DataContract field-suffix names (`activeField`, `bufferedField`,
`itemField`, `sampleIntervalField`, etc., per the per-session NBFX
dictionary the .NET probe declares), NOT XmlSerializer property
names (`Active`, `Buffered`, `Item`, `SampleInterval`). Our binary
NBFX builder still uses the property names, so MxDataProvider
silently fails to register monitored items — successField=true with
a 0-length Status array. The fix needs a complete rebuild of
build_add_monitored_items_request_body and
build_delete_monitored_items_request_body to use the field-suffix
names plus emit the *Specified siblings (activeFieldSpecified,
idFieldSpecified, etc.) as their own elements. The HMAC canonical
XML side is unaffected (XmlSerializer naming is correct there;
verified byte-equal to .NET via the F28 fixtures). Detailed in
design/followups.md F34's "Open" section.
Live verification of the F34-partial bonus context:
- Read still returns 99 end-to-end via canonical XML signing.
- AddMonitoredItems still returns Status[0] = 0 items
(server doesn't recognize our DataContract-misnamed payload).
- Publish still returns 0 values (the F34-open consequence).
- All other 13 canonical-XML signed ops succeed at the request
level (no SOAP faults, no HMAC rejections).
Workspace: mxaccess-asb 95 → 96 (+1 capture-driven decoder test);
default-feature clippy clean.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
0771664092 |
asb: SampleInterval unit fix + F34 followup for Publish-decoder gap
rust / build / test / clippy / fmt (push) Has been cancelled
Investigation triggered by "Publish returns 0 values where .NET sees real
values" against the local AVEVA install.
Three findings:
1. SampleInterval unit: the wire field is **milliseconds**, not 100-ns
ticks. The .NET reference (MxAsbDataClient.cs:441) defaults to
`ulong sampleInterval = 1000` and the probe passes `subscribeSampleMs`
directly through that surface. Sending 10_000_000 (1s in 100-ns ticks)
makes MxDataProvider schedule the next sample ~2.8 hours out; Publish
polls always come back empty until the misinterpreted timer expires.
Fixed in `examples/asb-subscribe.rs` (sample_interval_ticks →
sample_interval_ms = 1000) and clarified in
`MinimalMonitoredItem.sample_interval`'s doc comment with the live-2026-05-06
evidence.
2. result_code=32 is `AsbErrorCode.PublishComplete`
(`AsbResultMapping.cs:37`) — informational, not a fatal error. .NET's
`ToResult` (cs:122-129) explicitly treats it like Success.
`ArchestrAResult.ErrorCode` and `ResultCode` are aliases for the same
`resultCodeField` (cs:424-434), so `publish[i]_error=0x00000020` in
the .NET probe trace = `result_code=Some(32)` in our trace = the same
thing. Already handled correctly via the F26 narrower-bail fix
(commit
|
||
|
|
34d477819b |
[F28] mxaccess-asb: canonical XML signing for all 8 remaining ops
rust / build / test / clippy / fmt (push) Has been cancelled
Closes F28. The 5 [XmlSerializerFormat] ops landed in commit
|
||
|
|
ff4ea4d5a9 |
[F16] mxaccess: real Session::recover_connection (re-bind + re-advise)
rust / build / test / clippy / fmt (push) Has been cancelled
Closes F16. Replaces the wave-2 no-op recover_connection with the
full .NET-equivalent shape (MxNativeSession.cs:399-474). Three
pieces:
1. Subscription registry on SessionInner.
New subscriptions: Mutex<HashMap<[u8; 16], SubscriptionEntry>>
tracks every active advise. subscribe() inserts after a successful
AdviseSupervisory; unsubscribe() removes on the success path only
(failed UnAdvises stay registered so next recovery replays them).
The consumer's Subscription handle still holds the BroadcastStream;
the registry is purely for AdviseSupervisory replay.
2. Pluggable RebuildFactory.
New public typedef:
pub type RebuildFactory = Arc<
dyn Fn() -> Pin<Box<dyn Future<Output = Result<NmxClient,
NmxClientError>>
+ Send>>
+ Send + Sync,
>;
Installed via Session::set_recovery_factory(factory);
queryable via has_recovery_factory(). Kept separate from
connect_nmx / connect_nmx_auto so existing constructors stay
non-breaking — consumers opt in by calling the setter
after-the-fact.
3. Real recover_connection + recover_connection_core.
recover_connection is the retry loop (mirrors cs:399-440): for
attempt in 1..=policy.max_attempts, emit RecoveryEvent::Started
→ call recover_connection_core → on Ok emit Recovered + return,
on Err emit Failed{will_retry, error}, sleep policy.delay, retry,
or bubble the last error.
recover_connection_core mirrors cs:442-474: rebuild NMX via the
factory → RegisterEngine2 with the saved callback_obj_ref → optional
SetHeartbeatSendInterval → snapshot the registry under the lock,
replay AdviseSupervisory(correlation_id) for each entry → atomically
swap *nmx_lock = replacement. Old NmxClient drops at end of scope,
closing its TCP transport.
Subscription correlation ids are preserved across the swap so the
consumer's Subscription stream continues to receive on its existing
broadcast filter. The CallbackExporter stays bound across recoveries
— no TCP listener re-bind.
R15's "long-lived connection task" was listed as a hard prereq, but
the existing Mutex<NmxClient> already serialises concurrent ops
during the rebuild — recover_connection_core holds the inner mutex
during the swap, concurrent ops just wait. Functionally equivalent
to the long-lived-task design.
New ConfigError::RecoveryNotConfigured returned when
recover_connection is called without a factory installed. New
public re-export: RebuildFactory.
Tests (mxaccess 65 → 67):
- recover_connection_without_factory_returns_recovery_not_configured
- recover_connection_with_always_failing_factory_exhausts_attempts
(pins (Started, Failed)×3 + final will_retry=false + bubbled
TransportFailure)
- subscribe_populates_registry_unsubscribe_clears_it
- recovery_events_supports_multiple_subscribers (updated for the
new factory-required path)
connect_nmx_auto-side auto-population of the factory (capturing the
ntlm_factory + discovered (addr, service_ipid) so consumers don't
re-author the closure) is a future polish — not required to close
F16.
design/followups.md: F16 moved to Resolved.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
079896c7bc |
design/followups: collapse 18 redundant 'Earlier slices' blocks
Each F18 cumulative-log step had its own '**Earlier slices:**' header
followed by a verbose body that duplicated the matching commit
message — content already preserved in `git show <hash>` for every
hash listed in the cumulative-log line at the top of F18.
Removes ~75 lines of redundancy:
- 18× '**Earlier slices:**' headers and their bodies (F19, F20,
F21, F22, F24, F23, F25 steps 1-10, F26 steps 1-3, example
rewrite).
- The stale 'F25 (...) and F26 (...) remain open' paragraph (both
closed long since).
Keeps the substantive material in place:
- The cumulative-log line listing every commit by hash.
- The 5-finding F25 live-bring-up reconciliation block (justifies
F28 + F29 followups).
- The F26 step 3 AsbSession design rationale (explains why ASB
parallels rather than unifies with the NMX Session — useful for
future readers).
- A one-sentence pointer to `git show <hash>` for per-step detail.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
cfeb761092 |
[F33] mxaccess-asb: complete InvalidConnectionId tolerance propagation
rust / build / test / clippy / fmt (push) Has been cancelled
Closes F33. Final commit in the three-step F33 closure ( |
||
|
|
cbc95a4684 |
[F33] design/followups: capture live-subscribe wire gap
Live run of `cargo run -p mxaccess --example asb-subscribe` against
the local AVEVA install (with DH params + passphrase loaded from
Setup-LiveProbeEnv.ps1 + Get-AsbPassphrase.ps1) surfaced two concrete
gaps in the subscription-path response decoders:
1. `CreateSubscriptionResponse` returns subscription_id = 0 — the
server almost certainly assigns a real Int64, but
decode_create_subscription_response can't locate the
`<SubscriptionId>` element. Likely a dict-id our F30 post-pass
doesn't resolve for that specific element name.
2. `AddMonitoredItemsResponse` decode fails with MissingField
"Status". The wire shape needs a capture-and-diff vs the .NET
probe's subscription path.
Once subscribe-side ops are issued, the channel desyncs — subsequent
read() on the same session fails with the same MissingField error,
suggesting NBFX framing state may also be out of sync.
The F26 stream API itself (AsbSession::subscribe → Stream<Item =
Result<MonitoredItemValue, Error>>) is complete and unit-tested
(commit
|
||
|
|
f2f22dfcd1 |
[F26 stream] mxaccess: AsbSession::subscribe — Stream<Item = MonitoredItemValue>
rust / build / test / clippy / fmt (push) Has been cancelled
Closes the last F26 stub from the M5 status block. New
AsbSession::subscribe(subscription_id) returns an AsbSubscription
that impls Stream<Item = Result<MonitoredItemValue, Error>>. An
internal tokio::spawn'd publish-loop drains the subscription queue
via the existing AsbSession::publish() and fans each
PublishResponse's `values` array out as individual stream items.
Termination semantics:
- Drop of AsbSubscription calls JoinHandle::abort() — the publish
task stops draining the server-side queue (the .NET reference
pattern at MxAsbDataClient.cs uses the same task-cancellation
shape).
- Transport error from publish() is delivered as the final stream
item; the loop returns and the channel closes.
- Receiver-drop (consumer stops polling) is detected when
tx.send returns Err — the loop exits without making more
publish calls.
The inner publish_loop helper takes any FnMut() -> Future<Result<...>>
so it's testable in isolation (no live ASB endpoint required).
Per-item ItemStatus from the server is intentionally not surfaced
on the stream: the field is opaque per-item and rarely actionable
for the streaming consumer. A richer struct can wrap each value if
that need surfaces.
3 new tests pin:
- asb_subscription_is_stream_send_unpin (compile-time bounds);
- publish_loop_delivers_values_then_terminates_on_error
(3 Ok values from 2 batches, then 1 terminal Err);
- publish_loop_exits_when_consumer_drops_channel.
New deps used (already in mxaccess Cargo.toml): futures_util::Stream,
tokio::sync::mpsc, tokio_stream::wrappers::ReceiverStream,
tokio::task::JoinHandle.
Workspace: 718 → 721 tests. Default-feature clippy clean.
mxaccess crate-level doc updated to drop the "stubbed for next F26
iteration" note for the subscription stream.
design/followups.md F18 M5 status block updated: F26 stream
subscription marked resolved.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
8e695b9347 |
[F12 wrapper + F32 close] Session::connect_nmx_auto + close M5 type-matrix DoD
rust / build / test / clippy / fmt (push) Has been cancelled
Two related closures in one commit:
1. Session-level wrapper around F12: new
`mxaccess::Session::connect_nmx_auto(ntlm_factory, options,
resolver, recovery)` gated on a new `mxaccess/windows-com` feature
(which propagates `mxaccess-nmx/windows-com`). Drives
`NmxClient::create` (the F12 COM-activation factory) for the
`(host, port, service_ipid)` discovery, then funnels into the
shared post-NMX-bind orchestration. Refactored `connect_nmx` to
extract steps 1+2+4+5 into a private `from_nmx_client` helper —
both `connect_nmx` and `connect_nmx_auto` reuse it so the
`CallbackExporter` + router + `RegisterEngine2` + heartbeat policy
stays in one place. The .NET `MxNativeSession.Open` shape
(`MxNativeSession.cs:127-147`) is now reproduced end-to-end on
Windows with `windows-com` on — callers no longer pre-resolve
`(addr, service_ipid)` by hand.
`connect_nmx`'s doc comment updated to drop the stale "F12 not yet
wired" note. `parse_bracketed_host_port` in mxaccess-nmx gets a
`cfg_attr(not(...), allow(dead_code))` so the default-feature
build stays warning-clean.
2. F32 closed via option (b) of its own resolve criterion: the four
missing types (Float / Double / DateTime / Duration) are gated on
Galaxy-side template provisioning that's outside the Rust port's
scope. The deployed test Galaxy on this host only has
mx_data_type ∈ {1=Bool, 2=Int32, 5=String}; we cannot exercise
the missing types without authoring new template attributes in
the Aveva console (a manual platform-engineering task). The
three-type live verification at commit
|
||
|
|
daa4ea3f16 |
[F12] mxaccess-nmx: NmxClient::create — auto-resolving COM-activation factory
rust / build / test / clippy / fmt (push) Has been cancelled
New constructor NmxClient::create(ntlm_factory) gated on
cfg(all(windows, feature = "windows-com")). New crate feature
mxaccess-nmx/windows-com propagates to mxaccess-rpc/windows-com.
Mirrors ManagedNmxService2Client.Create() (cs:30-64) plus
ResolveService (cs:491-523).
Six-step bring-up:
1. com_objref_provider::marshal_activated_iunknown_objref(
"NmxSvc.NmxService", MarshalContext::DifferentMachine)
activates and emits the OBJREF.
2. ComObjRef::parse extracts oxid + the activated server's IUnknown
IPID.
3. resolve_oxid_with_managed_ntlm_packet_integrity against
127.0.0.1:135 (RPCSS endpoint mapper) returns the server's
(host, port) bindings + IRemUnknown IPID.
4. parse_bracketed_host_port pulls the host + port out of the
ncacn_ip_tcp binding's `host[port]` text. Uses rfind for the
rightmost brackets so FQDN forms (foo.example.com[1234])
round-trip — matches the .NET ParseBracketedHost/Port shape at
cs:540-561.
5. A fresh DceRpcTcpClient binds to IRemUnknown and calls
RemQueryInterface(iunknown_ipid, INmxService2_IID,
fresh_causality_id, public_refs=5).
6. A second fresh transport binds to INmxService2 via Self::connect.
The ntlm_factory: impl FnMut() -> NtlmClientContext closure is
invoked three times (one per bind); each NtlmClientContext is
consumed by its bind, so the factory must produce fresh contexts.
New NmxClientError variants:
- Activation(ProviderError) — only emitted with windows-com on.
- EndpointResolution { reason } — covers no ncacn_ip_tcp binding,
malformed host[port], non-zero RemQueryInterface HRESULT.
6 offline tests on parse_bracketed_host_port: FQDN host extraction,
rfind for rightmost brackets, rejection of missing '[' / missing
']' / non-numeric port / port overflow.
1 live test (#[ignore], gated on MX_LIVE + MX_TEST_USER /
MX_TEST_PASSWORD / MX_TEST_DOMAIN populated by
tools/Setup-LiveProbeEnv.ps1): round-trips the full chain against
the AVEVA install on this host. Resolved INmxService2 IPID is
non-zero — verified end-to-end.
Workspace: mxaccess-nmx 17 → 23 (+6). All other crates unchanged.
Closes F12 in design/followups.md. F6 (ComObjRefProvider port) was
the prior blocker; with both landed, the COM-activation path is
end-to-end functional.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
cf9dbaf568 |
[F6] mxaccess-rpc: ComObjRefProvider port via windows-rs (CoMarshalInterface)
rust / build / test / clippy / fmt (push) Has been cancelled
New module crates/mxaccess-rpc/src/com_objref_provider.rs gated on cfg(all(windows, feature = "windows-com")). Pulls windows = "0.59" (features Win32_Foundation + Win32_System_Com + Win32_System_Com_Marshal + Win32_System_Com_StructuredStorage + Win32_System_Memory) as an optional dep behind the existing windows-com feature; default footprint stays slim. Public API mirrors ComObjRefProvider.cs 1:1: MarshalContext enum (InProcess / Local / DifferentMachine wrapping the MSHCTX_* newtype constants), clsid_from_prog_id, marshal_activated_iunknown_objref (activates via CoCreateInstance with INPROC | LOCAL | REMOTE then marshals), marshal_iunknown_objref (uses IUnknown::IID), marshal_interface_objref (CoMarshalInterface over an HGlobal-backed IStream). All `unsafe` is internal to the module — public API exposes only typed Rust values (Vec<u8>, GUID, ProviderError), no raw pointers / HRESULTs / lifetime-bound interface pointers leak. Each unsafe block carries an inline SAFETY comment naming the invariants being upheld. Per-thread COM init via thread-local OnceLock<()>: lazy CoInitializeEx(MULTITHREADED) on first call; S_FALSE (already initialised) and RPC_E_CHANGED_MODE (thread is STA) treated as success — matches the .NET runtime's tolerant apartment behaviour. ProviderError enumerates the four documented failure modes plus the apartment-init pre-check: UnknownProgId / ActivationFailed / MarshalFailed / GlobalLockFailed / ApartmentInitFailed. 4 offline tests: MarshalContext → MSHCTX_* mapping, ensure_apartment idempotence, clsid_from_prog_id returns UnknownProgId for fake ProgIDs, marshal_activated short-circuits at the resolution stage. 1 live test (#[ignore], gated on MX_LIVE): activates the real NmxSvc.NmxService, marshals the proxy's IUnknown via CoMarshalInterface, then parses the resulting blob via ComObjRef::parse and asserts non-zero OXID + IPID. Passes against the AVEVA install on this host. Workspace tests: mxaccess-rpc went 179 → 183 (+4). All other crates unchanged. Unblocks F12 (NmxClient::create — the auto-resolving COM-activation factory): the underlying primitive (marshal_activated_iunknown_objref) now exists; remaining work is threading the windows-com feature through mxaccess-nmx and chaining ComObjRef::parse → resolve_oxid_with_managed_ntlm_packet_integrity → RemQueryInterface. design/followups.md F12 updated with a revised "Resolves when" reflecting that F6's blocker is gone. Closes F6 in design/followups.md. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
41f2d4c0f2 |
[F14] mxaccess-galaxy: tiberius-backed SQL Resolver + UserResolver
rust / build / test / clippy / fmt (push) Has been cancelled
New module crates/mxaccess-galaxy/src/sql_resolver.rs (~480 LoC) gated behind the existing galaxy-resolver Cargo feature. Adds SqlTagResolver + SqlUserResolver, both constructed via from_ado_string(&str) accepting the same connection-string shape the .NET reference uses by default (Server=localhost;Database=ZB;Integrated Security=True; Encrypt=False;TrustServerCertificate=True). Integrated Security=True resolves to Windows auth via tiberius's winauth feature. Each top-level call (resolve / browse / resolve_by_guid / resolve_by_name) opens a fresh Client<Compat<TcpStream>> and drops it on return — matches the .NET `await using` lifecycle at GalaxyRepositoryTagResolver.cs:93-95. tiberius's Client::query only accepts positional @P1..@PN placeholders (delegates to sp_executesql); the canonical RESOLVE_SQL / BROWSE_SQL / USER_BY_GUID_SQL / USER_BY_NAME_SQL constants are rewritten once-per-process via OnceLock<String> (@objectTagName → @P1, etc.). The unrewritten constants stay byte-identical with the .NET reference for ad-hoc diagnostic copy/paste. read_metadata mirrors ReadMetadata (cs:149-165) byte-by-byte: signed smallint → i16 widened to u16 for platform/engine/object IDs (matches the .NET checked((ushort)reader.GetInt16(N)) shape), int → i32 checked-cast to i16 for property_id, nullable nvarchar for primitive_name. read_user_profile mirrors ReadProfile (cs:76-85) including the roles_text blob → parse_role_blob round-trip. Deps added (gated): tiberius 0.12 (default-features = false; tds73 + rustls + winauth — no chrono / rust_decimal pulled), tokio-util's compat feature for the futures-rs ↔ tokio AsyncRead bridge, futures-util for TryStreamExt::try_next. Default-feature build still pulls only mxaccess-codec + async-trait + thiserror + uuid (slim foot-print preserved per the design doc's intent). New `live` feature on this crate (`live = ["galaxy-resolver"]`) for parity with the workspace pattern. 11 offline unit tests pin: SQL named→positional rewriting (no @named left, @P1/@P2/@P3 present), line-count preserved, ado-string acceptance (default Galaxy shape parses, garbage rejected), input validation (max_rows=0 rejected, empty LIKE rejected, empty user_name rejected, all checked before connect attempt). Two #[cfg(feature = "live")] #[ignore]'d tests round-trip against a real Galaxy DB (gated on MX_LIVE + MX_GALAXY_DB env vars per tools/Setup-LiveProbeEnv.ps1). Live verification on this host: live_resolve_test_child_object_test_int and live_browse_test_child_object both pass against the local AVEVA install — TestChildObject.TestInt resolves with mx_data_type=2 (Int32), is_array=false. Closes F14 in design/followups.md. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
9501080170 |
[F4+F5] mxaccess-rpc: BindAck/AlterContextResponse parser + live-capture round-trip
rust / build / test / clippy / fmt (push) Has been cancelled
Adds BindAckPdu + per-result BindAckResult per [C706] §12.6.3.4: u16 result + u16 reason + 20-byte SyntaxId, preceded by port_any_t secondary address, n_results, and 3 reserved bytes. Encode/decode handle both PacketType::BindAck and PacketType::AlterContextResponse (same body shape). The new bind_ack_round_trips_live_capture test takes the first 84 bytes of the server's first response in captures/013-loopback-subscribe-scalars/tcp-stream-__1_49704-to-__1_55690.bin (real BindAck observed against local AVEVA), decodes the shape (secondary address "49704\0", n_results=2, NDR transfer syntax accepted, DCOM negotiate_ack reason=3), then re-encodes and asserts byte-identical to the original frame. Stronger evidence of wire parity than the prior synthetic-frame tests, and lets the rest of the M2 stack reason about the negotiated transfer syntax instead of relying on request-side context to infer it. Closes F4 and F5 in design/followups.md. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
826f7b3f89 |
[M5] mxaccess-asb-nettcp: F29 resolved — full canonical [MC-NBFS] table port
rust / build / test / clippy / fmt (push) Has been cancelled
The original hand-curated table was wrong starting at id 74 — entries had been deduplicated/renumbered without preserving the canonical `id = 2 * StringN` mapping from `[MC-NBFS]` §2.2, leaving most of the SOAP-fault subset at the wrong ids: ours had Fault at 114, canonical is 134 ours had Code at 122, canonical is 142 ours had Reason at 124, canonical is 144 ours had Text at 126, canonical is 146 ours had Value at 134, canonical is 154 ours had Subcode at 136, canonical is 156 Wire captures from the live AVEVA MxDataProvider use the canonical ids — verified earlier via `MX_ASB_TRACE_REPLY` showing `<resultCodeField>` correctly resolved through the F30 post-pass once the ids matched. Replaced the entire STATIC_ENTRIES array with a faithful port of the first 200 entries from `dotnet/wcf`'s `src/System.ServiceModel.Primitives/src/System/ServiceModel/ ServiceModelStringsVersion1.cs` (sourced via WebFetch — that file is the canonical [MC-NBFS] §2.2 table mirrored in code). The wire id is `2 * StringN` for `StringN` at 0-based position N. Coverage now spans id 0..400, picking up the full SOAP / WS-Addressing / WS-RM / WS-Security / WS-SecureConversation / WS-Trust / xmldsig+xenc URIs / SAML / Kerberos / X509 token-type subset. The 436..444 xsi/xsd/nil extras (used by .NET XmlSerializer for [MessageContract] value-type bodies) are preserved. Four new regression tests: - ids monotonic (was already there); - ids all even (`[MC-NBFS]` reserves odd ids for the dynamic dict); - SOAP-fault subset (s, Fault, MustUnderstand, Code, Reason, Text, Node, Role, Detail, Value, Subcode) resolves to the canonical strings — pins the fix against accidental regression; - `position_of_static` round-trips for known strings. Followups: - F29 moved to ## Resolved with full audit-trail. - F18 M5 status block updated to strike F29 from the remaining-work list. The remaining open M5 items are F32 (live type-matrix beyond Int32/String/Bool, gated on Galaxy provisioning) and F28 (canonical XML signing for Read/Write/Subscribe ops, P2 latent). Workspace: 712 unit tests pass (was 711 + 1 new fault-subset test + existing tests now matching canonical). Clippy clean. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
5845b5eb12 |
[M5] mxaccess-asb: F32 partial — Bool + String + Int32 live, longer retry budget
rust / build / test / clippy / fmt (push) Has been cancelled
Three of seven proven types now round-trip end-to-end against the live MxDataProvider: ✅ Int32 (type_id 4) — TestChildObject.TestInt = 99 ✅ String (type_id 10) — TestChildObject.TestString = "mxaccesscli verified 17778523775" (UTF-16LE on wire) ✅ Bool (type_id 17) — DelmiaReceiver_001.TestAttribute = 0 A SQL probe of the live Galaxy (`gobject ⨝ package ⨝ dynamic_attribute` grouped by `mx_data_type`) shows only types {1=Bool, 2=Int32, 5=String} have deployed instances. Float/Double/DateTime/ Duration/array shapes are not in this Galaxy, so the remaining four type-matrix bullets in F32 are gated on Galaxy-side provisioning that's outside the Rust port's scope. The M5 DoD #3 was always going to bottom out at "what types are deployed in the test environment." Code changes: - `register_items` retry budget bumped: 10 attempts (was 5) with `200 * attempt` ms backoff (was 100 * attempt). Worst-case wait ~11 s, well within user-perceived latency on a one-shot RPC. The .NET reference's 5×100 ms didn't always cover the live AVEVA install's auth-state-commit latency on this hardware. - `AsbClient::connect` adds a 250 ms `tokio::time::sleep` immediately after the one-way `AuthenticateMe` send. The server processes the request asynchronously; without an initial settle, the per-op retry loop frequently exhausts its budget on the InvalidConnectionId race even on the FIRST register attempt. 250 ms is short enough to be invisible and long enough to absorb the typical commit delay. - `examples/asb-subscribe.rs` now prints `result_code` and `success` alongside the status count so the user can see when register is hitting the retry-exhausted state. Live flakiness note: the AuthenticateMe race is not fully deterministic — after many back-to-back test runs the live server appears to degrade (presumably pending-connection table fills) and the retry budget exhausts on EVERY tag, not just one. A 30-second cool-down restores reliability. Production deployments with a single long-lived session are unlikely to hit this. F32 status doc captures the observation. Workspace: 711 unit tests pass. Clippy clean. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
4ddb6542e1 |
[M5] design: followups update — M5 functionally LIVE, F30/F31 resolved
F18 (M5 master) gains an "M5 STATUS" block right after the DoD checklist showing the live end-to-end win (commit `9063f10`, TestChildObject.TestInt round-trips with payload [99,0,0,0]) and ticking each DoD bullet: - ✅ Live `asb-subscribe` succeeds. - ⚠️ Wire request bytes match .NET byte-for-byte; response parity uses the F30 dict-id resolution post-pass + chunked-Bytes concatenation instead of strict byte equality (functionally equivalent — both decode to the same logical XML). - ⚠️ Type matrix: only Int32 verified live; Bool/Float/Double/ String/DateTime/Duration/arrays pending sample tags. Tracked under new F32. - ✅ build/test/clippy green (711 tests). Followup churn: - F30 + F31 moved to ## Resolved with proper "Resolved: <date> (commit `<hash>`)" headers. F30 was the unblocker for F31 — without read-side dict-id resolution we couldn't see `<resultCodeField>1</>` in the response. - F28 status header updated to "PARTIALLY RESOLVED": the five [XmlSerializerFormat] ops (AuthenticateMe, Disconnect, KeepAlive, RegisterItems, UnregisterItems) plus DH params + dynamic-dict management all landed; Read/Write/Subscribe/Publish still sign over NBFX wire bytes via the legacy fallback. Severity demoted P0 → P2 because the live registry has empty `hashAlgorithm` and unsigned ops work in practice; promote back if that changes. - F29 reaffirmed P2 (latent NBFS dict-id drift, no live impact). - New F32 captures the type-matrix expansion as the only remaining P1 item for full M5 closeout. No code change in this commit — design doc only. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
9063f10b1b |
[M5] mxaccess-asb: register_items retry on InvalidConnectionId — LIVE PATH WORKS
rust / build / test / clippy / fmt (push) Has been cancelled
End-to-end live path now functional: Connect → AuthenticateMe →
RegisterItems → Read → Disconnect. The example reads back the live
TestChildObject.TestInt value (99) over the wire on the first run.
Root-cause of the previous "InvalidConnectionId" mystery: it was
never an HMAC verification failure. `AuthenticateMe` is one-way
(`AsbContracts.cs:18`) and the server commits auth state
asynchronously after the request lands. A Register that follows too
quickly sees the connection in pre-authenticated state and returns
`AsbErrorCode.InvalidConnectionId` (= 1).
.NET's `MxAsbDataClient.RegisterMany` (`cs:191-204`) handles this
explicitly with a retry loop:
for (int attempt = 1; attempt < 5
&& response.Result.ErrorCode == InvalidConnectionId; attempt++)
{
Thread.Sleep(TimeSpan.FromMilliseconds(100 * attempt));
response = RegisterOnce(items);
}
We now mirror the same pattern in `AsbClient::register_items_once`
followed by a retry loop in `register_items` — up to 5 attempts with
`100 * attempt` ms backoff.
Supporting changes:
- `RegisterItemsResponse` gains `result_code: Option<u32>` +
`success: Option<bool>` so callers can read `Result.resultCodeField`
+ `successField` from the response. `decode_register_items_response`
now tolerates an empty `<ASBIData />` Status array (server returns
empty when the operation fails server-side) instead of erroring
with `MissingField`. New helper `find_text_in_named_element` walks
the body token stream.
- New public constant `RESULT_CODE_INVALID_CONNECTION_ID = 1` for
callers that want to detect this status outside the retry path.
- The previously-failing test `decode_register_items_response_returns_
missing_field_when_status_absent` was renamed and rewritten as
`decode_register_items_response_returns_empty_status_when_absent`
to match the new tolerant decode contract.
F31 closed. F30 (read-side dict-id resolution, landed in `eb6c689`)
was the unblocker — without it we couldn't see the
`<resultCodeField>1</>` element in the response and the failure mode
looked like a HMAC mismatch instead of a transient retryable error.
Workspace: 711 unit tests pass. Clippy clean.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
eb6c689f09 |
[M5] mxaccess-asb: F30 read-side dict-id resolution + matching .NET CV xmlns
**F30 (read side):** post-pass over `body_tokens` in `decode_envelope` substitutes `NbfxName::Static(id)` → `NbfxName::Inline(name)` and `NbfxText::DictionaryStatic(id)` → `NbfxText::Chars(name)` whenever the dict id resolves. Lookup tries the per-message binary header strings first (`(id-1)/2` slot), then falls back to the cumulative session dynamic dict, then the `[MC-NBFS]` static table for even ids. Tokens with unresolvable ids stay opaque so trace output still reveals them. This unblocks reading the live Register response: previously every field came back as `<b:Static(43)>false</…>` and we couldn't tell what the server actually said. Now we see `<b:successField>false</>` and `<b:resultCodeField>1</>` clearly. resultCode 1 maps to `AsbErrorCode.InvalidConnectionId` (`AsbResultMapping.cs:6`) — which means AuthenticateMe failed silently and the server discarded our connection state, even though the crypto stack is proven byte-equal to .NET. **Wire CV xmlns parity:** `<h:ConnectionValidator>` for the `XmlSerializer` mode (AuthenticateMe / Disconnect / KeepAlive) now emits all four xmlns declarations .NET writes, in the same order: `xmlns:h`, default `xmlns` (same value), `xmlns:xsi`, `xmlns:xsd`. .NET emits the default xmlns redundantly even though the `h` prefix is bound to the same URL — captured against the .NET probe via asb-relay. This was suspected to be the AuthenticateMe HMAC blocker but the live test still returns `InvalidConnectionId`, so the bug is elsewhere. **F31 updated** with the surviving hypotheses for the `InvalidConnectionId` mystery: server-side `XmlSerializer` constructor mismatch, subtle byte-level wire difference affecting deserialization, or unused `ServiceAuthenticationData` from the ConnectResponse. Resolution probably requires server-side instrumentation or controlled-scenario byte-level HMAC diff. Workspace: 710 unit tests pass. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
703c540bdc |
[M5] mxaccess-asb: MX_ASB_TRACE_REPLY trace + F30/F31 followups
Adds env-gated diagnostic trace `MX_ASB_TRACE_REPLY` that, on every incoming SizedEnvelope, prints the raw reply bytes + decoded body tokens (capped at 64) before any consumer-level decode runs. Used to isolate the next blocker after F28's wire-format fixes landed: with canonical XML signing, registry-driven DH params, dynamic-dict id management, ConnectionValidator wire-format-per-action, chunked ASBIData decode, and 0x0A `ShortDictionaryXmlnsAttribute` all in place, AuthenticateMe is accepted by the server and a real RegisterItemsResponse comes back — but it decodes to an opaque token stream of `<b:Static(43)>false</b:Static(43)>` etc. because every field name is dict-encoded against the response's own binary header pre-pop and we never resolve those ids on the read side. Two new follow-ups capture the remaining work: - **F30**: resolve dict-id element/attribute names on the read side. Mirror the F28 write-side fix: read-side dynamic dict accumulates session strings via `intern`, and `decode_tokens` (or a post-pass) needs to substitute `NbfxName::Static(id)` with the resolved `NbfxName::Inline(name)` so downstream `find_element_named` / `collect_asbidata_payloads` match. - **F31**: server response indicates `successField=false` with an empty Status array on Register. Hypotheses (in order): (a) silent HMAC mismatch despite F23 deterministic parity; (b) request-side wire-byte delta the server tolerates but interprets as 0 items; (c) tag does not resolve in the live Galaxy state. Resolution needs F30 first to read the actual Status array + error codes. Workspace: 710 unit tests pass. Clippy clean. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
ce27b63010 |
[M5] auth: deterministic HMAC fixture test rules out crypto stack
Adds end-to-end byte-equality test against a `.NET reference fixture captured via the new `MxAsbClient.Probe --dump-deterministic-hmac` flag. All inputs are pinned (passphrase, prime, generator, private- key bytes, remote-pub bytes, message number, connection ID, AES IV, consumer-data + IV bytes), so the test reproduces .NET's exact output for every crypto step: 1. shared = remote_pub^private_key mod prime — ✅ matches 2. crypto_key = shared || passphrase_utf8 — ✅ matches 3. hmac = HMAC-SHA1(crypto_key, xml_utf8) — ✅ matches 4. aes_key = PBKDF2-SHA1(base64(crypto_key), salt, 1000, 16) — ✅ 5. encrypted_mac = AES-CBC(aes_key, iv=zeros, hmac, PKCS7) — ✅ This conclusively rules out the entire crypto stack as the source of the live AuthenticateMe `dispatcher/fault`. Our DH math, HMAC engine, PBKDF2 derivation, AES-CBC PKCS7, and crypto_key concatenation are byte-equal to .NET. The remaining live failure must come from one of: (a) wire-level ConnectionValidator NBFX shape (DataContract field names, mustUnderstand attribute, namespace), (b) WCF binary message header (action+to dict pre-pop), or (c) a subtle XmlSerializer quirk for live values that the hardcoded fixtures don't exercise (Guid format edge case, base64 line wrapping, ulong text rendering). Fixture lives at `crates/mxaccess-asb-nettcp/tests/fixtures/ deterministic-hmac/authenticate-me.kv` (KV format, ASCII hex, lines trim CRLF/LF transparently). The companion `README.md` documents the capture procedure and the per-step decomposition. The test consumes the .NET-supplied canonical XML directly from the fixture's `xml_utf8_b64` so a Rust XML emitter bug would not mask a Rust crypto bug — XML byte-equality is verified separately by `mxaccess-asb::xml_canonical::tests` against the `signed-xml/*.xml` fixtures. Workspace: 710 unit tests pass (was 709 + 1 new). Clippy clean. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
42ac10a88f |
[M5] design: F28 follow-up update with progress + remaining blocker
Updates F28 with: - Captured-fixtures section now lists all 6 shapes (added the empty-MAC variant) and 10 inferred XmlSerializer rules including the empty-byte-array → self-closing-element rule we discovered. - New "Emitter landed" section pointing at commit `f14580e` and the five exposed `emit_*` functions, plus the `AsbAuthenticator::peek_next_message_number` plumbing. - New "Registry-driven DH params" section explaining why `CryptoParameters::defaults()` was insufficient for live testing (live AVEVA installs use 768-bit primes; default is 1024-bit) and documenting the new MX_ASB_DH_* env-var contract. - New "Remaining live blocker" section documenting that AuthenticateMe still faults despite canonical XML byte-equality and registry-correct DH params — most likely a byte-level HMAC/AES discrepancy that needs a deterministic-input unit-test triple to pin down. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
dbb580b2c8 |
[M5] tools+fixtures: F28 canonical-XML signing target captured from .NET
Adds `MxAsbClient.Probe --dump-signed-xml` flag that builds five ConnectedRequest shapes (AuthenticateMe, Disconnect, KeepAlive, RegisterItemsRequest, UnregisterItemsRequest) with deterministic field values and prints `AsbSerialization.ToXml(...)` output. The output is exactly what `AsbSystemAuthenticator.Sign` HMACs (`AsbSystemAuthenticator.cs:79`), so the Rust port's canonical-XML emitter must produce byte-identical bytes for HMAC parity. Captured fixtures land under `rust/crates/mxaccess-asb/tests/fixtures/signed-xml/`: - `authenticate-me.xml` — 1000 bytes - `disconnect.xml` — 980 bytes - `keep-alive.xml` — 705 bytes - `register-items.xml` — 1068 bytes - `unregister-items.xml` — 1072 bytes Plus a `README.md` documenting 10 inferred XmlSerializer rules (element name = class name not WrapperName, field order = declaration order not [MessageBodyMember.Order], `[XmlType.Namespace]` on field type causes per-child xmlns redeclaration on the children not the wrapper, `*Specified` pattern controls Xxx emission, CRLF + 2-space indent + utf-16 declaration but UTF-8 bytes fed to HMAC). `.gitattributes` marks the XML fixtures as binary (`*.xml -text`) so neither `core.autocrlf` nor `text` filters can rewrite the byte content — CRLF is part of the canonical form and must survive round-trip through Git untouched. `MxAsbClient.csproj` gains `<InternalsVisibleTo Include="MxAsbClient .Probe" />` so the probe can reach the internal `AsbSerialization` helper without making it public. Workspace: 702 tests pass (no Rust changes — fixtures only). F28 follow-up updated with the captured fixtures + the inferred rules. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
d1e887b91b |
[M5] mxaccess-asb-nettcp/asb: Connect handshake live + SOAP fault detection
Live-bring-up reconciliation against AVEVA's MxDataProvider on Windows.
Connect now completes end-to-end (real DH key exchange, apollo:V2
encryption, ServicePublicKey/ServiceAuthenticationData populated). Five
fixes land:
1. NBFX `PrefixElement_a..z` (0x5E-0x77) and `PrefixAttribute_a..z`
(0x26-0x3F) decode + encode arms. The server's ConnectResponse hit
`0x65 = PrefixElement_h` for a dynamically-named element and our
decoder bailed with `unknown NBFX record byte 0x65`. Both directions
now round-trip; encoder picks short-form when prefix is a single
lowercase ASCII letter.
2. xmlns redeclaration on `<Data>` AND `<InitializationVector>` inside
`AuthenticationData` / `PublicKey`. `[XmlType(Namespace = ...)]` on
AuthenticationData / PublicKey (`AsbContracts.cs:350-381`) means
XmlSerializer emits `xmlns="..."` on each direct child. The default-
ns scope ends at `</Data>`, so `<InitializationVector>` needs its own
redeclaration to stay in the data namespace; without it the server
fell back to messages-namespace and the deserialiser threw an
`InternalServiceFault`.
3. SOAP-fault detection in `AsbClient::send_envelope`. New
`ClientError::SoapFault { action, code, reason }` surfaces when the
response Action header matches the canonical `dispatcher/fault`
template; previously body decoders blindly ran and surfaced
`MissingField { field: "Status" }` masking the actual fault. Reason
text is extracted as the longest `NbfxText::Chars` in the body —
robust against the `nbfs.rs` static-dictionary id mismatches.
4. Identified blocker (filed as F28): signed-request HMAC currently
covers the NBFX wire bytes, but .NET's `AsbSystemAuthenticator.Sign`
HMACs `Encoding.UTF8.GetBytes(request.ToXml())` — the canonical XML
serialisation via `XmlSerializer` with namespace
`urn:invensys.schemas` (`AsbSerialization.cs:12-48`). Until the Rust
port emits identical XML bytes for `ConnectedRequest` subclasses,
AuthenticateMe / RegisterItems / every signed RPC fault on the
server. Connect itself is unsigned (`ServiceMessage` not
`ConnectedRequest`) which is why it works today.
5. Identified `nbfs.rs` static-dictionary id drift (filed as F29): wire
uses Fault=134/Code=142/Reason=144/Text=146/Value=154/Subcode=156
but our table has them at 114/122/124/126/134/136. Off by 20 from
id 114+ — 10 missing entries between `s` (id 112) and `Fault`. No
request-side impact (we only encode IDs ≤44, all correct); the SOAP
fault decode walks text records directly so it sidesteps the issue.
Workspace: 702 tests pass (no test count delta — wire-only fixes).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
e3baeb8803 |
[M5] mxaccess: F26 step 3 — AsbSession high-level cheap-clone async API
Adds the public high-level entry point for the ASB transport.
Parallel to the NMX-shaped `Session` (rather than unified) because
NMX's `Session` carries CallbackExporter / callback router task /
recovery broadcast / INmxService2 mutex orchestration that has no
ASB analogue — and ASB's request/response loop over a single TCP
stream maps naturally to `Mutex<AsbClient>` that would be foreign
to NMX. Two paths converge at the consumer-facing API but stay
distinct at the orchestration layer.
Struct shape:
```rust
pub struct AsbSession { inner: Arc<AsbSessionInner> }
struct AsbSessionInner {
transport: Mutex<AsbTransport<TcpStream>>,
connect_response: ConnectResponse,
}
```
`Clone + Send + Sync` — clones share state through `Arc`, lock
serialises operations. Compile-time `assert_clone_send_sync` test
guards the contract.
API:
* `connect(endpoint, passphrase, crypto_parameters, via_uri,
connection_id)` — full bring-up (TCP + preamble + DH handshake).
* `from_transport(transport, connect_response)` — build from an
existing transport (tests, custom transports).
* `connect_response()` — surface the negotiated lifetime /
Apollo flag.
Operation methods forward to AsbClient:
* `register_items` / `unregister_items` / `read` / `write`
* `keep_alive` / `disconnect`
* `create_subscription` / `add_monitored_items` / `publish` /
`delete_monitored_items` / `delete_subscription`
* `publish_write_complete`
ClientError → mxaccess::Error mapping via
`ConnectionError::TransportFailure` (consistent with F26 step 2).
1 new test:
* `asb_session_is_clone_send_sync` — compile-time trait-bound
assertion.
Workspace: 702 tests pass.
Stubbed for next F26 iteration:
* `Stream<Item = MonitoredItemValue>` subscription handle that
internally drives a publish-loop. Today consumers loop
`publish().await` themselves.
* Recovery / reconnect policy — needs a captured ASB-side
disconnect to inform the retry strategy.
* Live-probe wire-byte reconciliation against the WCF DataContract
XML serializer's actual output.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|