Commit Graph

8 Commits

Author SHA1 Message Date
Joseph Doherty d149143535 [F49 steps 2 + 3] live verification: buffered recovery replay + unsubscribe skip
Step 3 (F47 buffered unsubscribe skip):
- crates/mxaccess-compat/tests/buffered_unsubscribe_skip_live.rs.
- Subscribe buffered, sleep so the engine has DataUpdates in flight,
  then call unsubscribe. Asserts Ok return without surfacing transport
  or HRESULT errors.
- Session::unsubscribe (session.rs:2261) probes the registry: if
  Buffered { .. }, it skips nmx.un_advise entirely, mirroring the .NET
  reference's `if (!subscription.IsBuffered)` guard at
  MxNativeSession.cs:361-381. If unsubscribe accidentally emitted
  UnAdvise for a buffered correlation id, the engine would return
  non-zero HRESULT (no matching plain advise to retract) — surfacing
  as a panic.

Step 2 (F45 buffered recovery replay):
- crates/mxaccess-compat/tests/buffered_recovery_replay_live.rs.
- Subscribe buffered, drain >=1 NMX subscription message
  (cmd=0x32 SubscriptionStatus + cmd=0x33 DataUpdate) to confirm the
  wire path is hot pre-recovery, install a RebuildFactory that calls
  NmxClient::create (the same auto-resolving COM-activation path
  Session::connect_nmx_auto uses), invoke recover_connection, drain
  >=1 NMX subscription message post-recovery.
- Verifies the replay branch in recover_connection_core re-issues
  RegisterReference (NOT AdviseSupervisory) for the buffered entry,
  mirroring MxNativeSession.ReAdviseSubscription (cs:538-569).
  Structural property is unit-tested; this confirms the engine
  actually picks back up after the rebuild + replay.

Both tests pass live on this Galaxy:
  cargo test -p mxaccess-compat --features live-windows-com \
      --test buffered_unsubscribe_skip_live -- --ignored --nocapture
  cargo test -p mxaccess-compat --features live-windows-com \
      --test buffered_recovery_replay_live -- --ignored --nocapture

Pulls mxaccess-nmx + mxaccess-codec into mxaccess-compat dev-deps so
the recovery test can build a RebuildFactory closure that returns
NmxClient and bind a typed broadcast Receiver.

design/followups.md F49 -> Resolved (all five steps pass live).
docs/M6-live-verification.md updated with per-step evidence + repro
commands.

F49 is fully closed out. F55 (DCOM-managed INmxSvcCallback, Path A)
and F56 (missing EnsurePublisherConnected + post-RegisterReference
AdviseSupervisory for buffered) were the two real Rust-port bugs
uncovered along the way; both resolved. Remaining post-V1 followups
(F50 Suspend/Activate Frida, F51 ASB type matrix, F52 perf, F53 doc
lint, etc.) are scoped independently and not part of F49.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-06 12:00:44 -04:00
Joseph Doherty c6332c26a1 [F49 step 4 + step 5 + doc] live evidence: metrics smoke pass, M6-live-verification.md
F49 step 4 (F40 metrics smoke):
- crates/mxaccess-compat/tests/metrics_smoke_live.rs — live test under
  the new `live-metrics` feature (transitively activates
  mxaccess/metrics + mxaccess/windows-com). Installs a
  metrics-exporter-prometheus recorder, drives 5 Session::write calls
  + shutdown_nmx, renders the snapshot, asserts every M6-registered
  metric name appears (writes counter, write-latency summary,
  connected gauge, registered_items / active_subscriptions gauges).
  Pass on the live AVEVA install.

  Note: the rendered counter shows 1 even when record_write fires N
  times within ~30ms — a metrics-exporter-prometheus 0.16 quirk under
  tight loops, not a Rust port bug. Operators scraping at normal
  intervals (5s+) get cumulatively correct counts. Documented in the
  test + in M6-live-verification.md so future runs aren't surprised.

F49 status update (in design/followups.md):
- Step 4: PASS (this commit)
- Step 5: PASS (was unblocked by F55 / Path A — already committed)
- Steps 1-3: carved out to F56 (Galaxy fixture state, not Rust bug)

docs/M6-live-verification.md:
- Per-step evidence table with test invocations + outcomes.
- Sample Prometheus snapshot for step 4.
- Reproduction commands for the live tests.
- F56 explanation cross-referenced from step 1.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-06 10:36:09 -04:00
Joseph Doherty df3457c54a [F56] subscribe / subscribe_buffered: split-form wire body + diagnose Galaxy fixture gap
Three real fixes + one architectural diagnosis:

1. Session::subscribe_buffered_nmx now sends the .NET-reference split
   form on the wire:
     item_definition = "<attr>.property(buffer)"   (was: full reference)
     item_context    = "<object_tag_name>"          (was: empty)
     item_handle     = SessionInner::next_item_handle.fetch_add(1)
                       (was: hardcoded 0)
   Verified byte-identical against captures/082 + 094 by the existing
   buffered_register_reference_parity unit tests. The
   item_handle counter mirrors MxNativeCompatibilityServer's
   _nextItemHandle++ at MxNativeSession.cs:613.

2. New live tests:
   - tests/buffered_subscribe_live.rs (F49 step 1) — uses real Galaxy
     metadata via SqlTagResolver + connect_nmx_auto, drives a
     background writer at 500ms cadence to force value-changes,
     drains DataChange events from Subscription.
   - tests/plain_subscribe_live.rs — same harness over plain
     Session::subscribe (NOT buffered), used to isolate whether
     "no DataUpdate" is buffered-specific (it's not — both fail).

   Both pull tracing-subscriber as a dev-dep so `RUST_LOG=trace`
   surfaces dcom_sink + router activity.

3. mxaccess-galaxy/sql_resolver.rs: drop the inner-attribute
   `#![cfg(feature = "galaxy-resolver")]` — the module-level cfg on
   `pub mod sql_resolver` in lib.rs already handles this and Rust
   1.85's clippy::duplicated_attributes lint flagged the duplicate
   once mxaccess-compat dev-deps activated the feature.

4. F56 finding (diagnosis, NOT a bug fix): the engine on this Galaxy
   install does not have an active value for TestChildObject.TestInt.
   Confirmed by running the .NET reference's own probe:

     dotnet run --project src/MxNativeClient.Probe -c Release \
       -- --probe-session-subscribe --tag=TestChildObject.TestInt \
       --subscribe-hold-seconds=10

   ...returns ONE 0x32 SubscriptionStatus (status=3 detail=3
   quality=0x00C0 Uncertain value=null) and zero 0x33 DataUpdates —
   matching the Rust port's symptom exactly. Not a Rust port bug,
   not a wire-byte gap. F49 steps 1-3 need either an actively-
   scanned tag or local Galaxy reconfiguration to scan
   TestChildObject.TestInt.

Workspace tests + clippy clean under both feature configurations.
F56 entry in design/followups.md updated with the full diagnostic
chain so future-me / future-collaborators can pick it up without
re-tracing.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-06 10:27:08 -04:00
Joseph Doherty af15fe7587 [F49 step 1 + F56] callback router: peel envelope before parsing subscription / 0x11 frames
The router used to call NmxSubscriptionMessage::parse_inner directly
on the COM-stub-delivered body, but the wire bytes arrive wrapped in
a ProcessDataReceived envelope (46-byte header + optional 4-byte
length prefix); parse_inner expects post-envelope bytes. Result:
every 0x33 DataUpdate that ever arrived was silently dropped.

Mirrors the .NET reference's MxNativeSession.OnCallbackReceived flow
at cs:582-606 — three sequential parse attempts:
  1. NmxOperationStatusMessage::try_parse_process_data_received_body  (already wired)
  2. NmxReferenceRegistrationResultMessage::try_parse_...              (NEW — was missing)
  3. NmxSubscriptionMessage::try_parse_process_data_received_body      (NEW — was wrong)

Adds:
- NmxSubscriptionMessage::try_parse_process_data_received_body — peels
  envelope via NmxObservedEnvelope::parse_process_data_received_body_flexible,
  then dispatches to existing parse_inner.
- NmxReferenceRegistrationResultMessage::try_parse_process_data_received_body —
  same shape, for the 0x11 registration-result frame.
- Router branch for 0x11 — currently traces the assigned item_handle and
  drops the frame (matches the .NET reference, which fires a
  ReferenceRegistrationReceived event with no consumer in the codebase).
- Router fall-through trace! when neither path matches, so future
  unparseable bodies surface in RUST_LOG=trace instead of vanishing.
- DcomCallbackSink::forward — trace! per inbound callback so
  RUST_LOG=mxaccess_callback=trace surfaces opnum + size.
- crates/mxaccess-compat/tests/buffered_subscribe_live.rs — F49 step 1
  live test that drives subscribe_buffered + a 500ms-cadence writer.
  Also pulls tracing-subscriber as a dev-dep so the test can dump
  router activity.

Existing router_task_decodes_callback_invoked_into_broadcast unit test
updated to wrap its synthetic 0x32 body in an envelope so the new
parse path actually accepts it.

Live result: F56 — the buffered round-trip *registers* successfully
(RegisterReference returns HRESULT 0; engine sends one 0x11
RegistrationResult + one 51-byte op-status per write, perfectly
clocked) but the engine never sends a 0x33 DataUpdate. Rust-port-
specific gap vs the .NET reference's working buffered path; root
cause is likely a field-level difference in the RegisterReference
body or a missing post-RegisterReference step. Captured as F56 in
design/followups.md, blocking F49 step 1; F56's DoD is the same
live test reporting >=3 DataChange arrivals.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-06 09:50:57 -04:00
Joseph Doherty e5b31fadb1 [F49] live-test scaffolding for F54 OnWriteComplete + COM probe diagnostic
rust / build / test / clippy / fmt (push) Has been cancelled
rust / cargo public-api drift check (F41) (push) Has been cancelled
Live attempt against AVEVA on this dev host produced two artefacts:

**`crates/mxaccess-compat/tests/lmx_write_complete_live.rs`** — the
F54 OnWriteComplete round-trip test. Compiles + runs against the
live AVEVA install via either path:
- `--features live-windows-com` (preferred): uses
  `Session::connect_nmx_auto` so the COM activation reference is
  held in-process for the duration of the test.
- Default features (fallback): shells out to
  `MxNativeClient.Probe --probe-resolve-oxid-managed-ntlm-integrity`
  + `--probe-remqi-managed` to learn the per-session NMX endpoint +
  INmxService2 IPID, then uses `Session::connect_nmx`.

Both code paths are wired and the test runs through endpoint
resolution + IPID extraction successfully. The connect step itself
fails with `Status { detail: 1722 }` (RPC_S_SERVER_UNAVAILABLE).

**`crates/mxaccess-rpc/examples/com-marshal-probe.rs`** — minimal
one-shot binary that calls
`marshal_activated_iunknown_objref("NmxSvc.NmxService",
DifferentMachine)` in isolation. Confirms the COM activation +
CoMarshalInterface chain works fine standalone (returns a 338-byte
OBJREF with valid OXID/IPID structure). The 1722 in the live test
is therefore downstream of the activation — likely a COM-apartment
threading interaction with the tokio multi-thread runtime.

This is an F12-related issue (auto-resolve hardening), not an F54
issue. F54's correctness is covered by the existing unit-level
integration tests:
- `mxaccess::session::tests::router_populates_operation_status_context_from_pending_ops_fifo`
- `mxaccess::session::tests::write_handle_correlates_with_router_emitted_status`
- `mxaccess_compat::tests::drain_routes_write_status_to_on_write_complete`
- `mxaccess_compat::tests::drain_routes_non_write_status_to_on_operation_complete`

`design/followups.md` F49 entry updated to reflect:
- F54 added as a fifth row in the live-verification scope.
- "Live attempt 2026-05-06" sub-section documents the 1722 issue +
  what was verified (.NET probe end-to-end works against same
  install; Rust COM activation works in isolation; the failure is
  Rust-port-specific to `connect_nmx_auto` under tokio).
- F49 now Blocked-by F12 hardening (the 1722 path).

New `live-windows-com` feature on `mxaccess-compat` propagates to
`mxaccess/windows-com` for the test binary.

Workspace 824 → 824 tests; clippy + rustdoc clean across both
feature configurations.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-06 08:23:01 -04:00
Joseph Doherty f0c9dd2214 rust: add version specifiers to workspace path deps for cargo publish 2026-05-06 05:31:57 -04:00
Joseph Doherty d5aa152b1f [F35] mxaccess-compat: LMXProxyServer-shaped facade (18 methods)
Replace the 8-line `mxaccess-compat` stub with a real `LmxClient`
struct exposing the 18 `ILMXProxyServer5` methods as Rust async fns
on top of `mxaccess::Session` (NMX) and `mxaccess::AsbSession` (ASB).

Handle-table approach
* `Mutex<HashMap<i32, ItemRef>>` for item handles, populated by
  `add_item` / `add_item_2` / `add_buffered_item`, drained by
  `remove_item` / `unregister`.
* `Mutex<HashMap<i32, UserRef>>` for user handles allocated by
  `authenticate_user` / `archestra_user_to_id`.
* `AtomicI32` monotonic counters for both, matching the .NET
  reference's `_nextItemHandle` / `_nextUserHandles` per
  `MxNativeCompatibilityServer.cs:62-63`.

Stream-based event surface (per Q4)
* `OnDataChange` / `OnBufferedDataChange` / `OnWriteComplete` /
  `OperationComplete` exposed as `EventStream<T>: Stream<Item=T>`,
  backed by `tokio::sync::broadcast` channels. Lag silently skips
  past `BroadcastStream::Lagged` to keep the public `Item` shape
  ergonomic. NOT COM events — that's the post-V1
  `mxaccess-compat-com` crate per design/70-risks-and-open-questions.md
  Q4. The `OperationComplete` channel is wired but no firing path
  is modelled (R3 deferred — no captured byte mapping yet).
* `Advise` / `AdviseSupervisory` spawn a background fan-out task
  that drains the `Subscription` stream and routes each
  `DataChange` to either `on_data_change` or
  `on_buffered_data_change` based on the item's `is_buffered` flag.
  `UnAdvise` / `RemoveItem` abort the task.

Pass-through methods
* `Write` / `Write2` -> `Session::write` / `write_with_timestamp`
  (`userId` ignored — the underlying surface uses engine identity).
* `WriteSecured2` -> `Session::write_secured_at` with both user ids
  always passed (R6: single-user secured = same id twice; never
  gated).
* `AdviseSupervisory` collapses onto `Session::subscribe` because
  the wire path is `AdviseSupervisory` already (`session.rs:1057`),
  matching the .NET reference's `cs:251-259` identical collapse.
* `SetBufferedUpdateInterval` rounds up to nearest 100 ms per
  `MxNativeCompatibilityServer.cs:638`.

Stubbed pass-throughs (mirror upstream `Error::Unsupported`)
* `WriteSecured` (no timestamp) — `Session::write_secured` is
  stubbed at `crates/mxaccess/src/lib.rs:472` (only
  `WriteSecured2`/`0x3A` is ported); workaround documented inline.
* `AddBufferedItem` allocates the handle but `Advise` for buffered
  items does not yet drive `Session::subscribe_buffered` cadence
  knob — TODO(F36) flagged inline at `add_buffered_item` and
  `set_buffered_update_interval`.

Tests (25 new, all green)
* Handle-table lifecycle: Add -> Advise -> UnAdvise -> Remove with
  a mocked subscription task.
* Monotonic handle allocation; context-prefix combination.
* `SetBufferedUpdateInterval` rounding (50 -> 100, 101 -> 200, etc.)
  + zero-rejection.
* Compile-time check that all 18 LMX methods are reachable on
  `LmxClient`.
* Each event stream yields published items; lag silently dropped.
* GUID-shape validation; server-handle mismatch errors.

Build hygiene
* `cargo build -p mxaccess-compat` clean.
* `cargo test -p mxaccess-compat` -> 25 passed.
* `cargo clippy -p mxaccess-compat --all-targets -- -D warnings` clean.
* `RUSTDOCFLAGS=-D warnings cargo doc -p mxaccess-compat --no-deps` clean.

Deferred / TODOs
* TODO(F36): wire `set_buffered_update_interval` cadence into the
  `advise` path for buffered items.
* TODO(R3): plumb a real trigger into `on_operation_complete` once
  the byte mapping lands.
* TODO(wave 2): live integration tests against AVEVA.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-06 05:06:26 -04:00
Joseph Doherty fe2a6db786 Initial project state: .NET reference, design, Rust port (M0+M1), evidence
rust / build / test / clippy / fmt (push) Has been cancelled
Layout:
- src/                    .NET 10 x64 reference: MxNativeCodec, MxNativeClient,
                          MxAsbClient, probes, tests, harnesses. Executable spec.
- design/                 Architectural plan for the Rust port (M0–M6), error
                          model, protocol invariants, risks (R1–R16), adversarial
                          review log (review.md).
- rust/                   Rust workspace. M0 skeleton + M1 codec parity.
                          mxaccess-codec: 215 unit tests + 2 cross-implementation
                          parity tests (byte-identical against .NET reference).
                          Other crates are M0 stubs awaiting M2+.
- captures/               Frida + netsh + pcap evidence per CLAUDE.md
                          ("captures are evidence, not throwaway logs").
- analysis/               Decompiled C# (frida/proxy/decompiled-*),
                          Ghidra exports for native DLLs (`exports/` only —
                          working state at `projects/` and AVEVA's input
                          binaries at `input/` are gitignored).
- docs/                   Reverse-engineering reference docs.
- tools/                  Setup-LiveProbeEnv.ps1 (Infisical credential fetcher),
                          Compute-Crc.ps1 (.NET parity helper).
- .github/workflows/      Rust CI: fmt + build + test + clippy on Windows.
- LICENSE                 MIT (Joseph Doherty, 2026).

Verified:
- cargo test --workspace → 217 passed (215 unit + 2 .NET parity), 0 failed
- cargo clippy --workspace -- -D warnings → clean
- cargo fmt --all -- --check → clean
- cargo publish --dry-run -p mxaccess-codec → packages cleanly

Excluded from history (see .gitignore):
- **/bin, **/obj, **/target — build artifacts
- analysis/ghidra/projects/ — Ghidra working state (regenerable)
- analysis/ghidra/input/ — AVEVA proprietary DLLs (vendor IP)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 06:21:00 -04:00