mxaccesscli: capture user-attribution investigation as a pick-up note

Adds a dated incident-style file at the tool root recording the full
arc of the User_Name attribution work over the last two days:

- Six runs tabulated (galaxy mode x advise variant x write variant x
  resulting User_Name) so the next agent can see what's already been
  ruled out.
- Current state of the CLI (auth, advise routing, WriteSecured) and
  the galaxy (eOSUserBased, ArchestraUsers role, engines deployed
  before the security change).
- Leading hypothesis: running aaEngine processes still operate under
  the original eNone security context because galaxy security is
  compiled at deploy time. Until the platform/engines are
  redeployed, auth_user_id stays at 1 and User_Name stays NULL.
- Concrete pick-up commands: undeploy/redeploy DevPlatform ->
  DevAppEngine -> TestArea -> TestMachine_001 via graccesscli, then
  re-run the trigger / ack-as-dohertj2 / clear sequence and query
  Events.
- Fallbacks if a clean redeploy doesn't change the answer
  (aaBootstrap restart, two-person Verified Write, attribute
  security classification check, cross-tool comparison vs Object
  Viewer / InTouch).

Linked from README.md's resource index so an agent landing on the
tool finds the open thread without spelunking commit history.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Joseph Doherty
2026-05-04 01:48:09 -04:00
parent c8f31bd653
commit ac3368993e
2 changed files with 118 additions and 0 deletions
@@ -0,0 +1,117 @@
# User-attribution investigation — pick-up notes
**Started:** 2026-05-03 · **Last updated:** 2026-05-04 · **Status:** open
## Goal
Get the Historian's `Events.User_Name` / `Events.User_Account` columns to reflect the **authenticated user** (`dohertj2`) on `Alarm.Acknowledged` rows when an ack is issued via `mxa write <alarm>.AckMsg`. Anonymous writes (no `--username`) should remain NULL.
## What's been tried
The result of interest in each run is the `User_Name`/`User_Account` value recorded on the `Alarm.Acknowledged` row.
| # | Galaxy security | Advise variant on ack | Write variant on ack | `auth_user_id` | `User_Name` | Notes |
| --- | --- | --- | --- | --- | --- | --- |
| 1 | `eNone` | `Advise` | `Write` | 1 | `DefaultUser` | Galaxy in Free-Access mode mapped every action onto `DefaultUser`. Even bad password / unknown user resolved to `userId=1`. |
| 2 | `eNone` | `Advise` (explicit, after wiring `AdviseSupervisory` for anonymous) | `Write` | 1 | `DefaultUser` | Advise-variant routing landed but galaxy still mapped to `DefaultUser`. |
| 3 | `eOSUserBased` (no role) | `Advise` | `Write` | 1 | NULL | Mode flip removed the `DefaultUser` over-attribution. `dohertj2` not yet attributed because user wasn't in any role. |
| 4 | `eOSUserBased` + `ArchestraUsers` role | `Advise` | `Write` | 1 | NULL | After role assignment. `Auth` proven to validate (bad password throws `ArgumentException` — fixed in `MxSession.Authenticate` to surface as `auth_user_id=0`). Still no `User_Name`. |
| 5 | same | `Advise` | `Write` (with `--client <hostname>`) | 1 | NULL | `User_NodeName` flowed from `--client` (now `DESKTOP-6JL3KKO`); confirms `--client` is just a workstation tag, not a security primitive. |
| 6 | same | `Advise` | **`WriteSecured`** (`--secured`, single-user) | 1 | NULL | `WriteSecured(currentUserId=verifierUserId=1, value)` succeeded with `MxCategoryOk` but did **not** change attribution. Conclusion: User_Name population is not gated on Write vs WriteSecured. |
## Current state of the code
- `mxa write` accepts `--username/--domain/--password` → `AuthenticateUser` → userId for `Write` / `WriteSecured`.
- `mxa write` accepts `--secured` (single-user Secured) and `--verifier-username/--verifier-domain/--verifier-password` (two-user Verified Write).
- Advise variant routes by user: `Advise` when authenticated, `AdviseSupervisory` when anonymous. See [`docs/usage.md`](docs/usage.md#advise-variant--operator-vs-supervisory).
- `MxSession.Authenticate` catches `ArgumentException` from the proxy and normalizes to `userId=0` → CLI emits clean `authentication-failed` envelope, exit 1.
## Current state of the galaxy (`ZB`)
| Setting | Value |
| --- | --- |
| `MxSecurityModelType` | `eOSUserBased` |
| `IGalaxyUsers.Count` | 0 (read-only collection; doesn't reflect runtime logins via MxAccess) |
| `IGalaxyRoles.Count` | 0 (same) |
| `IGalaxyGroups.Count` | 0 (same) |
| Configured users (per IDE) | `DESKTOP-6JL3KKO\dohertj2` assigned to `ArchestraUsers` |
| `DevPlatform`, `DevAppEngine` | **deployed before** the security mode change |
## Leading hypothesis (not yet tested)
**The running aaEngine processes are still operating under the original `eNone` security context** because galaxy security is compiled into the platform/engine deployment. New auth rules don't take effect for already-running engines until they are redeployed (or `aaBootstrap` is restarted).
This is consistent with:
- `auth_user_id` returning `1` on every successful authentication regardless of mode/user — the engine's runtime still treats authenticated calls as the legacy default operator.
- `User_Name=NULL` rather than `dohertj2` — the engine has no way to map our session's userId onto its (stale) user table.
- `IGalaxyUsers` not populating from MxAccess logins — that's the design (population requires IDE-driven login activity), but it also means we can't verify enrollment from the CLI side.
## Next step — pick up here
1. **Undeploy + redeploy `DevPlatform` and below**, then rerun the alarm sequence. Driving this through `graccesscli`:
```powershell
$CLI="c:/Users/dohertj2/Desktop/wwtools/graccesscli/src/ZB.MOM.WW.GRAccess.Cli/bin/x86/Debug/net48/ZB.MOM.WW.GRAccess.Cli.exe"
# Local node name (avoid $HOST: it is a read-only automatic variable in PowerShell)
$NODE=$env:COMPUTERNAME
# Undeploy leaf-first (objects -> areas -> engine -> platform)
& $CLI objects undeploy --galaxy ZB --node $NODE --names TestMachine_001 # cascades, see graccesscli docs
& $CLI objects undeploy --galaxy ZB --node $NODE --names DevAppEngine
& $CLI objects undeploy --galaxy ZB --node $NODE --names DevPlatform
# Redeploy in the reverse order (platform -> engine -> area -> object)
& $CLI objects deploy --galaxy ZB --node $NODE --names DevPlatform
& $CLI objects deploy --galaxy ZB --node $NODE --names DevAppEngine
& $CLI objects deploy --galaxy ZB --node $NODE --names TestArea
& $CLI objects deploy --galaxy ZB --node $NODE --names TestMachine_001
```
*(Confirm exact command surface via `graccesscli objects deploy --help` before running — undeploying a running engine drops live-data feeds.)*
2. **Re-run the alarm-attribution test** (the same sequence used in run #6):
```powershell
$MXA="c:/Users/dohertj2/Desktop/wwtools/mxaccesscli/src/MxAccess.Cli/bin/x86/Release/net48/mxa.exe"
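# Local node name (avoid $HOST: it is a read-only automatic variable in PowerShell)
$NODE=$env:COMPUTERNAME
# 1) Trigger the alarm anonymously
& $MXA write TestMachine_001.TestAlarm002 true --type bool --client $NODE -t 10
# 2) Acknowledge as dohertj2 via single-user WriteSecured
& $MXA write TestMachine_001.TestAlarm002.AckMsg "post-redeploy ack" `
    --username dohertj2 --domain $NODE --password Sonamu89 `
    --secured --client $NODE -t 12 --llm-json
# 3) Clear the alarm
& $MXA write TestMachine_001.TestAlarm002 false --type bool --client $NODE -t 10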
```
3. **Query the Historian** (template):
```sql
DECLARE @start datetime2 = DATEADD(minute, -3, GETDATE());
DECLARE @end datetime2 = GETDATE();
SELECT EventTime, Type, Alarm_State, Comment,
User_Name, User_Account, User_NodeName,
LEFT(CAST(Alarm_ID AS varchar(40)), 8) AS id_short
FROM Events
WHERE Source_ConditionVariable = 'TestMachine_001.TestAlarm002'
AND EventTime BETWEEN @start AND @end
ORDER BY EventTime ASC;
```
4. **Compare** — does `User_Name = dohertj2` (or `User_Account = DESKTOP-6JL3KKO\dohertj2`) on the `Alarm.Acknowledged` row?
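
The comparison in step 4 can be reduced to one labelled row per acknowledgement. This is a minimal sketch, assuming the same `Events` columns as the step 3 template and that the ack row carries `Type = 'Alarm.Acknowledged'`. The expected values come from this note; the `attribution` column and its labels are made up for illustration.

```sql
-- Sketch only: classify each recent ack row for the test alarm.
DECLARE @start datetime2 = DATEADD(minute, -3, GETDATE());

SELECT EventTime, User_Name, User_Account, User_NodeName,
       CASE
           WHEN User_Name = 'dohertj2'
             OR User_Account = 'DESKTOP-6JL3KKO\dohertj2' THEN 'attributed'
           WHEN User_Name IS NULL AND User_Account IS NULL THEN 'still NULL'
           ELSE 'attributed to someone else'   -- e.g. DefaultUser, as in runs #1 and #2
       END AS attribution                      -- hypothetical label column
FROM Events
WHERE Source_ConditionVariable = 'TestMachine_001.TestAlarm002'
  AND Type = 'Alarm.Acknowledged'
  AND EventTime >= @start
ORDER BY EventTime ASC;
```

A result of `still NULL` after a clean redeploy would point toward the fallbacks rather than the stale-engine hypothesis.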
## Fallbacks if redeploy doesn't change anything
- **Restart `aaBootstrap`.** Heavier-handed but forces the entire ArchestrA stack to reload galaxy config. (Will interrupt the test alarm cycle on `TestAlarm001`.)
- **Try a Verified Write** (two-person), supplying both `--username dohertj2` and `--verifier-username dohertj2` (or two different OS users). The audit subsystem may only fully attribute Verified Writes.
- **Check the attribute's security classification.** `TestAlarm002.AckMsg` may be `Free Access`, in which case the engine doesn't need to verify the user and may discard the identity. Set the security classification to `Operate` (or stricter) on the template, redeploy, retest. Per `aot/dev-guide/appendix-e-security-classifications.md`, only stricter classifications enforce user identity in the audit.
- **Sample a different attribute** that we know other tools (Object Viewer, InTouch) record `User_Name` on. If those clients also produce NULL, the issue is system-wide (galaxy still misconfigured); if they record a user, the issue is specific to the MxAccess Write path.
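
Before sampling a different attribute by hand, it may help to survey whether any recent `Events` rows carry a user identity at all. This is a rough sketch, assuming the `Events` view also holds rows produced by other clients; the 7-day window is an arbitrary choice, not something established in this investigation.

```sql
-- Sketch only: which event types and nodes have recorded a User_Name recently?
DECLARE @since datetime2 = DATEADD(day, -7, GETDATE());  -- assumed window

SELECT Type, User_NodeName,
       COUNT(*)         AS total_rows,
       COUNT(User_Name) AS rows_with_user_name   -- COUNT(column) skips NULLs
FROM Events
WHERE EventTime >= @since
GROUP BY Type, User_NodeName
ORDER BY rows_with_user_name DESC, total_rows DESC;
```

If `rows_with_user_name` is zero everywhere, that would be consistent with a system-wide cause. Non-zero counts for other nodes or event types would narrow the problem to the MxAccess Write path.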
## Files touched in this investigation
- [`src/MxAccess.Cli/Mx/MxSession.cs`](src/MxAccess.Cli/Mx/MxSession.cs) — `Authenticate` catches `ArgumentException`.
- [`src/MxAccess.Cli/Mx/MxItem.cs`](src/MxAccess.Cli/Mx/MxItem.cs) — `WriteSecured` method.
- [`src/MxAccess.Cli/Commands/WriteCommand.cs`](src/MxAccess.Cli/Commands/WriteCommand.cs) — `--username/--domain/--password`, advise routing, `--secured/--verifier-*`.
- [`docs/usage.md`](docs/usage.md) — *Authentication*, *Advise variant*, *userId is session-scoped* sections.
## Galaxy state
Alarm `TestMachine_001.TestAlarm002` is **cleared** (`ACK_RTN`) at the close of run #6. No lingering active alarms. Other test attributes restored to their original values.
@@ -39,6 +39,7 @@ mxaccesscli/
| Agent rules for editing this CLI | [`AGENTS.md`](AGENTS.md) |
| Run the CLI / option reference / examples | [`docs/usage.md`](docs/usage.md) |
| MxAccess API surface, threading model, MxStatus semantics | [`docs/api-notes.md`](docs/api-notes.md) |
| Open investigation: getting `User_Name` populated on Historian alarm rows | [`2026-05-03-user-attribution-investigation.md`](2026-05-03-user-attribution-investigation.md) |
| Find a writeable tag in the live galaxy (so smoke tests have a target) | [`../grdb/README.md`](../grdb/README.md) |
| Read tag values via SQL retrieval (an alternative path) | [`../histdb/README.md`](../histdb/README.md) |