wwtools/mxaccesscli/2026-05-03-user-attribution-investigation.md
Joseph Doherty ac3368993e mxaccesscli: capture user-attribution investigation as a pick-up note
Adds a dated incident-style file at the tool root recording the full
arc of the User_Name attribution work over the last two days:

- Six runs tabulated (galaxy mode x advise variant x write variant x
  resulting User_Name) so the next agent can see what's already been
  ruled out.
- Current state of the CLI (auth, advise routing, WriteSecured) and
  the galaxy (eOSUserBased, ArchestraUsers role, engines deployed
  before the security change).
- Leading hypothesis: running aaEngine processes still operate under
  the original eNone security context because galaxy security is
  compiled at deploy time. Until the platform/engines are
  redeployed, auth_user_id stays at 1 and User_Name stays NULL.
- Concrete pick-up commands: undeploy/redeploy DevPlatform ->
  DevAppEngine -> TestArea -> TestMachine_001 via graccesscli, then
  re-run the trigger / ack-as-dohertj2 / clear sequence and query
  Events.
- Fallbacks if a clean redeploy doesn't change the answer
  (aaBootstrap restart, two-person Verified Write, attribute
  security classification check, cross-tool comparison vs Object
  Viewer / InTouch).

Linked from README.md's resource index so an agent landing on the
tool finds the open thread without spelunking commit history.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 01:48:09 -04:00


User-attribution investigation — pick-up notes

Started: 2026-05-03 · Last updated: 2026-05-04 · Status: open

Goal

Get the Historian's Events.User_Name / Events.User_Account columns to reflect the authenticated user (dohertj2) on Alarm.Acknowledged rows when an ack is issued via mxa write <alarm>.AckMsg. Anonymous writes (no --username) should remain NULL.

What's been tried

Each row's Alarm.Acknowledged User_Name/User_Account field is the result we cared about.

| # | Galaxy security | Advise variant on ack | Write variant on ack | auth_user_id | User_Name | Notes |
|---|---|---|---|---|---|---|
| 1 | eNone | Advise | Write | 1 | DefaultUser | Galaxy in Free-Access mode mapped every action onto DefaultUser. Even bad password / unknown user resolved to userId=1. |
| 2 | eNone | Advise (explicit, after wiring AdviseSupervisory for anonymous) | Write | 1 | DefaultUser | Advise-variant routing landed but galaxy still mapped to DefaultUser. |
| 3 | eOSUserBased (no role) | Advise | Write | 1 | NULL | Mode flip removed the DefaultUser over-attribution. dohertj2 not yet attributed because user wasn't in any role. |
| 4 | eOSUserBased + ArchestraUsers role | Advise | Write | 1 | NULL | After role assignment. Auth proven to validate (bad password throws ArgumentException; fixed in MxSession.Authenticate to surface as auth_user_id=0). Still no User_Name. |
| 5 | same | Advise | Write (with --client <hostname>) | 1 | NULL | User_NodeName flowed from --client (now DESKTOP-6JL3KKO); confirms --client is just a workstation tag, not a security primitive. |
| 6 | same | Advise | WriteSecured (--secured, single-user) | 1 | NULL | WriteSecured(currentUserId=verifierUserId=1, value) succeeded with MxCategoryOk but did not change attribution. Conclusion: User_Name population is not gated on Write vs WriteSecured. |

Current state of the code

  • mxa write accepts --username/--domain/--password → AuthenticateUser → userId for Write / WriteSecured.
  • mxa write accepts --secured (single-user Secured) and --verifier-username/--verifier-domain/--verifier-password (two-user Verified Write).
  • Advise variant routes by user: Advise when authenticated, AdviseSupervisory when anonymous. See docs/usage.md.
  • MxSession.Authenticate catches ArgumentException from the proxy and normalizes to userId=0 → CLI emits clean authentication-failed envelope, exit 1.
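
The routing in the bullets above can be sketched in PowerShell. This is only an illustration: the real logic lives in the CLI's .NET code, and the function name Resolve-MxaSessionPlan, the -Authenticate script block (a stand-in for MxSession.Authenticate), and the shape of the returned object are all hypothetical. It also assumes an anonymous session carries no userId, which these notes do not state.

```powershell
# Hypothetical sketch, not the CLI's actual code.
function Resolve-MxaSessionPlan {
    param(
        [string] $Username,
        # Stand-in for MxSession.Authenticate: takes a username,
        # returns a userId, or throws on bad credentials.
        [scriptblock] $Authenticate
    )

    # Anonymous (no --username): skip authentication, use AdviseSupervisory.
    if ([string]::IsNullOrEmpty($Username)) {
        return [pscustomobject]@{ UserId = $null; Advise = 'AdviseSupervisory'; AuthFailed = $false }
    }

    try {
        $userId = & $Authenticate $Username
    } catch [System.ArgumentException] {
        # The proxy throws ArgumentException on bad credentials; normalize to userId=0.
        $userId = 0
    }

    if ($userId -eq 0) {
        # The CLI reports this case as authentication-failed and exits 1.
        return [pscustomobject]@{ UserId = 0; Advise = $null; AuthFailed = $true }
    }

    # Authenticated: plain Advise, with the userId passed on to Write / WriteSecured.
    [pscustomobject]@{ UserId = $userId; Advise = 'Advise'; AuthFailed = $false }
}
```

Called with an empty username, the sketch returns the AdviseSupervisory plan without invoking the script block. Called with a script block that throws ArgumentException, it returns UserId 0 with AuthFailed set.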

Current state of the galaxy (ZB)

| Setting | Value |
|---|---|
| MxSecurityModelType | eOSUserBased |
| IGalaxyUsers.Count | 0 (read-only collection; doesn't reflect runtime logins via MxAccess) |
| IGalaxyRoles.Count | 0 (same) |
| IGalaxyGroups.Count | 0 (same) |
| Configured users (per IDE) | DESKTOP-6JL3KKO\dohertj2 assigned to ArchestraUsers |
| DevPlatform, DevAppEngine | deployed before the security mode change |

Leading hypothesis (not yet tested)

The running aaEngine processes are still operating under the original eNone security context because galaxy security is compiled into the platform/engine deployment. New auth rules don't take effect for already-running engines until they are redeployed (or aaBootstrap is restarted).

This is consistent with:

  • auth_user_id returning 1 on every successful authentication regardless of mode/user — the engine's runtime still treats authenticated calls as the legacy default operator.
  • User_Name=NULL rather than dohertj2 — the engine has no way to map our session's userId onto its (stale) user table.
  • IGalaxyUsers not populating from MxAccess logins — that's the design (population requires IDE-driven login activity), but it also means we can't verify enrollment from the CLI side.

Next step — pick up here

  1. Undeploy + redeploy DevPlatform and below, then rerun the alarm sequence. Driving this through graccesscli:

    $CLI="c:/Users/dohertj2/Desktop/wwtools/graccesscli/src/ZB.MOM.WW.GRAccess.Cli/bin/x86/Debug/net48/ZB.MOM.WW.GRAccess.Cli.exe"
    # $Host is a read-only automatic variable in PowerShell, so the node name goes in $NODE.
    $NODE=$env:COMPUTERNAME
    
    # Undeploy leaf-first (objects -> areas -> engine -> platform)
    & $CLI objects undeploy --galaxy ZB --node $NODE --names TestMachine_001  # cascades, see graccesscli docs
    & $CLI objects undeploy --galaxy ZB --node $NODE --names DevAppEngine
    & $CLI objects undeploy --galaxy ZB --node $NODE --names DevPlatform
    
    # Redeploy root-first (platform -> engine -> area -> object)
    & $CLI objects deploy --galaxy ZB --node $NODE --names DevPlatform
    & $CLI objects deploy --galaxy ZB --node $NODE --names DevAppEngine
    & $CLI objects deploy --galaxy ZB --node $NODE --names TestArea
    & $CLI objects deploy --galaxy ZB --node $NODE --names TestMachine_001
    

    (Confirm exact command surface via graccesscli objects deploy --help before running — undeploying a running engine drops live-data feeds.)

  2. Re-run the alarm-attribution test (the same sequence used in run #6):

    $MXA="c:/Users/dohertj2/Desktop/wwtools/mxaccesscli/src/MxAccess.Cli/bin/x86/Release/net48/mxa.exe"
    # $Host is a read-only automatic variable in PowerShell, so the node name goes in $NODE.
    $NODE=$env:COMPUTERNAME
    
    # Trigger the alarm, ack it as dohertj2 (single-user secured write), then clear it.
    & $MXA write TestMachine_001.TestAlarm002 true --type bool --client $NODE -t 10
    & $MXA write TestMachine_001.TestAlarm002.AckMsg "post-redeploy ack" `
        --username dohertj2 --domain $NODE --password Sonamu89 `
        --secured --client $NODE -t 12 --llm-json
    & $MXA write TestMachine_001.TestAlarm002 false --type bool --client $NODE -t 10
    
  3. Query the Historian (template):

    DECLARE @start datetime2 = DATEADD(minute, -3, GETDATE());
    DECLARE @end   datetime2 = GETDATE();
    SELECT EventTime, Type, Alarm_State, Comment,
           User_Name, User_Account, User_NodeName,
           LEFT(CAST(Alarm_ID AS varchar(40)), 8) AS id_short
    FROM Events
    WHERE Source_ConditionVariable = 'TestMachine_001.TestAlarm002'
      AND EventTime BETWEEN @start AND @end
    ORDER BY EventTime ASC;
    
  4. Compare — does User_Name = dohertj2 (or User_Account = DESKTOP-6JL3KKO\dohertj2) on the Alarm.Acknowledged row?
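
The comparison in step 4 can be sketched as a small PowerShell helper that labels each Alarm.Acknowledged row. The function name Get-AckAttribution and the verdict labels are hypothetical. The sketch assumes the query results are already loaded as objects with Type, User_Name and User_Account properties, and that NULL columns arrive as $null or [DBNull].

```powershell
# Hypothetical helper (not part of mxa or graccesscli): classify the attribution
# on each Alarm.Acknowledged row returned by the Historian query.
function Get-AckAttribution {
    param(
        [Parameter(Mandatory)] [AllowEmptyCollection()] [object[]] $Rows,
        [string] $ExpectedUser = 'dohertj2'
    )

    foreach ($row in $Rows | Where-Object { $_.Type -eq 'Alarm.Acknowledged' }) {
        # SQL clients usually surface NULL as [DBNull]; normalize it to $null.
        $name    = if ($row.User_Name    -is [DBNull]) { $null } else { $row.User_Name }
        $account = if ($row.User_Account -is [DBNull]) { $null } else { $row.User_Account }

        $verdict = if ([string]::IsNullOrEmpty($name) -and [string]::IsNullOrEmpty($account)) {
            'unattributed'   # the NULL outcome seen in runs 3 to 6
        } elseif ($name -eq $ExpectedUser -or $account -like "*\$ExpectedUser") {
            'attributed'     # the result this investigation is after
        } else {
            'other-user'     # e.g. DefaultUser, as in runs 1 and 2
        }

        [pscustomobject]@{
            User_Name    = $name
            User_Account = $account
            Verdict      = $verdict
        }
    }
}
```

One way to get the rows is to run the step 3 query with Invoke-Sqlcmd from the SqlServer module. The server instance and database are left out here because these notes do not name them.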

Fallbacks if redeploy doesn't change anything

  • Restart aaBootstrap. Heavier-handed but forces the entire ArchestrA stack to reload galaxy config. (Will interrupt the test alarm cycle on TestAlarm001.)
  • Try a Verified Write (two-person), supplying both --username dohertj2 and --verifier-username dohertj2 (or two different OS users). The audit subsystem may only fully attribute Verified Writes.
  • Check the attribute's security classification. TestAlarm002.AckMsg may be Free Access, in which case the engine doesn't need to verify the user and may discard the identity. Set the security classification to Operate (or stricter) on the template, redeploy, retest. Per aot/dev-guide/appendix-e-security-classifications.md, only stricter classifications enforce user identity in the audit.
  • Sample a different attribute that we know other tools (Object Viewer, InTouch) record User_Name on. If those clients also produce NULL, the issue is system-wide (galaxy still misconfigured); if they record a user, the issue is specific to the MxAccess Write path.

Files touched in this investigation

Galaxy state

Alarm TestMachine_001.TestAlarm002 is cleared (ACK_RTN) at the close of run #6. No lingering active alarms. Other test attributes restored to their original values.