Documentation Audit Implementation Plan

For Claude: REQUIRED SUB-SKILL: Use superpowers-extended-cc:executing-plans to implement this plan task-by-task.

Goal: Audit and fix the 32 live reference docs in place so they are accurate against today's source and complete (every shipped feature documented).

Architecture: Approach C — a deterministic Phase 0 baseline (a re-runnable link/path checker + a code-first feature inventory) feeds grouped vertical passes (G1 server-core, G2 drivers, G3 security/operational, G4 client+CLI), each applying all four audit dimensions per doc, then a Phase 2 reconciliation of the shared index/root docs plus a final corpus-wide gate.

Tech Stack: Markdown docs; a small Python 3 checker script; the OtOpcUa .NET 10 source tree as the ground truth for cross-checking.

Design: docs/plans/2026-06-03-documentation-audit-design.md (read it for the decisions; they are settled).


Method note (read once)

This is a documentation deliverable — there is no xUnit suite to make red→green. The plan therefore adapts the TDD step shape: each task identifies findings → applies fixes → verifies with the Phase-0 gate (scoped) → commits. The executable verification is the structural checker (Task 1) plus per-task acceptance criteria. Do not invent unit tests for prose.

Hard rules (apply to EVERY task)

  1. Scope: edit ONLY the 32 in-scope files. Never edit out-of-scope tiers (docs/v1, docs/v2, docs/plans except this plan/design, docs/reqs, docs/v3, looseends.md). If an in-scope doc links into an out-of-scope tier and the target moved, fix the link in the live doc — never the historical artifact.
  2. Direction: docs change to match the code, never the reverse. If the code itself looks wrong, append a one-line entry to .docs-audit/code-bug-flags.md — do NOT change code.
  3. Evidence: every code-reality correction must be verified against a real source location; record file:line in the commit body or .docs-audit/notes.md. No fixes from memory or assumption.
  4. Git safety: stage files explicitly by path. NEVER git add . / git add -A. Never stage sql_login.txt, src/Server/ZB.MOM.WW.OtOpcUa.Host/pki/, or the .docs-audit/ scratch dir. Never echo the dev gateway API key into a tracked file. No force-push, no --no-verify.
  5. Branch: all work on docs/documentation-audit (already checked out).
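
Rule 4 lends itself to a mechanical check before each commit. The following is a minimal sketch, not part of the plan's tooling: the helper name `forbidden_staged` is hypothetical, and it assumes the staged paths are repo-relative, as `git diff --cached --name-only` prints them.

```python
import posixpath

# Paths that rule 4 says must never be staged (taken from the rule text above).
FORBIDDEN_FILES = {"sql_login.txt"}
FORBIDDEN_DIRS = (
    "src/Server/ZB.MOM.WW.OtOpcUa.Host/pki/",
    ".docs-audit/",
)

def forbidden_staged(staged_paths):
    """Return the staged paths that violate the git-safety rule.

    Hypothetical helper: feed it repo-relative paths, one per entry.
    """
    bad = []
    for path in staged_paths:
        norm = path.replace("\\", "/")  # tolerate Windows-style separators
        if posixpath.basename(norm) in FORBIDDEN_FILES or norm.startswith(FORBIDDEN_DIRS):
            bad.append(path)
    return bad
```

An empty return value means the staging area is safe to commit; anything else should be unstaged by explicit path first.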

Shared procedures (referenced by tasks as "Procedure P / C / Gate")

Gate — structural checker

python3 .docs-audit/check_links.py > .docs-audit/links-report.md 2>.docs-audit/links-summary.txt; cat .docs-audit/links-summary.txt

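The script exits 0 only when there are zero issues. Because the command line ends with cat, the shell's final exit status is cat's, so judge the result by the summary line (0 issue(s)) or capture the script's status before the cat. The report is tab-separated: file <TAB> kind <TAB> tag <TAB> raw-target <TAB> case-hint.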
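
Since the report is plain tab-separated text, the slice belonging to one doc can be pulled out with a few lines of Python. This is a sketch under the column layout stated above; `Issue`, `parse_report`, and `issues_for` are illustrative names, not part of check_links.py.

```python
from typing import NamedTuple

class Issue(NamedTuple):
    file: str   # doc path relative to the repo root
    kind: str   # "link" or "path"
    tag: str    # "MISSING" or "CASE-MISMATCH"
    raw: str    # target exactly as written in the doc
    hint: str   # on-disk casing when tag is CASE-MISMATCH, else ""

def parse_report(text):
    """Parse links-report.md content into Issue records."""
    issues = []
    for line in text.splitlines():
        if not line.strip():
            continue
        cols = line.split("\t")
        cols += [""] * (5 - len(cols))  # pad if a trailing empty hint was trimmed
        issues.append(Issue(*cols[:5]))
    return issues

def issues_for(text, doc):
    """Return only the issues attributed to one doc (hypothetical helper)."""
    return [issue for issue in parse_report(text) if issue.file == doc]
```

An empty list from `issues_for` is the "zero issues attributable to this doc" condition that the per-doc verification relies on.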

Procedure P — per-doc audit (apply all four dimensions to one doc)

  1. Read the whole doc.
  2. Structural — for each entry for this doc in .docs-audit/links-report.md: repair the broken link / repoint the dead src|tests|scripts|docs/... path to its verified current location / fix the case mismatch (use the case-hint column). Confirm every new target exists on disk.
  3. Stale-status — scan for state words (blocked, pending, not yet, planned, TODO, TBD, as of <date>, will, coming). For each, verify against source + git log + known facts (v2 feature-complete; native alarms verified working 2026-05-31). Rewrite to present-tense truth or delete if obsolete.
  4. Code-reality cross-check — for every technical claim (namespace, class, file, appsettings key, env var, CLI verb/flag, described behavior), open the cited source and verify. Fix the doc to match; record file:line evidence. Flag genuine code bugs to .docs-audit/code-bug-flags.md.
  5. Inline completeness — from this doc's slice of .docs-audit/inventory-diff.md, add small missing items that belong in an existing section (a missing config key, an undocumented flag, a one-paragraph gap). Whole-new-page gaps are deferred to the group completeness task (Procedure C).
  6. Verify — run the Gate; confirm zero issues attributable to this doc; eyeball that tables/code-fences/lists still render.
  7. Commit this one doc by explicit path: git add <doc> && git commit -m "docs(audit): <doc> — accuracy + completeness pass".
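
The stale-status scan in step 3 of Procedure P can be bootstrapped with a small script. The sketch below only surfaces candidate lines; each hit still has to be verified against source as the step requires. The word list comes from step 3, "as of <date>" is matched loosely as "as of", and the name `stale_candidates` is hypothetical.

```python
import re

# State words listed in Procedure P step 3.
STATE_WORDS = ["blocked", "pending", "not yet", "planned", "TODO", "TBD",
               "as of", "will", "coming"]
STATE_RE = re.compile(
    r"\b(" + "|".join(re.escape(w) for w in STATE_WORDS) + r")\b",
    re.IGNORECASE,
)

def stale_candidates(text):
    """Return (line_number, matched_word, line) for each line with a state word.

    Only the first match per line is reported; the auditor reads the whole line anyway.
    """
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        match = STATE_RE.search(line)
        if match:
            hits.append((lineno, match.group(1).lower(), line.strip()))
    return hits
```

Word boundaries keep "Goodwill" from matching "will", but ordinary uses such as "the server will retry" still show up, so expect false positives.
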

Procedure C: group completeness pass (document features with no coverage)

  1. Take this group's domain slice of .docs-audit/inventory-diff.md (features with no doc coverage at all).
  2. For each, write the documentation: a new page under the appropriate dir, or a new section in the most relevant existing in-scope doc (judgment — prefer extending an existing doc over a thin new page).
  3. Group-local index only: G2 may update docs/drivers/README.md. Do not touch docs/README.md (top-level index) here — append each new top-level page to .docs-audit/new-pages.md for Task 26 (G5) to link in one place, avoiding cross-group collisions on the shared index.
  4. Run the Gate; commit new/edited files by explicit path.

Phase 0 — deterministic baseline + code-first inventory

Task 1: Structural checker script + initial run

Classification: small · Estimated implement time: ~5 min · Parallelizable with: Task 2

Files:

  • Create: .docs-audit/check_links.py (untracked scratch — never committed)
  • Create (untracked): .docs-audit/links-report.md, .docs-audit/links-summary.txt

Step 1: Ensure scratch dir is ignored. If .docs-audit/ is not already covered by .gitignore, add the line .docs-audit/ to .gitignore and commit that one-line change (git add .gitignore && git commit -m "chore: ignore .docs-audit scratch dir"). This is the only non-doc file the plan commits.
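
Step 1 should be idempotent: append the entry only when it is missing. The sketch below uses a literal line comparison, which is an assumption; real .gitignore matching is pattern-based, so `git check-ignore -q .docs-audit/` is the more reliable test where git is available. `ensure_ignored` is a hypothetical name.

```python
def ensure_ignored(gitignore_text, entry=".docs-audit/"):
    """Return (new_text, changed) so the caller knows whether a commit is needed."""
    lines = [ln.strip() for ln in gitignore_text.splitlines()]
    bare = entry.rstrip("/")
    # Spellings that already cover the scratch dir (literal comparison only).
    equivalents = {entry, bare, "/" + entry, "/" + bare}
    if any(ln in equivalents for ln in lines):
        return gitignore_text, False
    # Make sure the new entry starts on its own line.
    sep = "" if gitignore_text == "" or gitignore_text.endswith("\n") else "\n"
    return gitignore_text + sep + entry + "\n", True
```

When `changed` is false, skip the .gitignore commit entirely.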

Step 2: Write .docs-audit/check_links.py:

#!/usr/bin/env python3
"""Structural link/path checker for the documentation audit (Phase 0 + final gate).
Scans the 32 in-scope live-reference docs, resolves every markdown link and inline
src|tests|scripts|docs path against the filesystem, and reports MISSING / CASE-MISMATCH."""
import os, re, sys, glob

REPO = os.path.abspath(os.path.join(os.path.dirname(__file__), ".."))

def in_scope():
    files  = sorted(glob.glob(os.path.join(REPO, "docs", "*.md")))
    files += sorted(glob.glob(os.path.join(REPO, "docs", "drivers", "*.md")))
    files += [os.path.join(REPO, "README.md"), os.path.join(REPO, "CLAUDE.md")]
    return [f for f in files if os.path.isfile(f)]

LINK_RE = re.compile(r"\[[^\]]*\]\(([^)]+)\)")
# The lookbehind skips matches embedded in a longer token (URL tails, ../ relative links).
PATH_RE = re.compile(r"(?<![\w./-])((?:src|tests|scripts|docs)/[A-Za-z0-9_./-]+)")

def case_insensitive_hint(path):
    d, name = os.path.split(path)
    if not os.path.isdir(d):
        return None
    for entry in os.listdir(d):
        if entry.lower() == name.lower():
            return os.path.join(d, entry)
    return None

def check(f):
    base = os.path.dirname(f)
    with open(f, encoding="utf-8") as fh:  # close the handle deterministically
        text = fh.read()
    out = []
    targets = [("link", m.group(1)) for m in LINK_RE.finditer(text)]
    targets += [("path", m.group(1)) for m in PATH_RE.finditer(text)]
    for kind, raw in targets:
        t = raw.split("#")[0].strip()
        if not t or re.match(r"^[a-z]+://", t) or t.startswith("mailto:"):
            continue
        if kind == "link":
            cand = os.path.normpath(os.path.join(base, t))
        else:
            cand = os.path.normpath(os.path.join(REPO, t.rstrip("./")))
        if os.path.exists(cand):
            continue
        hint = case_insensitive_hint(cand)
        tag = "CASE-MISMATCH" if hint else "MISSING"
        out.append((os.path.relpath(f, REPO), kind, tag, raw,
                    os.path.relpath(hint, REPO) if hint else ""))
    return out

def main():
    docs = in_scope()
    issues = [row for f in docs for row in check(f)]
    for rel, kind, tag, raw, hint in issues:
        print(f"{rel}\t{kind}\t{tag}\t{raw}\t{hint}")
    print(f"{len(issues)} issue(s) across {len(docs)} docs", file=sys.stderr)
    sys.exit(1 if issues else 0)

if __name__ == "__main__":
    main()

Step 3: Run it (Gate). Expected on first run: a non-empty report (at minimum, the docs/Security.md case mismatch in CLAUDE.md and the AlarmTracking.md orphan situation surface here). Confirm the script runs without a Python traceback and the count printed to stderr matches the report line count.
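
The count comparison in Step 3 can be automated. This is a minimal sketch that assumes the summary line keeps the "N issue(s) across M docs" shape printed by the script; `summary_matches_report` is a hypothetical helper.

```python
import re

def summary_matches_report(report_text, summary_text):
    """True when the stderr issue count equals the number of report rows."""
    match = re.search(r"(\d+) issue\(s\) across (\d+) docs", summary_text)
    if match is None:
        # No summary line: the script most likely died with a traceback.
        return False
    rows = [ln for ln in report_text.splitlines() if ln.strip()]
    return int(match.group(1)) == len(rows)
```

A false result with a well-formed summary usually means the report file was edited or truncated between runs.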

Step 4: Do NOT commit the script or reports (they are under the now-ignored .docs-audit/). Only the .gitignore line from Step 1 is committed.

Acceptance: check_links.py runs clean (no traceback), emits a tab-separated report, exits non-zero while issues remain. This same command is the per-task and final gate.


Task 2: Code-first feature inventory + coverage diff

Classification: standard · Estimated implement time: ~5 min (broad enumeration; split into sub-runs if needed) · Parallelizable with: Task 1

Files:

  • Create (untracked): .docs-audit/inventory.md, .docs-audit/inventory-diff.md

Step 1: Enumerate the shipped surface from source into .docs-audit/inventory.md, grouped by domain so Procedure C can slice it:

  • Drivers (G2 domain) — every family under src/Drivers/ (AbCip, AbLegacy, FOCAS, Galaxy, Historian.Wonderware, Modbus, OpcUaClient, S7, TwinCAT). For each, note the driver class + which capability interfaces it implements.
  • Capabilities (G1 domain) — the interfaces in src/Core/ZB.MOM.WW.OtOpcUa.Core.Abstractions/ (IReadable, IWritable, ITagDiscovery, ISubscribable, IAlarmSource, IHistoryProvider, IHostConnectivityProbe, IPerCallHostResolver, plus IDriver*, IAddressSpaceBuilder, IRediscoverable).
  • Config surface (G3 domain) — top-level sections across src/Server/ZB.MOM.WW.OtOpcUa.Host/appsettings*.json and their bound Options classes (e.g. Security, Authentication.Ldap, Redundancy, MxAccess). List documented env vars (OTOPCUA_ROLES, …).
  • Security profiles (G3 domain) — the exact profile strings SecurityProfileResolver resolves (grep src/Server/ZB.MOM.WW.OtOpcUa.Security/).
  • CLI surface (G4 domain) — command verbs + options from the System.CommandLine definitions in src/Client/ZB.MOM.WW.OtOpcUa.Client.CLI/ and each driver CLI under src/Drivers/Cli/.

Step 2: Compute the coverage diff into .docs-audit/inventory-diff.md. For each inventory item, grep the 32 in-scope docs for its primary token; mark COVERED / PARTIAL / MISSING. Helper:

grep -RIl --include='*.md' "<token>" docs/*.md docs/drivers/*.md README.md CLAUDE.md

Keep only PARTIAL/MISSING rows in the diff, tagged with the owning domain (G1-G4). This is the completeness worklist consumed by Procedure P step 5 (small/partial) and Procedure C (missing whole pages).
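
The grep helper amounts to a substring search of each token across the in-scope docs. The sketch below automates the COVERED/MISSING split under two assumptions: tokens are matched as fixed strings (like `grep -F`, case-sensitive), and PARTIAL is assigned by hand afterwards, because the plan gives no mechanical rule for it. `coverage` and `diff_rows` are hypothetical names.

```python
def coverage(inventory, docs):
    """Classify each inventory item by whether any in-scope doc mentions its token.

    inventory: list of (domain, token) pairs, e.g. ("G3", "OTOPCUA_ROLES")
    docs:      mapping of doc path -> full text of that doc
    """
    result = []
    for domain, token in inventory:
        hits = sorted(path for path, text in docs.items() if token in text)
        status = "COVERED" if hits else "MISSING"
        result.append((domain, token, status, hits))
    return result

def diff_rows(result):
    """Keep only the rows that still need work (everything not COVERED)."""
    return [row for row in result if row[2] != "COVERED"]
```

A COVERED row only proves the token appears somewhere; downgrading it to PARTIAL after reading the hit is still the auditor's call.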

Step 3: No commit (scratch only).

Acceptance: inventory.md lists every shipped driver/capability/config-section/security-profile/CLI-verb with a source location; inventory-diff.md enumerates the gaps tagged by domain. A spot-check of 3 random inventory rows resolves to real source.


Phase 1 — grouped vertical passes

All Phase 1 tasks are blockedBy Task 1 and Task 2. Every per-doc accuracy task edits only its own doc(s) → all are mutually parallelizable (disjoint files). Each group's completeness task (Procedure C) is blockedBy that group's accuracy tasks.

G1 — Server core & data path

Task 3: OpcUaServer.md

Classification: standard · ~5 min · Parallelizable with: all other Phase-1 accuracy tasks (Tasks 4-7, 9-13, 15-18, 20-24)

Files: Modify docs/OpcUaServer.md

Apply Procedure P. Doc-specific focus: Core/driver-dispatch/Config-DB/generations claims vs src/Core + src/Server; verify CapabilityInvoker, GenericDriverNodeManager, generation-diff references resolve.

Task 4: AddressSpace.md

Classification: standard · ~5 min · Parallelizable with: Tasks 3, 5-7, 9-13, 15-18, 20-24

Files: Modify docs/AddressSpace.md

Apply Procedure P. Focus: GenericDriverNodeManager, ITagDiscovery, IAddressSpaceBuilder, DataTypeMap.cs path.

Task 5: ReadWriteOperations.md + IncrementalSync.md

Classification: small · ~5 min · Parallelizable with: Tasks 3, 4, 6, 7, 9-13, 15-18, 20-24

Files: Modify docs/ReadWriteOperations.md, docs/IncrementalSync.md

Apply Procedure P to each. Focus: CapabilityInvoker → IReadable/IWritable; sp_ComputeGenerationDiff + rebuild-on-redeploy.

Task 6: VirtualTags.md + ScriptedAlarms.md

Classification: small · ~5 min · Parallelizable with: Tasks 3-5, 7, 9-13, 15-18, 20-24

Files: Modify docs/VirtualTags.md, docs/ScriptedAlarms.md

Apply Procedure P to each. Focus: Core.Scripting/Core.VirtualTags/Core.ScriptedAlarms (Roslyn sandbox, Part 9 state machine). Cross-check against the named Core projects.

Task 7: AlarmTracking.md (orphan resolution)

Classification: small · ~4 min · Parallelizable with: Tasks 3-6, 9-13, 15-18, 20-24

Files: Modify docs/AlarmTracking.md (and/or decide retirement)

Known finding: the README index links to docs/v1/AlarmTracking.md, not this top-level file → it is likely orphaned. Apply Procedure P, then decide: (a) if it duplicates the v1 archive, replace its body with a short current-state pointer to the live alarm story (native alarms work end-to-end) + the v1 archive link; or (b) if it carries unique current content, keep & fix it and ensure Task 26 links it from docs/README.md. Record the decision in the commit body. Do not delete the file without noting why.

Task 8: G1 completeness

Classification: standard · ~5 min · Parallelizable with: other groups' completeness tasks (14, 19, 25) · blockedBy: Tasks 3,4,5,6,7

Files: Create/Modify server-core docs as needed; append new top-level pages to .docs-audit/new-pages.md

Apply Procedure C for the G1 (capabilities/server-core) slice of inventory-diff.md. Likely candidates: any capability interface or Core subsystem (e.g. Core.AlarmHistorian) with no live-doc home.

G2 — Drivers

Task 9: docs/drivers/README.md (index + capability matrix)

Classification: standard · ~5 min · Parallelizable with: Tasks 3-7, 10-13, 15-18, 20-24

Files: Modify docs/drivers/README.md

Apply Procedure P. Focus: the eight-driver count + capability matrix vs the actual src/Drivers/ families and the interfaces each implements (from inventory.md). Correct the matrix to match reality.

Task 10: docs/drivers/Galaxy.md

Classification: standard · ~5 min · Parallelizable with: Tasks 3-7, 9, 11-13, 15-18, 20-24

Files: Modify docs/drivers/Galaxy.md

Apply Procedure P. Focus: in-process gRPC client → mxaccessgw sidecar; GalaxyDriver, IGalaxyHierarchySource, DeployWatcher, contained-name↔tag-name translation vs src/Drivers/ZB.MOM.WW.OtOpcUa.Driver.Galaxy/.

Task 11: drivers/FOCAS.md + FOCAS-Test-Fixture.md

Classification: small · ~5 min · Parallelizable with: Tasks 3-7, 9, 10, 12, 13, 15-18, 20-24

Files: Modify docs/drivers/FOCAS.md, docs/drivers/FOCAS-Test-Fixture.md

Apply Procedure P to each vs src/Drivers/ZB.MOM.WW.OtOpcUa.Driver.FOCAS.

Task 12: Modbus + AbServer + AbLegacy test-fixture docs

Classification: small · ~5 min · Parallelizable with: Tasks 3-7, 9-11, 13, 15-18, 20-24

Files: Modify docs/drivers/Modbus-Test-Fixture.md, docs/drivers/AbServer-Test-Fixture.md, docs/drivers/AbLegacy-Test-Fixture.md

Apply Procedure P to each. Focus: docker-host endpoints (10.100.0.35), fixture compose paths, lmxopcua labels vs tests/.../Docker/ + CLAUDE.md Docker section.

Task 13: S7 + TwinCAT + OpcUaClient test-fixture docs

Classification: small · ~5 min · Parallelizable with: Tasks 3-7, 9-12, 15-18, 20-24

Files: Modify docs/drivers/S7-Test-Fixture.md, docs/drivers/TwinCAT-Test-Fixture.md, docs/drivers/OpcUaClient-Test-Fixture.md

Apply Procedure P to each (same fixture/endpoint focus as Task 12).

Task 14: G2 completeness & drivers index

Classification: standard · ~5 min · Parallelizable with: Tasks 8,19,25 · blockedBy: Tasks 9,10,11,12,13

Files: Create new docs/drivers/*.md as needed; Modify docs/drivers/README.md (group-local index)

Apply Procedure C for the G2 (drivers) slice. Likely candidates: any src/Drivers/ family lacking a dedicated doc (e.g. AbCip/AbLegacy/S7/TwinCAT/Modbus/OpcUaClient have CLI docs + fixtures but may lack a driver-overview page like Galaxy/FOCAS). Link any new page from docs/drivers/README.md. Top-level links → .docs-audit/new-pages.md.

G3 — Security & operational

Task 15: security.md

Classification: standard · ~5 min · Parallelizable with: Tasks 3-7, 9-13, 16-18, 20-24

Files: Modify docs/security.md

Apply Procedure P. Focus: transport-security profile strings (vs SecurityProfileResolver), LDAP auth + group→role mapping, ACL trie, role grants, the OTOPCUA0001 analyzer. This is the highest-value accuracy doc: verify every profile/role/config-key against source.

Task 16: Redundancy.md

Classification: standard · ~5 min · Parallelizable with: Tasks 3-7, 9-13, 15, 17, 18, 20-24

Files: Modify docs/Redundancy.md

Apply Procedure P. Focus: RedundancyCoordinator, ServiceLevelCalculator, apply-lease, RedundancySupport/ServerUriArray/ServiceLevel, Prometheus metrics vs src/Server/ZB.MOM.WW.OtOpcUa.ControlPlane/Runtime.

Task 17: ServiceHosting.md

Classification: small · ~5 min · Parallelizable with: Tasks 3-7, 9-13, 15, 16, 18, 20-24

Files: Modify docs/ServiceHosting.md

Apply Procedure P. Focus: single fused OtOpcUa.Host binary, OTOPCUA_ROLES gating (admin/driver/both), AddWindowsService, the optional Wonderware Historian sidecar vs src/Server/ZB.MOM.WW.OtOpcUa.Host.

Task 18: Reservations.md + StatusDashboard.md (stub resolution)

Classification: small · ~5 min · Parallelizable with: Tasks 3-7, 9-13, 15-17, 20-24

Files: Modify docs/Reservations.md, docs/StatusDashboard.md

Apply Procedure P to Reservations.md (ZTag/SAPID external-ID reservations, publish-time claim/release). StatusDashboard.md is a known stub pointer (superseded by v2/admin-ui.md, which is out of scope): verify the pointer target still exists and the supersession statement is accurate; keep it a clean pointer (do not expand). If v2/admin-ui.md moved, fix the link only.

Task 19: G3 completeness

Classification: standard · ~4 min · Parallelizable with: Tasks 8,14,25 · blockedBy: Tasks 15,16,17,18

Files: Create/Modify security/operational docs as needed; append top-level pages to .docs-audit/new-pages.md

Apply Procedure C for the G3 (config/security/operational) slice: any appsettings section, security profile, or operational subsystem with no live-doc coverage.

G4 — Client & CLI tooling

Task 20: Client.CLI.md

Classification: standard · ~5 min · Parallelizable with: Tasks 3-7, 9-13, 15-18, 21-24

Files: Modify docs/Client.CLI.md

Apply Procedure P. Focus: otopcua-cli verbs/flags (connect/read/write/browse/subscribe/historyread/alarms/redundancy) vs the System.CommandLine defs in src/Client/ZB.MOM.WW.OtOpcUa.Client.CLI/. Every documented command/flag must exist; every shipped command must be documented.

Task 21: Client.UI.md

Classification: small · ~4 min · Parallelizable with: Tasks 3-7, 9-13, 15-18, 20, 22-24

Files: Modify docs/Client.UI.md

Apply Procedure P vs src/Client/ZB.MOM.WW.OtOpcUa.Client.UI (Avalonia desktop client).

Task 22: DriverClis.md (index + shared commands)

Classification: standard · ~5 min · Parallelizable with: Tasks 3-7, 9-13, 15-18, 20, 21, 23, 24

Files: Modify docs/DriverClis.md

Apply Procedure P. Focus: the index must list exactly the driver CLIs that ship under src/Drivers/Cli/; shared command set matches the common base.

Task 23: Driver.Modbus/AbCip/AbLegacy CLI docs

Classification: small · ~5 min · Parallelizable with: Tasks 3-7, 9-13, 15-18, 20-22, 24

Files: Modify docs/Driver.Modbus.Cli.md, docs/Driver.AbCip.Cli.md, docs/Driver.AbLegacy.Cli.md

Apply Procedure P to each vs the matching CLI project under src/Drivers/Cli/. Verify verbs/flags + the documented device families.

Task 24: Driver.S7/TwinCAT/FOCAS CLI docs

Classification: small · ~5 min · Parallelizable with: Tasks 3-7, 9-13, 15-18, 20-23

Files: Modify docs/Driver.S7.Cli.md, docs/Driver.TwinCAT.Cli.md, docs/Driver.FOCAS.Cli.md

Apply Procedure P to each vs the matching CLI project under src/Drivers/Cli/.

Task 25: G4 completeness

Classification: standard · ~4 min · Parallelizable with: Tasks 8,14,19 · blockedBy: Tasks 20,21,22,23,24

Files: Create/Modify client/CLI docs as needed; append top-level pages to .docs-audit/new-pages.md

Apply Procedure C for the G4 (client/CLI) slice: any CLI verb or client surface with no doc coverage.


Phase 2 — reconciliation & final gate

Task 26: G5 reconciliation — README index + CLAUDE.md

Classification: standard · ~5 min · Parallelizable with: none · blockedBy: Tasks 8,14,19,25

Files: Modify docs/README.md, CLAUDE.md

  1. README index integrity: every doc listed in docs/README.md exists and is described correctly; every new page recorded in .docs-audit/new-pages.md is added to the right table; resolve the AlarmTracking.md link per Task 7's decision; verify all "superseded by" pointers.
  2. CLAUDE.md reconciliation: fix the docs/security.md vs docs/Security.md case mismatch (canonical filename is lowercase security.md); verify the docs CLAUDE.md names as canonical exist; reconcile any retired-project / status notes against current reality.
  3. Run the Gate; commit both files by explicit path.

Acceptance: Gate attributes zero issues to README.md/CLAUDE.md; both security.md references use the on-disk casing; every new page is linked.

Task 27: Final gate + change summary

Classification: small · ~4 min · Parallelizable with: none · blockedBy: Task 26

Files: none committed (verification + reporting only)

  1. Structural gate (corpus-wide): run the Gate → exit 0, 0 issue(s). If any remain, they are unfixed findings — return to the owning doc's task, do not hand-wave.
  2. Completeness gate: re-run the Task-2 coverage diff → every inventory item is COVERED, or each remaining gap is listed in the summary with an explicit reason for exclusion (e.g. "out-of-scope tier owns it").
  3. Assemble the change summary (deliver in chat, do not commit): fixes grouped by dimension (structural / stale-status / code-reality / completeness), the list of new docs written, the contents of .docs-audit/code-bug-flags.md (code bugs flagged-not-fixed), and any deliberate completeness exclusions.

Acceptance: both gates green; change summary delivered.


Execution order & parallelism summary

  • Phase 0: Tasks 1 ∥ 2 (no deps).
  • Phase 1: after Phase 0, all accuracy tasks (3-7, 9-13, 15-18, 20-24) run in parallel (disjoint files). Each group's completeness task (8, 14, 19, 25) follows its group's accuracy tasks; the four completeness tasks are mutually parallel.
  • Phase 2: Task 26 after all completeness tasks; Task 27 after 26.
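
The ordering above can be checked mechanically by encoding the blockedBy relations and grouping the tasks into parallel waves. The sketch below uses the standard-library `graphlib` (Python 3.9+); the task numbers and dependencies are taken from this plan, while `BLOCKED_BY`, `task_range`, and `waves` are illustrative names.

```python
from graphlib import TopologicalSorter

def task_range(first, last):
    """Inclusive range of task numbers."""
    return list(range(first, last + 1))

# Accuracy tasks wait only on the Phase 0 tasks (1 and 2).
ACCURACY = task_range(3, 7) + task_range(9, 13) + task_range(15, 18) + task_range(20, 24)
BLOCKED_BY = {task: {1, 2} for task in ACCURACY}
BLOCKED_BY.update({
    8: set(task_range(3, 7)),     # G1 completeness
    14: set(task_range(9, 13)),   # G2 completeness
    19: set(task_range(15, 18)),  # G3 completeness
    25: set(task_range(20, 24)),  # G4 completeness
    26: {8, 14, 19, 25},          # reconciliation
    27: {26},                     # final gate
})

def waves(blocked_by):
    """Group tasks into waves; every task in a wave can run in parallel."""
    sorter = TopologicalSorter(blocked_by)
    sorter.prepare()  # raises CycleError if the plan ever contradicts itself
    result = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        result.append(ready)
        sorter.done(*ready)
    return result
```

Calling `waves(BLOCKED_BY)` yields five waves that line up with the bullets: Tasks 1 and 2, then the nineteen accuracy tasks, then the four completeness tasks, then Task 26, then Task 27.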