Harden code-review tooling and align REVIEW-PROCESS.md with mxaccessgw

- regen-readme.py: use `python` not the broken `python3` Store alias in
  the generated note and docstring; --check now also fails when a module
  header's "Open findings" count disagrees with finding statuses or a
  finding has an unrecognised Status (find_inconsistencies)
- REVIEW-PROCESS.md: rewritten for mxaccessgw (was describing ScadaLink)
  — MxGateway.* modules, "mxaccessgw conventions" checklist category,
  gateway.md/docs/ design context, `python` command
- scripts/check-code-reviews-readme.ps1: CI/pre-commit wrapper for
  regen-readme.py --check
- code-reviews/test_regen_readme.py: dependency-free parser tests
- code-reviews/README.md: regenerated

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
Joseph Doherty
2026-05-18 16:36:25 -04:00
parent ae164ea34f
commit 3cc53a8c69
5 changed files with 308 additions and 66 deletions
+84 -59
View File
@@ -1,67 +1,84 @@
# Code Review Process
This document describes how to perform a comprehensive, per-module code review of
the ScadaLink codebase and how to track findings to resolution.
the `mxaccessgw` codebase and how to track findings to resolution.
A **module** is one buildable project under `src/` (e.g. `src/ScadaLink.TemplateEngine`).
Each module has its own folder under `code-reviews/` containing a single `findings.md`.
A **module** is one buildable project under `src/` (e.g. `src/MxGateway.Worker`).
Each module has its own folder under `code-reviews/` containing a single
`findings.md`.
## 1. Before you start
1. Pick the module to review. Its folder is `code-reviews/<Module>/` where `<Module>`
is the project name with the `ScadaLink.` prefix stripped.
1. Pick the module to review. Its folder is `code-reviews/<Module>/` where
`<Module>` is the project name with the `MxGateway.` prefix stripped — so
`src/MxGateway.Server` is reviewed in `code-reviews/Server/`.
2. Identify the design context for the module:
- Its component design doc: `docs/requirements/Component-<Name>.md`.
- The relevant **Key Design Decisions** in `CLAUDE.md`.
- `docs/requirements/HighLevelReqs.md` for cross-cutting requirements.
3. Record the exact commit being reviewed: `git rev-parse --short HEAD`. Every review
is a snapshot — a finding only means something relative to a known commit.
- `gateway.md` — top-level architecture, command/event surface, IPC envelope,
STA thread model, fault handling.
- The relevant component design docs under `docs/` (e.g.
`docs/MxAccessWorkerInstanceDesign.md`, `docs/GatewayProcessDesign.md`,
`docs/Sessions.md`, `docs/Authentication.md`, `docs/GalaxyRepository.md`).
- `docs/DesignDecisions.md` for the v1 design choices.
- The **Repository-Specific Conventions** and **Process / Platform Notes** in
`CLAUDE.md`.
3. Record the exact commit being reviewed: `git rev-parse --short HEAD`. Every
review is a snapshot — a finding only means something relative to a known
commit.
4. Open `code-reviews/<Module>/findings.md` and fill in the header table
(reviewer, date, commit SHA).
(reviewer, date, commit SHA, status).
## 2. Review checklist
Work through **every** category below for the module. A comprehensive review means
the checklist is completed even where it produces no findings — record "No issues
found" for a category rather than leaving it ambiguous.
Work through **every** category below for the module. A comprehensive review
means the checklist is completed even where it produces no findings — record
"No issues found" for a category rather than leaving it ambiguous.
1. **Correctness & logic bugs** — off-by-one, null handling, incorrect conditionals,
misuse of APIs, broken edge cases.
2. **Akka.NET conventions**supervision strategies (Resume for coordinators, Stop
for short-lived actors), `Tell` for hot paths / `Ask` only at system boundaries,
message immutability, no blocking on non-blocking dispatchers, no `sender`/`this`
captured in closures (`PipeTo` instead), correlation IDs on request/response.
3. **Concurrency & thread safety** — shared mutable state, actor state mutated only
on the actor thread, race conditions, correct use of async/await.
4. **Error handling & resilience** — exception paths, store-and-forward integration,
reconnect/retry logic, failover behaviour, transient vs permanent error
classification, graceful degradation.
5. **Security** — authentication/authorization checks, input validation, the script
trust model (forbidden APIs: `System.IO`, `Process`, `Threading`, `Reflection`,
raw network), secret handling, SQL/LDAP injection, logging of sensitive data.
6. **Performance & resource management**`IDisposable` disposal, stream/connection
lifetimes, buffering and back-pressure, unnecessary allocations, N+1 queries.
7. **Design-document adherence** — does the code match `Component-<Name>.md` and the
relevant CLAUDE.md decisions? Flag both code that drifts from the design and design
docs that are now stale.
8. **Code organization & conventions** — persistence-ignorant POCO entities in
Commons, repository interfaces in Commons / implementations in ConfigurationDatabase,
namespace hierarchy, Options pattern (options classes owned by component projects),
additive-only message contract evolution.
9. **Testing coverage** — are the module's behaviours covered by tests in `tests/`?
Note untested critical paths and missing edge-case tests.
1. **Correctness & logic bugs** — off-by-one, null handling, incorrect
conditionals, misuse of APIs, broken edge cases.
2. **mxaccessgw conventions**the rules in `CLAUDE.md` and the style guides
under `docs/style-guides/`: the gateway never instantiates MXAccess COM
directly; all MXAccess COM calls run on the worker's dedicated STA thread and
the STA loop pumps Windows messages; IPC uses one bidirectional named pipe per
worker carrying length-prefixed `WorkerEnvelope` protobuf frames; MXAccess
parity is the contract (don't "fix" surprising MXAccess behaviour, never
synthesize events); one worker and one event subscriber per session; the
gateway terminates orphan workers on startup and does not reattach; C# style
(file-scoped namespaces, `sealed` by default, `Async` suffix, MXAccess-aligned
names); no Blazor UI component libraries; no logging of secrets or full tag
values; generated code is never hand-edited.
3. **Concurrency & thread safety** — shared mutable state, STA affinity, race
conditions, correct use of `async`/`await`, locking, disposal races.
4. **Error handling & resilience** — exception paths, worker crash / reconnect
handling, fail-fast event backpressure, transient vs permanent error
classification, graceful degradation, correct gRPC status codes.
5. **Security** — authentication/authorization checks, API-key scope enforcement,
input validation, SQL injection in the Galaxy Repository RPCs, secret
handling, the dashboard anonymous-localhost bypass, logging of sensitive data.
6. **Performance & resource management**`IDisposable` disposal, pipe / stream
/ COM lifetimes, buffering and back-pressure, unnecessary allocations on hot
paths, N+1 queries.
7. **Design-document adherence** — does the code match `gateway.md`, the relevant
`docs/` component designs, `docs/DesignDecisions.md`, and `CLAUDE.md`? Flag
both code that drifts from the design and design docs that are now stale.
8. **Code organization & conventions** — namespace hierarchy, project layout, the
Options pattern, separation of concerns, additive-only contract evolution.
9. **Testing coverage** — are the module's behaviours covered by tests
(`src/MxGateway.Tests`, `src/MxGateway.Worker.Tests`,
`src/MxGateway.IntegrationTests`)? Note untested critical paths and missing
edge-case tests.
10. **Documentation & comments** — XML doc accuracy, misleading or stale comments,
undocumented non-obvious behaviour.
## 3. Recording findings
Add one entry per finding to the `## Findings` section of the module's `findings.md`,
using the entry format in [`_template/findings.md`](_template/findings.md).
Add one entry per finding to the `## Findings` section of the module's
`findings.md`, using the entry format in
[`_template/findings.md`](code-reviews/_template/findings.md).
- **Finding ID** — `<Module>-NNN`, numbered sequentially within the module and never
reused (e.g. `TemplateEngine-001`). IDs are permanent even after resolution.
- **Finding ID** — `<Module>-NNN`, numbered sequentially within the module and
never reused (e.g. `Worker-001`). IDs are permanent even after resolution.
- **Severity:**
- **Critical** — data loss, security breach, crash/deadlock, or cluster-wide outage.
- **Critical** — data loss, security breach, crash/deadlock, or outage.
- **High** — incorrect behaviour with significant impact; no safe workaround.
- **Medium** — incorrect or risky behaviour with limited impact or a workaround.
- **Low** — minor issues, style, maintainability, documentation.
@@ -70,44 +87,52 @@ using the entry format in [`_template/findings.md`](_template/findings.md).
- **Description** — what is wrong and why it matters.
- **Recommendation** — concrete suggested fix.
After recording findings, update the module header table (status, open-finding count)
and refresh the base README (step 5).
After recording findings, update the module header table (status, open-finding
count) and regenerate the base README (step 5).
## 4. Marking an item resolved
Findings are **never deleted** — they are an audit trail. To close one, change its
**Status** and complete the **Resolution** field:
Findings are **never deleted** — they are an audit trail. To close one, change
its **Status** and complete the **Resolution** field:
- `Open` — newly recorded, not yet addressed.
- `In Progress` — a fix is actively being worked on.
- `Resolved` — fixed. The Resolution field must state the fixing commit SHA, the
date, and a one-line description of the fix.
- `Won't Fix` — intentionally not fixed. The Resolution field must justify why.
- `Deferred` — valid but postponed. The Resolution field must say what it is waiting
on (e.g. a tracked issue or a later milestone).
- `Deferred` — valid but postponed. The Resolution field must say what it is
waiting on (e.g. a tracked issue or a later milestone).
`Resolved`, `Won't Fix`, and `Deferred` findings are all considered **closed** and
drop off the base README's pending list. `Open` and `In Progress` are **pending**.
`Resolved`, `Won't Fix`, and `Deferred` findings are all considered **closed**.
`Open` and `In Progress` are **pending** and appear in the base README's Pending
Findings table.
## 5. Updating the base README
`code-reviews/README.md` holds the single cross-module view (process overview, the
Pending Findings tables, and the Module Status table). It is **generated** from the
`code-reviews/README.md` holds the single cross-module view (the Module Status
table and the Pending / Closed Findings tables). It is **generated** from the
per-module `findings.md` files — do not edit it by hand.
After any review or status change, regenerate it:
```
python3 code-reviews/regen-readme.py
python code-reviews/regen-readme.py
```
`regen-readme.py --check` exits non-zero if `README.md` is stale, for use in CI.
`regen-readme.py --check` exits non-zero if `README.md` is stale, if a module
header's `Open findings` count disagrees with its finding statuses, or if a
finding carries an unrecognised Status value. The PowerShell wrapper
`scripts/check-code-reviews-readme.ps1` runs that check and is the intended hook
for CI or a pre-commit step.
> The repo's installed `python` is the real interpreter; the bare `python3`
> alias resolves to the Windows Store stub and fails. Use `python`.
The per-module `findings.md` files are the source of truth; `README.md` is the
aggregated index and must always agree with them — which the script guarantees.
## 6. Re-reviewing a module
Re-reviews append to the same `findings.md`. Update the header to the new commit and
date, continue the finding numbering from the last used ID, and leave prior findings
(including closed ones) in place as history.
Re-reviews append to the same `findings.md`. Update the header to the new commit
and date, continue the finding numbering from the last used ID, and leave prior
findings (including closed ones) in place as history.
+1 -1
View File
@@ -1,6 +1,6 @@
# Code Reviews
<!-- GENERATED FILE - do not edit by hand. Regenerate with: python3 code-reviews/regen-readme.py -->
<!-- GENERATED FILE - do not edit by hand. Regenerate with: python code-reviews/regen-readme.py -->
Cross-module code review index for the `mxaccessgw` codebase. The review process is defined in [../REVIEW-PROCESS.md](../REVIEW-PROCESS.md).
+45 -6
View File
@@ -6,8 +6,12 @@ them into the single cross-module README.md (module status + pending/closed
finding tables).
Usage:
python3 code-reviews/regen-readme.py # rewrite README.md
python3 code-reviews/regen-readme.py --check # exit 1 if README.md is stale
python code-reviews/regen-readme.py # rewrite README.md
python code-reviews/regen-readme.py --check # exit 1 if stale or inconsistent
`--check` fails when README.md is out of date OR when a module's header
`Open findings` count disagrees with its finding statuses, or a finding
carries an unrecognised Status value.
"""
from __future__ import annotations
@@ -19,11 +23,12 @@ ROOT = Path(__file__).resolve().parent
README = ROOT / "README.md"
PENDING_STATUSES = {"Open", "In Progress"}
KNOWN_STATUSES = {"Open", "In Progress", "Resolved", "Won't Fix", "Deferred"}
SEVERITY_ORDER = {"Critical": 0, "High": 1, "Medium": 2, "Low": 3}
GENERATED_NOTE = (
"<!-- GENERATED FILE - do not edit by hand. "
"Regenerate with: python3 code-reviews/regen-readme.py -->"
"Regenerate with: python code-reviews/regen-readme.py -->"
)
@@ -169,6 +174,32 @@ def build_readme(modules: list[dict]) -> str:
return "\n".join(out) + "\n"
def find_inconsistencies(modules: list[dict]) -> list[str]:
"""Return human-readable problems in the per-module findings.md files.
Checks that each module header's `Open findings` count agrees with its
finding statuses, and that every finding carries a known Status value.
"""
issues: list[str] = []
for m in modules:
open_n = sum(
1 for f in m["findings"] if f["status"] in PENDING_STATUSES
)
declared = m["header"].get("Open findings", "").strip()
if declared != str(open_n):
issues.append(
f"{m['module']}: header 'Open findings' = '{declared}' but "
f"{open_n} finding(s) are Open/In Progress"
)
for f in m["findings"]:
if f["status"] not in KNOWN_STATUSES:
issues.append(
f"{m['module']}: finding {f['id']} has unrecognised "
f"Status '{f['status']}'"
)
return issues
def main(argv: list[str]) -> int:
check = "--check" in argv[1:]
module_dirs = sorted(
@@ -178,16 +209,24 @@ def main(argv: list[str]) -> int:
)
modules = [parse_module(d / "findings.md") for d in module_dirs]
content = build_readme(modules)
issues = find_inconsistencies(modules)
if check:
current = README.read_text(encoding="utf-8") if README.exists() else ""
if current != content:
stale = (
README.read_text(encoding="utf-8") if README.exists() else ""
) != content
for issue in issues:
print(f"inconsistent: {issue}", file=sys.stderr)
if stale:
print(
"code-reviews/README.md is stale - run regen-readme.py",
file=sys.stderr,
)
if stale or issues:
return 1
print("code-reviews/README.md is up to date.")
print("code-reviews/README.md is up to date and consistent.")
return 0
for issue in issues:
print(f"warning: {issue}", file=sys.stderr)
README.write_text(content, encoding="utf-8", newline="\n")
print(f"Wrote {README} ({len(modules)} modules).")
return 0
+158
View File
@@ -0,0 +1,158 @@
#!/usr/bin/env python3
"""Tests for regen-readme.py.
Dependency-free: run with `python code-reviews/test_regen_readme.py`.
Exits 0 if all tests pass, 1 otherwise.
"""
from __future__ import annotations
import importlib.util
import tempfile
import traceback
from pathlib import Path
HERE = Path(__file__).resolve().parent
# regen-readme.py is not an importable module name (hyphen), so load it by path.
_spec = importlib.util.spec_from_file_location("regen_readme", HERE / "regen-readme.py")
regen = importlib.util.module_from_spec(_spec)
_spec.loader.exec_module(regen)
FIXTURE = """# Code Review — Demo
| Field | Value |
|---|---|
| Module | `src/Demo` |
| Reviewer | Tester |
| Review date | 2026-05-18 |
| Commit reviewed | `abc1234` |
| Status | Reviewed |
| Open findings | 1 |
## Findings
### Demo-001
| Field | Value |
|---|---|
| Severity | High |
| Category | Security |
| Location | `src/Demo/File.cs:10` |
| Status | Open |
**Description:** A first problem that matters.
**Recommendation:** Fix it.
**Resolution:** _(open)_
### Demo-002
| Field | Value |
|---|---|
| Severity | Low |
| Category | Documentation & comments |
| Location | `src/Demo/File.cs:20` |
| Status | Resolved |
**Description:** A second, minor problem.
**Recommendation:** Tidy it.
**Resolution:** Fixed in def5678 on 2026-05-18.
"""
def _parse_fixture() -> dict:
"""Write FIXTURE to a temp Demo/findings.md and parse it."""
with tempfile.TemporaryDirectory() as tmp:
path = Path(tmp) / "Demo" / "findings.md"
path.parent.mkdir()
path.write_text(FIXTURE, encoding="utf-8")
return regen.parse_module(path)
def test_first_table_skips_separator_and_field_header():
table = regen.first_table("| Field | Value |\n|---|---|\n| Severity | High |\n")
assert table == {"Severity": "High"}, table
def test_parse_module_header():
m = _parse_fixture()
assert m["module"] == "Demo", m["module"]
assert m["header"]["Reviewer"] == "Tester"
assert m["header"]["Status"] == "Reviewed"
assert m["header"]["Open findings"] == "1"
def test_parse_module_findings():
m = _parse_fixture()
assert len(m["findings"]) == 2, len(m["findings"])
first = m["findings"][0]
assert first["id"] == "Demo-001"
assert first["severity"] == "High"
assert first["category"] == "Security"
assert first["location"] == "`src/Demo/File.cs:10`"
assert first["status"] == "Open"
assert first["description"] == "A first problem that matters."
assert m["findings"][1]["status"] == "Resolved"
def test_build_readme_splits_pending_and_closed():
readme = regen.build_readme([_parse_fixture()])
assert "## Pending findings" in readme
assert "## Closed findings" in readme
pending, closed = readme.split("## Closed findings", 1)
assert "Demo-001" in pending # Open -> pending
assert "Demo-001" not in closed
assert "Demo-002" in closed # Resolved -> closed
assert "_No pending findings._" not in pending
def test_find_inconsistencies_clean_fixture():
assert regen.find_inconsistencies([_parse_fixture()]) == []
def test_find_inconsistencies_detects_wrong_open_count():
m = _parse_fixture()
m["header"]["Open findings"] = "7"
issues = regen.find_inconsistencies([m])
assert len(issues) == 1 and "Open findings" in issues[0], issues
def test_find_inconsistencies_detects_unknown_status():
m = _parse_fixture()
m["findings"][0]["status"] = "Bogus"
issues = regen.find_inconsistencies([m])
# Wrong status also shifts the open count, so expect the status issue present.
assert any("unrecognised Status" in i for i in issues), issues
def test_summarize_truncates_long_text():
long = "x" * 500
out = regen.summarize(long)
assert len(out) <= 240 and out.endswith(""), len(out)
assert regen.summarize("short") == "short"
def main() -> int:
tests = sorted(
(name, fn)
for name, fn in globals().items()
if name.startswith("test_") and callable(fn)
)
failed = 0
for name, fn in tests:
try:
fn()
print(f"PASS {name}")
except Exception: # noqa: BLE001 - test runner reports all failures
failed += 1
print(f"FAIL {name}")
traceback.print_exc()
print(f"\n{len(tests) - failed}/{len(tests)} passed.")
return 1 if failed else 0
if __name__ == "__main__":
raise SystemExit(main())
+20
View File
@@ -0,0 +1,20 @@
# Verifies code-reviews/README.md is regenerated from, and consistent with, the
# per-module findings.md files. Intended as a CI / pre-commit gate.
#
# Exits non-zero when README.md is stale, when a module header's "Open findings"
# count disagrees with its finding statuses, or when a finding carries an
# unrecognised Status value. See REVIEW-PROCESS.md section 5.
[CmdletBinding()]
param()
Set-StrictMode -Version Latest
$ErrorActionPreference = "Stop"
$repoRoot = Resolve-Path (Join-Path $PSScriptRoot "..")
$script = Join-Path $repoRoot "code-reviews/regen-readme.py"
# The bare `python3` alias on this platform resolves to the Windows Store stub;
# `python` is the real interpreter.
& python $script --check
exit $LASTEXITCODE