diff --git a/REVIEW-PROCESS.md b/REVIEW-PROCESS.md index 1578f87..bec630e 100644 --- a/REVIEW-PROCESS.md +++ b/REVIEW-PROCESS.md @@ -1,67 +1,84 @@ # Code Review Process This document describes how to perform a comprehensive, per-module code review of -the ScadaLink codebase and how to track findings to resolution. +the `mxaccessgw` codebase and how to track findings to resolution. -A **module** is one buildable project under `src/` (e.g. `src/ScadaLink.TemplateEngine`). -Each module has its own folder under `code-reviews/` containing a single `findings.md`. +A **module** is one buildable project under `src/` (e.g. `src/MxGateway.Worker`). +Each module has its own folder under `code-reviews/` containing a single +`findings.md`. ## 1. Before you start -1. Pick the module to review. Its folder is `code-reviews//` where `` - is the project name with the `ScadaLink.` prefix stripped. +1. Pick the module to review. Its folder is `code-reviews//` where + `` is the project name with the `MxGateway.` prefix stripped — so + `src/MxGateway.Server` is reviewed in `code-reviews/Server/`. 2. Identify the design context for the module: - - Its component design doc: `docs/requirements/Component-.md`. - - The relevant **Key Design Decisions** in `CLAUDE.md`. - - `docs/requirements/HighLevelReqs.md` for cross-cutting requirements. -3. Record the exact commit being reviewed: `git rev-parse --short HEAD`. Every review - is a snapshot — a finding only means something relative to a known commit. + - `gateway.md` — top-level architecture, command/event surface, IPC envelope, + STA thread model, fault handling. + - The relevant component design docs under `docs/` (e.g. + `docs/MxAccessWorkerInstanceDesign.md`, `docs/GatewayProcessDesign.md`, + `docs/Sessions.md`, `docs/Authentication.md`, `docs/GalaxyRepository.md`). + - `docs/DesignDecisions.md` for the v1 design choices. + - The **Repository-Specific Conventions** and **Process / Platform Notes** in + `CLAUDE.md`. +3. Record the exact commit being reviewed: `git rev-parse --short HEAD`. Every + review is a snapshot — a finding only means something relative to a known + commit. 4. Open `code-reviews//findings.md` and fill in the header table - (reviewer, date, commit SHA). + (reviewer, date, commit SHA, status). ## 2. Review checklist -Work through **every** category below for the module. A comprehensive review means -the checklist is completed even where it produces no findings — record "No issues -found" for a category rather than leaving it ambiguous. +Work through **every** category below for the module. A comprehensive review +means the checklist is completed even where it produces no findings — record +"No issues found" for a category rather than leaving it ambiguous. -1. **Correctness & logic bugs** — off-by-one, null handling, incorrect conditionals, - misuse of APIs, broken edge cases. -2. **Akka.NET conventions** — supervision strategies (Resume for coordinators, Stop - for short-lived actors), `Tell` for hot paths / `Ask` only at system boundaries, - message immutability, no blocking on non-blocking dispatchers, no `sender`/`this` - captured in closures (`PipeTo` instead), correlation IDs on request/response. -3. **Concurrency & thread safety** — shared mutable state, actor state mutated only - on the actor thread, race conditions, correct use of async/await. -4. **Error handling & resilience** — exception paths, store-and-forward integration, - reconnect/retry logic, failover behaviour, transient vs permanent error - classification, graceful degradation. -5. **Security** — authentication/authorization checks, input validation, the script - trust model (forbidden APIs: `System.IO`, `Process`, `Threading`, `Reflection`, - raw network), secret handling, SQL/LDAP injection, logging of sensitive data. -6. **Performance & resource management** — `IDisposable` disposal, stream/connection - lifetimes, buffering and back-pressure, unnecessary allocations, N+1 queries. -7. **Design-document adherence** — does the code match `Component-.md` and the - relevant CLAUDE.md decisions? Flag both code that drifts from the design and design - docs that are now stale. -8. **Code organization & conventions** — persistence-ignorant POCO entities in - Commons, repository interfaces in Commons / implementations in ConfigurationDatabase, - namespace hierarchy, Options pattern (options classes owned by component projects), - additive-only message contract evolution. -9. **Testing coverage** — are the module's behaviours covered by tests in `tests/`? - Note untested critical paths and missing edge-case tests. +1. **Correctness & logic bugs** — off-by-one, null handling, incorrect + conditionals, misuse of APIs, broken edge cases. +2. **mxaccessgw conventions** — the rules in `CLAUDE.md` and the style guides + under `docs/style-guides/`: the gateway never instantiates MXAccess COM + directly; all MXAccess COM calls run on the worker's dedicated STA thread and + the STA loop pumps Windows messages; IPC uses one bidirectional named pipe per + worker carrying length-prefixed `WorkerEnvelope` protobuf frames; MXAccess + parity is the contract (don't "fix" surprising MXAccess behaviour, never + synthesize events); one worker and one event subscriber per session; the + gateway terminates orphan workers on startup and does not reattach; C# style + (file-scoped namespaces, `sealed` by default, `Async` suffix, MXAccess-aligned + names); no Blazor UI component libraries; no logging of secrets or full tag + values; generated code is never hand-edited. +3. **Concurrency & thread safety** — shared mutable state, STA affinity, race + conditions, correct use of `async`/`await`, locking, disposal races. +4. **Error handling & resilience** — exception paths, worker crash / reconnect + handling, fail-fast event backpressure, transient vs permanent error + classification, graceful degradation, correct gRPC status codes. +5. **Security** — authentication/authorization checks, API-key scope enforcement, + input validation, SQL injection in the Galaxy Repository RPCs, secret + handling, the dashboard anonymous-localhost bypass, logging of sensitive data. +6. **Performance & resource management** — `IDisposable` disposal, pipe / stream + / COM lifetimes, buffering and back-pressure, unnecessary allocations on hot + paths, N+1 queries. +7. **Design-document adherence** — does the code match `gateway.md`, the relevant + `docs/` component designs, `docs/DesignDecisions.md`, and `CLAUDE.md`? Flag + both code that drifts from the design and design docs that are now stale. +8. **Code organization & conventions** — namespace hierarchy, project layout, the + Options pattern, separation of concerns, additive-only contract evolution. +9. **Testing coverage** — are the module's behaviours covered by tests + (`src/MxGateway.Tests`, `src/MxGateway.Worker.Tests`, + `src/MxGateway.IntegrationTests`)? Note untested critical paths and missing + edge-case tests. 10. **Documentation & comments** — XML doc accuracy, misleading or stale comments, undocumented non-obvious behaviour. ## 3. Recording findings -Add one entry per finding to the `## Findings` section of the module's `findings.md`, -using the entry format in [`_template/findings.md`](_template/findings.md). +Add one entry per finding to the `## Findings` section of the module's +`findings.md`, using the entry format in +[`_template/findings.md`](code-reviews/_template/findings.md). -- **Finding ID** — `-NNN`, numbered sequentially within the module and never - reused (e.g. `TemplateEngine-001`). IDs are permanent even after resolution. +- **Finding ID** — `-NNN`, numbered sequentially within the module and + never reused (e.g. `Worker-001`). IDs are permanent even after resolution. - **Severity:** - - **Critical** — data loss, security breach, crash/deadlock, or cluster-wide outage. + - **Critical** — data loss, security breach, crash/deadlock, or outage. - **High** — incorrect behaviour with significant impact; no safe workaround. - **Medium** — incorrect or risky behaviour with limited impact or a workaround. - **Low** — minor issues, style, maintainability, documentation. @@ -70,44 +87,52 @@ using the entry format in [`_template/findings.md`](_template/findings.md). - **Description** — what is wrong and why it matters. - **Recommendation** — concrete suggested fix. -After recording findings, update the module header table (status, open-finding count) -and refresh the base README (step 5). +After recording findings, update the module header table (status, open-finding +count) and regenerate the base README (step 5). ## 4. Marking an item resolved -Findings are **never deleted** — they are an audit trail. To close one, change its -**Status** and complete the **Resolution** field: +Findings are **never deleted** — they are an audit trail. To close one, change +its **Status** and complete the **Resolution** field: - `Open` — newly recorded, not yet addressed. - `In Progress` — a fix is actively being worked on. - `Resolved` — fixed. The Resolution field must state the fixing commit SHA, the date, and a one-line description of the fix. - `Won't Fix` — intentionally not fixed. The Resolution field must justify why. -- `Deferred` — valid but postponed. The Resolution field must say what it is waiting - on (e.g. a tracked issue or a later milestone). +- `Deferred` — valid but postponed. The Resolution field must say what it is + waiting on (e.g. a tracked issue or a later milestone). -`Resolved`, `Won't Fix`, and `Deferred` findings are all considered **closed** and -drop off the base README's pending list. `Open` and `In Progress` are **pending**. +`Resolved`, `Won't Fix`, and `Deferred` findings are all considered **closed**. +`Open` and `In Progress` are **pending** and appear in the base README's Pending +Findings table. ## 5. Updating the base README -`code-reviews/README.md` holds the single cross-module view (process overview, the -Pending Findings tables, and the Module Status table). It is **generated** from the +`code-reviews/README.md` holds the single cross-module view (the Module Status +table and the Pending / Closed Findings tables). It is **generated** from the per-module `findings.md` files — do not edit it by hand. After any review or status change, regenerate it: ``` -python3 code-reviews/regen-readme.py +python code-reviews/regen-readme.py ``` -`regen-readme.py --check` exits non-zero if `README.md` is stale, for use in CI. +`regen-readme.py --check` exits non-zero if `README.md` is stale, if a module +header's `Open findings` count disagrees with its finding statuses, or if a +finding carries an unrecognised Status value. The PowerShell wrapper +`scripts/check-code-reviews-readme.ps1` runs that check and is the intended hook +for CI or a pre-commit step. + +> The repo's installed `python` is the real interpreter; the bare `python3` +> alias resolves to the Windows Store stub and fails. Use `python`. The per-module `findings.md` files are the source of truth; `README.md` is the aggregated index and must always agree with them — which the script guarantees. ## 6. Re-reviewing a module -Re-reviews append to the same `findings.md`. Update the header to the new commit and -date, continue the finding numbering from the last used ID, and leave prior findings -(including closed ones) in place as history. +Re-reviews append to the same `findings.md`. Update the header to the new commit +and date, continue the finding numbering from the last used ID, and leave prior +findings (including closed ones) in place as history. diff --git a/code-reviews/README.md b/code-reviews/README.md index 709a2a1..2cd0f36 100644 --- a/code-reviews/README.md +++ b/code-reviews/README.md @@ -1,6 +1,6 @@ # Code Reviews - + Cross-module code review index for the `mxaccessgw` codebase. The review process is defined in [../REVIEW-PROCESS.md](../REVIEW-PROCESS.md). diff --git a/code-reviews/regen-readme.py b/code-reviews/regen-readme.py index 9a73eaf..44a89f8 100644 --- a/code-reviews/regen-readme.py +++ b/code-reviews/regen-readme.py @@ -6,8 +6,12 @@ them into the single cross-module README.md (module status + pending/closed finding tables). Usage: - python3 code-reviews/regen-readme.py # rewrite README.md - python3 code-reviews/regen-readme.py --check # exit 1 if README.md is stale + python code-reviews/regen-readme.py # rewrite README.md + python code-reviews/regen-readme.py --check # exit 1 if stale or inconsistent + +`--check` fails when README.md is out of date OR when a module's header +`Open findings` count disagrees with its finding statuses, or a finding +carries an unrecognised Status value. """ from __future__ import annotations @@ -19,11 +23,12 @@ ROOT = Path(__file__).resolve().parent README = ROOT / "README.md" PENDING_STATUSES = {"Open", "In Progress"} +KNOWN_STATUSES = {"Open", "In Progress", "Resolved", "Won't Fix", "Deferred"} SEVERITY_ORDER = {"Critical": 0, "High": 1, "Medium": 2, "Low": 3} GENERATED_NOTE = ( "" + "Regenerate with: python code-reviews/regen-readme.py -->" ) @@ -169,6 +174,32 @@ def build_readme(modules: list[dict]) -> str: return "\n".join(out) + "\n" +def find_inconsistencies(modules: list[dict]) -> list[str]: + """Return human-readable problems in the per-module findings.md files. + + Checks that each module header's `Open findings` count agrees with its + finding statuses, and that every finding carries a known Status value. + """ + issues: list[str] = [] + for m in modules: + open_n = sum( + 1 for f in m["findings"] if f["status"] in PENDING_STATUSES + ) + declared = m["header"].get("Open findings", "").strip() + if declared != str(open_n): + issues.append( + f"{m['module']}: header 'Open findings' = '{declared}' but " + f"{open_n} finding(s) are Open/In Progress" + ) + for f in m["findings"]: + if f["status"] not in KNOWN_STATUSES: + issues.append( + f"{m['module']}: finding {f['id']} has unrecognised " + f"Status '{f['status']}'" + ) + return issues + + def main(argv: list[str]) -> int: check = "--check" in argv[1:] module_dirs = sorted( @@ -178,16 +209,24 @@ def main(argv: list[str]) -> int: ) modules = [parse_module(d / "findings.md") for d in module_dirs] content = build_readme(modules) + issues = find_inconsistencies(modules) if check: - current = README.read_text(encoding="utf-8") if README.exists() else "" - if current != content: + stale = ( + README.read_text(encoding="utf-8") if README.exists() else "" + ) != content + for issue in issues: + print(f"inconsistent: {issue}", file=sys.stderr) + if stale: print( "code-reviews/README.md is stale - run regen-readme.py", file=sys.stderr, ) + if stale or issues: return 1 - print("code-reviews/README.md is up to date.") + print("code-reviews/README.md is up to date and consistent.") return 0 + for issue in issues: + print(f"warning: {issue}", file=sys.stderr) README.write_text(content, encoding="utf-8", newline="\n") print(f"Wrote {README} ({len(modules)} modules).") return 0 diff --git a/code-reviews/test_regen_readme.py b/code-reviews/test_regen_readme.py new file mode 100644 index 0000000..b7c3055 --- /dev/null +++ b/code-reviews/test_regen_readme.py @@ -0,0 +1,158 @@ +#!/usr/bin/env python3 +"""Tests for regen-readme.py. + +Dependency-free: run with `python code-reviews/test_regen_readme.py`. +Exits 0 if all tests pass, 1 otherwise. +""" +from __future__ import annotations + +import importlib.util +import tempfile +import traceback +from pathlib import Path + +HERE = Path(__file__).resolve().parent + +# regen-readme.py is not an importable module name (hyphen), so load it by path. +_spec = importlib.util.spec_from_file_location("regen_readme", HERE / "regen-readme.py") +regen = importlib.util.module_from_spec(_spec) +_spec.loader.exec_module(regen) + +FIXTURE = """# Code Review — Demo + +| Field | Value | +|---|---| +| Module | `src/Demo` | +| Reviewer | Tester | +| Review date | 2026-05-18 | +| Commit reviewed | `abc1234` | +| Status | Reviewed | +| Open findings | 1 | + +## Findings + +### Demo-001 + +| Field | Value | +|---|---| +| Severity | High | +| Category | Security | +| Location | `src/Demo/File.cs:10` | +| Status | Open | + +**Description:** A first problem that matters. + +**Recommendation:** Fix it. + +**Resolution:** _(open)_ + +### Demo-002 + +| Field | Value | +|---|---| +| Severity | Low | +| Category | Documentation & comments | +| Location | `src/Demo/File.cs:20` | +| Status | Resolved | + +**Description:** A second, minor problem. + +**Recommendation:** Tidy it. + +**Resolution:** Fixed in def5678 on 2026-05-18. +""" + + +def _parse_fixture() -> dict: + """Write FIXTURE to a temp Demo/findings.md and parse it.""" + with tempfile.TemporaryDirectory() as tmp: + path = Path(tmp) / "Demo" / "findings.md" + path.parent.mkdir() + path.write_text(FIXTURE, encoding="utf-8") + return regen.parse_module(path) + + +def test_first_table_skips_separator_and_field_header(): + table = regen.first_table("| Field | Value |\n|---|---|\n| Severity | High |\n") + assert table == {"Severity": "High"}, table + + +def test_parse_module_header(): + m = _parse_fixture() + assert m["module"] == "Demo", m["module"] + assert m["header"]["Reviewer"] == "Tester" + assert m["header"]["Status"] == "Reviewed" + assert m["header"]["Open findings"] == "1" + + +def test_parse_module_findings(): + m = _parse_fixture() + assert len(m["findings"]) == 2, len(m["findings"]) + first = m["findings"][0] + assert first["id"] == "Demo-001" + assert first["severity"] == "High" + assert first["category"] == "Security" + assert first["location"] == "`src/Demo/File.cs:10`" + assert first["status"] == "Open" + assert first["description"] == "A first problem that matters." + assert m["findings"][1]["status"] == "Resolved" + + +def test_build_readme_splits_pending_and_closed(): + readme = regen.build_readme([_parse_fixture()]) + assert "## Pending findings" in readme + assert "## Closed findings" in readme + pending, closed = readme.split("## Closed findings", 1) + assert "Demo-001" in pending # Open -> pending + assert "Demo-001" not in closed + assert "Demo-002" in closed # Resolved -> closed + assert "_No pending findings._" not in pending + + +def test_find_inconsistencies_clean_fixture(): + assert regen.find_inconsistencies([_parse_fixture()]) == [] + + +def test_find_inconsistencies_detects_wrong_open_count(): + m = _parse_fixture() + m["header"]["Open findings"] = "7" + issues = regen.find_inconsistencies([m]) + assert len(issues) == 1 and "Open findings" in issues[0], issues + + +def test_find_inconsistencies_detects_unknown_status(): + m = _parse_fixture() + m["findings"][0]["status"] = "Bogus" + issues = regen.find_inconsistencies([m]) + # Wrong status also shifts the open count, so expect the status issue present. + assert any("unrecognised Status" in i for i in issues), issues + + +def test_summarize_truncates_long_text(): + long = "x" * 500 + out = regen.summarize(long) + assert len(out) <= 240 and out.endswith("…"), len(out) + assert regen.summarize("short") == "short" + + +def main() -> int: + tests = sorted( + (name, fn) + for name, fn in globals().items() + if name.startswith("test_") and callable(fn) + ) + failed = 0 + for name, fn in tests: + try: + fn() + print(f"PASS {name}") + except Exception: # noqa: BLE001 - test runner reports all failures + failed += 1 + print(f"FAIL {name}") + traceback.print_exc() + print(f"\n{len(tests) - failed}/{len(tests)} passed.") + return 1 if failed else 0 + + +if __name__ == "__main__": + raise SystemExit(main()) diff --git a/scripts/check-code-reviews-readme.ps1 b/scripts/check-code-reviews-readme.ps1 new file mode 100644 index 0000000..78f5a65 --- /dev/null +++ b/scripts/check-code-reviews-readme.ps1 @@ -0,0 +1,20 @@ +# Verifies code-reviews/README.md is regenerated from, and consistent with, the +# per-module findings.md files. Intended as a CI / pre-commit gate. +# +# Exits non-zero when README.md is stale, when a module header's "Open findings" +# count disagrees with its finding statuses, or when a finding carries an +# unrecognised Status value. See REVIEW-PROCESS.md section 5. + +[CmdletBinding()] +param() + +Set-StrictMode -Version Latest +$ErrorActionPreference = "Stop" + +$repoRoot = Resolve-Path (Join-Path $PSScriptRoot "..") +$script = Join-Path $repoRoot "code-reviews/regen-readme.py" + +# The bare `python3` alias on this platform resolves to the Windows Store stub; +# `python` is the real interpreter. +& python $script --check +exit $LASTEXITCODE