# Phase 2: Verification of Captured Items

## Objective

Verify that the Phase 1 decomposition captured every Go source file, function,
test, and dependency accurately. Compare database counts against independent
baselines derived directly from the filesystem. Identify and fix any gaps before
proceeding to library mapping and porting.

## Prerequisites

- Phase 1 is complete (`porting.db` is populated).
- The Go source at `golang/nats-server/` has not changed since the Phase 1
  analyzer run. If the source was updated, re-run the Phase 1 analyzer first.
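- `dotnet`, `sqlite3`, `find`, `grep`, `wc`, `sort`, `diff`, and `realpath` are
  available on your PATH. Step 2 uses `realpath --relative-to` and Step 10 uses
  `grep -P`; both are GNU extensions.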

## Source and Target Locations

| Component | Path |
|---|---|
| Go source code | `golang/` (specifically `golang/nats-server/`) |
| .NET ported version | `dotnet/` |

## Steps

### Step 1: Generate the summary report

Start with a high-level view of what the database contains:

```bash
dotnet run --project tools/NatsNet.PortTracker -- report summary --db porting.db
```

Record the counts for modules, features, unit tests, and library mappings. These
are the numbers you will verify in subsequent steps.

### Step 2: Count Go source files on disk

Count non-test `.go` files under the server directory (the scope of the analyzer):

```bash
find golang/nats-server/server -name "*.go" ! -name "*_test.go" ! -path "*/configs/*" ! -path "*/testdata/*" | wc -l
```

This should produce approximately 109 files. Compare this count against the
number of distinct `go_file` values in the features table:

```bash
sqlite3 porting.db "SELECT COUNT(DISTINCT go_file) FROM features;"
```

If the database count is lower, some source files may have been skipped. Check
the analyzer stderr output for warnings, or list the missing files:

```bash
sqlite3 porting.db "SELECT DISTINCT go_file FROM features ORDER BY go_file;" > /tmp/db_files.txt
find golang/nats-server/server -name "*.go" ! -name "*_test.go" ! -path "*/configs/*" ! -path "*/testdata/*" -exec realpath --relative-to=golang/nats-server {} \; | sort > /tmp/disk_files.txt
diff /tmp/db_files.txt /tmp/disk_files.txt
```

### Step 3: Count Go test files on disk

```bash
find golang/nats-server/server -name "*_test.go" ! -path "*/configs/*" ! -path "*/testdata/*" | wc -l
```

This should produce approximately 85 files. Compare against distinct test files
in the database:

```bash
sqlite3 porting.db "SELECT COUNT(DISTINCT go_file) FROM unit_tests;"
```

### Step 4: Compare function counts

Count all exported and unexported functions in source files on disk:

```bash
grep -r "^func " golang/nats-server/server/ --include="*.go" --exclude="*_test.go" | grep -v "/configs/" | grep -v "/testdata/" | wc -l
```

Compare against the features count from the database:

```bash
sqlite3 porting.db "SELECT COUNT(*) FROM features;"
```

The numbers should be close. Small discrepancies can occur because:
- The `grep` approach counts every line that starts with `func `. It also
  counts such lines inside raw string literals or block comments, and it misses
  any declaration that does not start in the first column (for example, after a
  comment on the same line).
- The AST parser used by the analyzer is more accurate; it finds all `func`
  declarations regardless of formatting.

If the database count is significantly lower (more than 5% off), investigate.
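
The 5% threshold can be checked mechanically. A minimal sketch, assuming the database count is treated as the baseline; `within_pct` is a hypothetical helper name, not part of PortTracker:

```bash
# within_pct BASELINE ACTUAL PCT  (hypothetical helper)
# Succeeds when ACTUAL differs from BASELINE by at most PCT percent.
within_pct() {
  local baseline="$1" actual="$2" pct="$3"
  local delta=$(( baseline - actual ))
  if [ "$delta" -lt 0 ]; then delta=$(( -delta )); fi
  # delta / baseline <= pct / 100, rearranged to avoid floating point.
  [ $(( delta * 100 )) -le $(( pct * baseline )) ]
}

# Usage, with the database count as the baseline:
# db=$(sqlite3 porting.db "SELECT COUNT(*) FROM features;")
# disk=$(grep -r "^func " golang/nats-server/server/ --include="*.go" --exclude="*_test.go" | grep -v "/configs/" | grep -v "/testdata/" | wc -l)
# within_pct "$db" "$disk" 5 || echo "feature count differs by more than 5%"
```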

### Step 5: Compare test function counts

Count test functions on disk:

```bash
grep -r "^func Test" golang/nats-server/server/ --include="*_test.go" | wc -l
```

Also count benchmarks:

```bash
grep -r "^func Benchmark" golang/nats-server/server/ --include="*_test.go" | wc -l
```

Compare the combined total against the unit_tests table:

```bash
sqlite3 porting.db "SELECT COUNT(*) FROM unit_tests;"
```
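
The two `grep` counts above can be combined into a single pass. A minimal sketch, assuming the same `^func Test` and `^func Benchmark` line patterns; `count_test_funcs` is a hypothetical helper name. The pattern also matches `TestMain`, which is a setup hook rather than a test, so the total may be slightly higher than the database count depending on how the analyzer treats that function.

```bash
# count_test_funcs DIR  (hypothetical helper)
# Prints the number of top-level Test* and Benchmark* functions found in
# *_test.go files under DIR.
count_test_funcs() {
  # -h drops file names, -E enables the (Test|Benchmark) alternation.
  grep -rhE --include='*_test.go' '^func (Test|Benchmark)' "$1" | wc -l | tr -d ' '
}

# Usage:
# count_test_funcs golang/nats-server/server/
```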

### Step 6: Run the phase check command

The PortTracker has a built-in Phase 1 checklist that verifies all tables are
populated:

```bash
dotnet run --project tools/NatsNet.PortTracker -- phase check 1 --db porting.db
```

All items except "All libraries mapped" should show `[x]`.

### Step 7: Check for orphaned items

Look for features that are not linked to any module (should be zero):

```bash
sqlite3 porting.db "SELECT COUNT(*) FROM features WHERE module_id IS NULL OR module_id NOT IN (SELECT id FROM modules);"
```

Look for tests that are not linked to any module (should be zero):

```bash
sqlite3 porting.db "SELECT COUNT(*) FROM unit_tests WHERE module_id IS NULL OR module_id NOT IN (SELECT id FROM modules);"
```

Look for test-to-feature links that point to non-existent features:

```bash
sqlite3 porting.db "SELECT COUNT(*) FROM unit_tests WHERE feature_id IS NOT NULL AND feature_id NOT IN (SELECT id FROM features);"
```

Look for dependencies that reference non-existent source or target items:

```bash
sqlite3 porting.db "
SELECT COUNT(*) FROM dependencies
WHERE (source_type = 'module' AND source_id NOT IN (SELECT id FROM modules))
   OR (target_type = 'module' AND target_id NOT IN (SELECT id FROM modules))
   OR (source_type = 'feature' AND source_id NOT IN (SELECT id FROM features))
   OR (target_type = 'feature' AND target_id NOT IN (SELECT id FROM features))
   OR (source_type = 'unit_test' AND source_id NOT IN (SELECT id FROM unit_tests))
   OR (target_type = 'unit_test' AND target_id NOT IN (SELECT id FROM unit_tests));
"
```

All of these queries should return 0.
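
To run the checks as a batch, each query can be wrapped so that a non-zero count is flagged. The sketch below is one possible wrapper; `expect_zero` is a hypothetical helper name, and it assumes the wrapped command prints a single count on stdout.

```bash
# expect_zero LABEL COMMAND [ARGS...]  (hypothetical helper)
# Runs COMMAND, treats its stdout as a count, and reports whether it is 0.
# Returns 0 when the count is 0, 1 when it is not, 2 when COMMAND fails.
expect_zero() {
  local label="$1" count
  shift
  if ! count=$("$@"); then
    echo "ERROR $label: command failed"
    return 2
  fi
  if [ "$count" = "0" ]; then
    echo "ok   $label"
  else
    echo "FAIL $label: $count"
    return 1
  fi
}

# Usage with one of the queries above:
# expect_zero "broken test-to-feature links" sqlite3 porting.db \
#   "SELECT COUNT(*) FROM unit_tests WHERE feature_id IS NOT NULL AND feature_id NOT IN (SELECT id FROM features);"
```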

### Step 8: Review the largest modules

The largest modules are the most likely to have issues. List modules sorted by
feature count:

```bash
sqlite3 porting.db "
SELECT m.id, m.name, m.go_line_count,
       COUNT(f.id) as feature_count
FROM modules m
LEFT JOIN features f ON f.module_id = m.id
GROUP BY m.id
ORDER BY feature_count DESC
LIMIT 10;
"
```

For each of the top 3 modules, do a manual spot-check:

```bash
dotnet run --project tools/NatsNet.PortTracker -- module show <id> --db porting.db
```

Scroll through the features list and verify that the functions look correct
(check a few against the actual Go source file).

### Step 9: Validate the dependency graph

Check for direct circular module dependencies (pairs of modules that depend on
each other):

```bash
sqlite3 porting.db "
SELECT d1.source_id, d1.target_id
FROM dependencies d1
JOIN dependencies d2
  ON d1.source_type = d2.target_type AND d1.source_id = d2.target_id
 AND d1.target_type = d2.source_type AND d1.target_id = d2.source_id
WHERE d1.source_type = 'module' AND d1.target_type = 'module';
"
```

Circular dependencies are not necessarily wrong, but they should be reviewed.
Go itself rejects import cycles between packages, so any cycle reported here is
likely between modules that share a Go package.
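
The query above reports each mutual pair twice (once per direction) and only detects two-module cycles. A sketch of a variant that lists each pair once, assuming the `dependencies` columns used above; `find_mutual_module_deps` is a hypothetical helper name. Longer cycles would need a recursive query and are not covered here.

```bash
# find_mutual_module_deps DB  (hypothetical helper)
# Prints one "a|b" row per pair of modules that depend on each other.
find_mutual_module_deps() {
  sqlite3 "$1" "
    SELECT DISTINCT d1.source_id, d1.target_id
    FROM dependencies d1
    JOIN dependencies d2
      ON d1.source_id = d2.target_id AND d1.target_id = d2.source_id
    WHERE d1.source_type = 'module' AND d1.target_type = 'module'
      AND d2.source_type = 'module' AND d2.target_type = 'module'
      AND d1.source_id < d1.target_id;  -- keep one direction per pair
  "
}

# Usage:
# find_mutual_module_deps porting.db
```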

Check which items are blocked by unported dependencies:

```bash
dotnet run --project tools/NatsNet.PortTracker -- dependency blocked --db porting.db
```

And confirm that at least some items are ready to port (have no unported deps):

```bash
dotnet run --project tools/NatsNet.PortTracker -- dependency ready --db porting.db
```

### Step 10: Verify library import completeness

Ensure every external import found in the source is tracked:

```bash
sqlite3 porting.db "SELECT COUNT(*) FROM library_mappings;"
```

Cross-check against a manual count of unique non-stdlib imports:

```bash
grep -rh "\"" golang/nats-server/server/ --include="*.go" | \
  grep -oP '"\K[^"]+' | \
  grep '\.' | \
  sort -u | \
  wc -l
```

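This is an approximate check. The pipeline matches every quoted string that
contains a dot, not only import paths, so it tends to overcount. The AST-based
analyzer is more accurate than grep for import extraction, but the numbers
should be in the same ballpark.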
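
A narrower baseline can be built by reading only `import` declarations instead of every quoted string. The sketch below is one possible approach, not the analyzer's method. `list_external_imports` is a hypothetical helper, and it relies on the heuristic that standard library import paths have no dot in their first path element.

```bash
# list_external_imports DIR  (hypothetical helper)
# Prints the unique import paths under DIR whose first path element contains
# a dot, which is a heuristic for "not part of the standard library".
list_external_imports() {
  find "$1" -name '*.go' -exec awk '
    /^import \(/                 { in_block = 1; next }  # grouped import starts
    in_block && /^\)/            { in_block = 0; next }  # grouped import ends
    in_block && /^[ \t]*\/\//    { next }                # skip comment lines
    in_block || /^import "/ || /^import [A-Za-z0-9_.]+ "/ {
      if (match($0, /"[^"]+"/)) {
        path = substr($0, RSTART + 1, RLENGTH - 2)       # strip the quotes
        split(path, parts, "/")
        if (index(parts[1], ".") > 0) print path
      }
    }
  ' {} + | LC_ALL=C sort -u
}

# Usage:
# list_external_imports golang/nats-server/server/ | wc -l
```

Like the original pipeline, this scans test files as well; add `! -name "*_test.go"` to the `find` call if only non-test sources should count.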

### Step 11: Export a verification snapshot

Save the current state as a markdown report for your records:

```bash
dotnet run --project tools/NatsNet.PortTracker -- report export \
  --format md \
  --output docs/reports/phase-2-verification.md \
  --db porting.db
```

## Completion Criteria

Phase 2 is complete when ALL of the following are true:

- [ ] Source file counts on disk match distinct `go_file` counts in the database
      (within a small margin for intentionally excluded directories).
- [ ] Feature counts from `grep` are within 5% of the database count (AST is the
      authoritative source).
- [ ] Test function counts from `grep` match the database count closely.
- [ ] No orphaned features (all linked to valid modules).
- [ ] No orphaned tests (all linked to valid modules).
- [ ] No broken test-to-feature links.
- [ ] No dangling dependency references.
- [ ] Dependency graph is reviewed -- circular deps (if any) are acknowledged.
- [ ] `dependency ready` returns at least one item (the graph has valid roots).
- [ ] Library mappings table contains all external imports.
- [ ] `phase check 1` passes with all items except "All libraries mapped" checked.

## Troubleshooting

### File count mismatch is large

If the disk file count exceeds the database count by more than a few files,
re-run the analyzer with stderr visible:

```bash
./tools/go-analyzer/go-analyzer \
  --source golang/nats-server \
  --db porting.db \
  --schema porting-schema.sql 2>&1 | tee /tmp/analyzer.log
```

Search for warnings:

```bash
grep "Warning" /tmp/analyzer.log
```

Common causes:
- Files with build tags that prevent parsing (e.g., `//go:build ignore`).
- Files in excluded directories (`configs/`, `testdata/`).
- Syntax errors in Go files that the parser cannot handle.

### Feature count is significantly different

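The AST parser counts every `func` declaration, including unexported helper
functions. The `grep` baseline matches any line that starts with `func `, which
includes the first line of a multi-line signature like:

```go
func (s *Server) handleConnection(
    conn net.Conn,
) {
```

...so multi-line signatures alone do not explain a gap. Look instead for
`func ` lines inside raw string literals or block comments (counted only by
`grep`) and for declarations that do not start in the first column (missed by
`grep`). Trust the database count as authoritative.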

### Orphaned records found

If orphaned records exist, the analyzer may have a bug or the database was
partially populated from a prior run. The safest fix is to:

1. Delete the database: `rm porting.db`
2. Re-run Phase 1 from Step 1.

### Tests not linked to features

The analyzer uses naming conventions to link tests to features (e.g.,
`TestConnect` maps to a feature containing `Connect`). If many tests have a
NULL `feature_id`, this is expected for tests whose names do not follow the
convention. These links can be added manually later if needed.
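
To see how many tests are affected, the unlinked rows can be counted directly. A minimal sketch, assuming only the `unit_tests.feature_id` column referenced above; `unlinked_tests` is a hypothetical helper name.

```bash
# unlinked_tests DB  (hypothetical helper)
# Prints "<unlinked>|<total>" for the unit_tests table.
unlinked_tests() {
  # COUNT(feature_id) skips NULLs, so the difference is the unlinked count.
  sqlite3 "$1" "
    SELECT COUNT(*) - COUNT(feature_id), COUNT(*)
    FROM unit_tests;
  "
}

# Usage:
# unlinked_tests porting.db
```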