# dotTrace DTP Parser Design
**Goal:** Build a repository-local tool that starts from a raw dotTrace `.dtp` snapshot family and emits machine-readable JSON call-tree data suitable for LLM-driven hotspot analysis.

**Context**

The target snapshot format is JetBrains dotTrace multi-file storage:
- `snapshot.dtp` is the index/manifest.
- `snapshot.dtp.0000`, `.0001`, and related files hold the storage sections.
- `snapshot.dtp.States` holds UI state and is not sufficient for call-tree analysis.
The internal binary layout is not publicly specified, so a direct handwritten decoder would be brittle and expensive to maintain. The target machine already has dotTrace installed, and the shipped JetBrains assemblies expose snapshot storage, metadata, and performance call-tree readers. The design therefore uses dotTrace's local runtime libraries as the authoritative decoder while still starting from the raw `.dtp` files.

**Architecture**

Two layers:
1. A small .NET helper opens the raw snapshot, reads the performance DFS call-tree and node payload sections, resolves function names through the profiler metadata section, and emits JSON.
2. A Python CLI is the user-facing entrypoint. It validates input, builds or reuses the helper, runs it, and writes JSON to stdout or a file.
This keeps the user workflow Python-first while using the only reliable decoder available for the undocumented snapshot format.

**Output schema**

The JSON should support both direct consumption and downstream summarization:
- `snapshot`: source path, thread count, node count, payload type.
- `thread_roots`: thread root metadata.
- `call_tree`: synthetic root with recursive children.
- `hotspots`: flat top lists for inclusive and exclusive time.
Each node should include:
- `id`: stable offset-based identifier.
- `name`: resolved method or synthetic node name.
- `kind`: `root`, `thread`, `method`, or `special`.
- `inclusive_time`
- `exclusive_time`
- `call_count`
- `thread_name` when relevant
- `children`
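An abbreviated example of the intended shape. All key spellings and values here are illustrative, not fixed by this design:

```json
{
  "snapshot": {
    "source_path": "snapshots/js-ordered-consume.dtp",
    "thread_count": 8,
    "node_count": 120345,
    "payload_type": "sampling"
  },
  "thread_roots": [{"id": "t:1", "thread_name": "Main"}],
  "call_tree": {
    "id": "root:0",
    "name": "<root>",
    "kind": "root",
    "inclusive_time": 1523.4,
    "exclusive_time": 0.0,
    "call_count": 0,
    "children": [
      {
        "id": "n:4096",
        "name": "Consumer.ConsumeAsync",
        "kind": "method",
        "inclusive_time": 812.0,
        "exclusive_time": 15.2,
        "call_count": 240,
        "thread_name": "Main",
        "children": []
      }
    ]
  },
  "hotspots": {"inclusive": [], "exclusive": []}
}
```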

**Resolution strategy**

Method names are resolved from the snapshot's metadata section:
- Use the snapshot's FUID-to-metadata converter.
- Map `FunctionUID` to `FunctionId`.
- Resolve `MetadataId`.
- Read function and class data with `MetadataSectionHelpers`.
Synthetic and special frames fall back to explicit labels instead of opaque numeric values where possible.
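The resolution chain above can be sketched as plain lookups. The dict "tables" here are hypothetical stand-ins; the real lookups go through the JetBrains metadata readers, not Python dicts:

```python
# Illustrative sketch of FunctionUID -> FunctionId -> MetadataId -> name.
SPECIAL_LABELS = {0: "<root>"}  # hypothetical labels for synthetic frames


def resolve_name(function_uid, uid_to_function_id, function_to_metadata_id, metadata):
    """Resolve a frame to 'Class.Method', falling back to explicit labels."""
    function_id = uid_to_function_id.get(function_uid)
    if function_id is None:
        # Synthetic/special frame: prefer an explicit label over a raw number.
        return SPECIAL_LABELS.get(function_uid, f"<special:{function_uid}>")
    metadata_id = function_to_metadata_id[function_id]
    entry = metadata[metadata_id]  # class + method data from the metadata section
    return f"{entry['class']}.{entry['method']}"
```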

**Error handling**

The tool should fail loudly for the cases that matter:
- Missing dotTrace assemblies.
- Unsupported snapshot layout.
- Missing metadata sections.
- Helper build or execution failure.
Errors should name the failing stage so the Python wrapper can surface actionable messages.
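One way to carry the failing stage, sketched here with illustrative stage names (the exception type and `require` helper are assumptions, not part of the design):

```python
# Errors carry a stage name so the Python wrapper can map failures to
# actionable messages.
class ExtractionError(Exception):
    def __init__(self, stage: str, detail: str):
        super().__init__(f"[{stage}] {detail}")
        self.stage = stage


def require(condition: bool, stage: str, detail: str) -> None:
    """Raise a stage-tagged error when a precondition fails."""
    if not condition:
        raise ExtractionError(stage, detail)


# Example usage inside the wrapper (stage names illustrative):
# require(assemblies_found, "locate-assemblies", "dotTrace install not found")
# require(result.returncode == 0, "run-helper", result.stderr.strip())
```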

**Testing**

Use the checked-in sample snapshot at `snapshots/js-ordered-consume.dtp` for an end-to-end test:
- JSON parses successfully.
- The root contains thread children.
- Hotspot lists are populated.
- At least one non-special method name is resolved.
This is enough to verify the extraction path without freezing the entire output.
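The four checks can be written as plain assertions over the parsed output. `data` is assumed to be the already parsed JSON (a successful `json.loads` is the first check); the key names follow the output schema section, and the `inclusive`/`exclusive` hotspot keys are illustrative:

```python
# Sketch of the end-to-end test body for the sample snapshot's output.
def walk(node: dict):
    """Yield a call-tree node and all of its descendants."""
    yield node
    for child in node.get("children", []):
        yield from walk(child)


def check_extraction(data: dict) -> None:
    root = data["call_tree"]
    assert root["children"], "root should contain thread children"
    assert data["hotspots"]["inclusive"], "inclusive hotspot list is empty"
    assert data["hotspots"]["exclusive"], "exclusive hotspot list is empty"
    assert any(n["kind"] == "method" for n in walk(root)), \
        "no non-special method name was resolved"
```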