jdescopingtool/PLANS/2026-01-06-dbexporter-design.md

# DbExporter Tool Design

## Purpose

A command-line tool that runs a SQL query against SQL Server or Oracle and exports the result set to a zstd-compressed protobuf file, serialized with `protobuf-net-data`.

## CLI Interface

```
Usage: DbExporter <definition-file> [options]

Arguments:
  definition-file    Path to JSON definition file

Options:
  --verify          Verify output (row count + schema)
  --verify-full     Verify output (row count + schema) plus SHA256 checksum
  --help            Show help
```

**Examples:**
```bash
# Export data
dotnet run -- ./definitions/scada-clients.json

# Export and verify
dotnet run -- ./definitions/scada-clients.json --verify

# Full verification with checksum
dotnet run -- ./definitions/scada-clients.json --verify-full
```
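
The argument handling implied by the usage text could be sketched as follows. This is a minimal illustration, not the planned `Program.cs`: the `CliOptions`, `VerifyMode`, and `CliParser` names are hypothetical, and treating `--verify-full` as superseding `--verify` is an assumption the design does not state.

```csharp
using System;

// Hypothetical types; the design only fixes the flags, not these names.
public enum VerifyMode { None, Basic, Full }

public sealed record CliOptions(string? DefinitionFile, VerifyMode Verify, bool ShowHelp);

public static class CliParser
{
    public static CliOptions Parse(string[] args)
    {
        string? definitionFile = null;
        var verify = VerifyMode.None;
        bool showHelp = false;

        foreach (string arg in args)
        {
            switch (arg)
            {
                case "--help":
                    showHelp = true;
                    break;
                case "--verify":
                    // Assumption: never downgrade if --verify-full was already seen.
                    if (verify == VerifyMode.None) verify = VerifyMode.Basic;
                    break;
                case "--verify-full":
                    verify = VerifyMode.Full;
                    break;
                default:
                    if (arg.StartsWith("--", StringComparison.Ordinal))
                        throw new ArgumentException($"Unknown option: {arg}");
                    if (definitionFile is not null)
                        throw new ArgumentException("Only one definition file may be given.");
                    definitionFile = arg;
                    break;
            }
        }

        // The definition file is required unless the user only asked for help.
        if (definitionFile is null && !showHelp)
            throw new ArgumentException("Missing required argument: definition-file");

        return new CliOptions(definitionFile, verify, showHelp);
    }
}
```

Keeping parsing separate from execution lets the export and verify paths be tested without spawning a process.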

## Definition File Format (JSON)

```json
{
  "providerType": "SqlServer",
  "connectionString": "Server=...;Database=...;User Id=...;Password=...;",
  "query": "SELECT * FROM MyTable",
  "outputPath": "./output/mytable.pb.zstd",
  "compressionLevel": 10
}
```

| Field | Required | Default | Description |
|-------|----------|---------|-------------|
| `providerType` | Yes | - | `"SqlServer"` or `"Oracle"` |
| `connectionString` | Yes | - | ADO.NET connection string |
| `query` | Yes | - | SQL query to execute |
| `outputPath` | Yes | - | Output file path (.pb.zstd) |
| `compressionLevel` | No | `10` | Zstd compression level, 1-19 (higher = smaller file, slower export) |
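
As a sketch of how the table above might translate into a model and validation step, assuming `System.Text.Json` with case-insensitive property matching: the `DefinitionLoader` class and its error messages are hypothetical, and only the field names, the required/optional split, the default of `10`, and the 1-19 range come from the table.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text.Json;

// Hypothetical model; property names mirror the JSON fields.
public sealed class ExportDefinition
{
    public string? ProviderType { get; init; }
    public string? ConnectionString { get; init; }
    public string? Query { get; init; }
    public string? OutputPath { get; init; }
    public int CompressionLevel { get; init; } = 10; // default when the field is omitted
}

public static class DefinitionLoader
{
    private static readonly JsonSerializerOptions Options = new()
    {
        PropertyNameCaseInsensitive = true, // lets "providerType" bind to ProviderType
    };

    // Returns one message per violated rule; an empty list means the definition is valid.
    public static List<string> Validate(ExportDefinition def)
    {
        var errors = new List<string>();
        if (def.ProviderType is not ("SqlServer" or "Oracle"))
            errors.Add("providerType must be \"SqlServer\" or \"Oracle\"");
        if (string.IsNullOrWhiteSpace(def.ConnectionString))
            errors.Add("connectionString is required");
        if (string.IsNullOrWhiteSpace(def.Query))
            errors.Add("query is required");
        if (string.IsNullOrWhiteSpace(def.OutputPath))
            errors.Add("outputPath is required");
        if (def.CompressionLevel is < 1 or > 19)
            errors.Add("compressionLevel must be between 1 and 19");
        return errors;
    }

    // Deserializes the JSON text and throws if any rule is violated.
    public static ExportDefinition Parse(string json)
    {
        var def = JsonSerializer.Deserialize<ExportDefinition>(json, Options)
                  ?? throw new InvalidDataException("Definition is empty.");
        var errors = Validate(def);
        if (errors.Count > 0)
            throw new InvalidDataException(string.Join("; ", errors));
        return def;
    }
}
```

Collecting all errors before throwing means a user with several mistakes in one file sees them in a single run.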

## Core Workflow

### Export Flow
1. Parse definition file (JSON)
2. Validate fields (provider type, connection string, query, output path, compression level)
3. Create appropriate DbConnection (SqlConnection or OracleConnection)
4. Execute query → IDataReader
5. Serialize IDataReader → protobuf stream (via protobuf-net-data)
6. Compress protobuf stream → zstd (via ZstdSharp)
7. While writing, compute SHA256 incrementally
8. Write to output file + sidecar .sha256 file
9. Print summary: row count, file size, compression ratio
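
Steps 7 and 8 can be done in one pass by routing the output through a hashing stream. The sketch below uses only the base class library and makes several assumptions: the serializer and compressor are represented by a caller-supplied `produce` delegate rather than real `protobuf-net-data` or `ZstdSharp` calls, the checksum covers the bytes as written to disk, and the sidecar is named `<outputPath>.sha256` and holds a bare lowercase hex digest. `HashingWriter` is a hypothetical name.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;

public static class HashingWriter
{
    // Writes whatever `produce` emits to `outputPath`, hashing the bytes on their
    // way to disk, then stores the hex digest in a sidecar file next to the output.
    public static string WriteWithSidecar(string outputPath, Action<Stream> produce)
    {
        using var sha = SHA256.Create();

        using (var file = File.Create(outputPath))
        using (var hashing = new CryptoStream(file, sha, CryptoStreamMode.Write))
        {
            // In the real tool this delegate would wrap `hashing` in the zstd
            // compression stream and serialize the IDataReader into it.
            produce(hashing);
        } // Disposing the CryptoStream finalizes the hash and closes the file.

        string hex = Convert.ToHexString(sha.Hash!).ToLowerInvariant();
        File.WriteAllText(outputPath + ".sha256", hex); // assumed sidecar naming
        return hex;
    }
}
```

Because a hash algorithm used as a `CryptoStream` transform passes its input through unchanged, the file receives exactly the bytes that were hashed and no second read of the output is needed.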

### Verify Flow (--verify)
1. Open output file
2. Decompress zstd → protobuf stream
3. Deserialize protobuf → IDataReader
4. Loop through all rows, count them (streaming)
5. Extract schema (column names + types)
6. Print: ✓ row count, schema
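
Steps 4 and 5 need nothing beyond the `IDataReader` interface, so they can be sketched independently of the deserializer. `ReaderInspector` is a hypothetical name; in the tool the reader would come from the decompressed protobuf stream, while any `IDataReader` (for example `DataTable.CreateDataReader()`) works for trying it out.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

public static class ReaderInspector
{
    // Reads the column metadata, then streams through every row to count them.
    // Rows are never buffered, so memory use does not grow with the export size.
    public static (long RowCount, List<(string Name, Type Type)> Schema) Inspect(IDataReader reader)
    {
        var schema = new List<(string Name, Type Type)>(reader.FieldCount);
        for (int i = 0; i < reader.FieldCount; i++)
            schema.Add((reader.GetName(i), reader.GetFieldType(i)));

        long rows = 0;
        while (reader.Read())
            rows++;

        return (rows, schema);
    }
}
```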

### Verify-Full Flow (--verify-full)
1. Open output file
2. Decompress zstd → stream protobuf data
3. While streaming: count rows, extract schema, compute SHA256 incrementally
4. Compare computed SHA256 to stored sidecar file
5. Print: ✓ row count, schema, checksum match/mismatch
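
The checksum comparison in step 4 might look like the sketch below. It assumes the stored checksum covers the bytes of the output file itself and that the sidecar is `<outputPath>.sha256` containing a bare hex digest; the design leaves both details open, and the export side must hash the same bytes for the comparison to be meaningful. `ChecksumVerifier` is a hypothetical name.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;

public static class ChecksumVerifier
{
    // Hashes the file in fixed-size chunks so large exports are never fully loaded.
    public static string ComputeSha256Hex(string path)
    {
        using var hash = IncrementalHash.CreateHash(HashAlgorithmName.SHA256);
        using var file = File.OpenRead(path);

        var buffer = new byte[81920];
        int read;
        while ((read = file.Read(buffer, 0, buffer.Length)) > 0)
            hash.AppendData(buffer, 0, read);

        return Convert.ToHexString(hash.GetHashAndReset());
    }

    // Compares the recomputed digest to the sidecar, ignoring hex case and
    // surrounding whitespace such as a trailing newline.
    public static bool MatchesSidecar(string outputPath)
    {
        string expected = File.ReadAllText(outputPath + ".sha256").Trim();
        string actual = ComputeSha256Hex(outputPath);
        return string.Equals(expected, actual, StringComparison.OrdinalIgnoreCase);
    }
}
```

This version reads the file separately from the row-counting pass; folding both into one pass would mean hashing the same stream that feeds the decompressor.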

## Project Structure

```
Tools/DbExporter/
├── DbExporter.csproj
├── Program.cs              # CLI entry point, argument parsing
├── ExportDefinition.cs     # JSON model for definition file
├── DatabaseExporter.cs     # Core export logic
└── Verifier.cs             # Verify and verify-full logic
```

## Dependencies

| Package | Purpose |
|---------|---------|
| `protobuf-net-data` | Serialize IDataReader to protobuf |
| `ZstdSharp.Port` | Zstd compression |
| `Microsoft.Data.SqlClient` | SQL Server connectivity |
| `Oracle.ManagedDataAccess.Core` | Oracle connectivity |
| `System.Text.Json` | Parse definition files (ships with the .NET shared framework; no package reference needed on `net10.0`) |

**Target Framework:** `net10.0`

## Testing with ScadaBridge

**Connection:**
```
Server=10.100.0.35;Database=ScadaBridge_Test;User Id=sa;Password=ScadaBridge2024;TrustServerCertificate=true;
```

Definition files will be created in `Tools/DbExporter/definitions/` for ScadaBridge tables.

**Test approach:**
1. Build the tool
2. Run export for each definition file
3. Run `--verify` to confirm row counts and schemas
4. Run `--verify-full` on at least one export to confirm the computed checksum matches the sidecar file