# DbExporter Tool Design

## Purpose

A command-line tool that queries databases (SQL Server or Oracle) and exports results to compressed protobuf files using `protobuf-net-data` and zstd compression.

## CLI Interface

```
Usage: DbExporter <definition-file> [options]

Arguments:
  definition-file    Path to JSON definition file

Options:
  --verify           Verify output (row count + schema)
  --verify-full      Verify output with SHA256 checksum
  --help             Show help
```

**Examples:**
```bash
# Export data
dotnet run -- ./definitions/scada-clients.json

# Export and verify
dotnet run -- ./definitions/scada-clients.json --verify

# Full verification with checksum
dotnet run -- ./definitions/scada-clients.json --verify-full
```

## Definition File Format (JSON)

```json
{
  "providerType": "SqlServer",
  "connectionString": "Server=...;Database=...;User Id=...;Password=...;",
  "query": "SELECT * FROM MyTable",
  "outputPath": "./output/mytable.pb.zstd",
  "compressionLevel": 10
}
```

| Field | Required | Default | Description |
|-------|----------|---------|-------------|
| `providerType` | Yes | - | `"SqlServer"` or `"Oracle"` |
| `connectionString` | Yes | - | ADO.NET connection string |
| `query` | Yes | - | SQL query to execute |
| `outputPath` | Yes | - | Output file path (.pb.zstd) |
| `compressionLevel` | No | `10` | Zstd level 1-19 (higher = smaller, slower) |

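
The field rules in the table can be sketched as a small model with validation. This is a minimal sketch, not the tool's actual `ExportDefinition.cs`: the `Parse` helper, the exception type, and the case-sensitive provider check are assumptions made for illustration.

```csharp
using System;
using System.Text.Json;

// Illustrative model; property names mirror the JSON fields in the table.
public sealed class ExportDefinition
{
    public string? ProviderType { get; set; }
    public string? ConnectionString { get; set; }
    public string? Query { get; set; }
    public string? OutputPath { get; set; }
    public int CompressionLevel { get; set; } = 10; // kept when the field is omitted

    // Hypothetical helper: parse the JSON text and enforce the documented rules.
    public static ExportDefinition Parse(string json)
    {
        // Case-insensitive matching maps "providerType" onto ProviderType, and so on.
        var options = new JsonSerializerOptions { PropertyNameCaseInsensitive = true };
        var def = JsonSerializer.Deserialize<ExportDefinition>(json, options)
                  ?? throw new ArgumentException("Definition is empty.");

        // Assumption: provider names are matched exactly as written in the table.
        if (def.ProviderType is not ("SqlServer" or "Oracle"))
            throw new ArgumentException("providerType must be \"SqlServer\" or \"Oracle\".");
        if (string.IsNullOrWhiteSpace(def.ConnectionString))
            throw new ArgumentException("connectionString is required.");
        if (string.IsNullOrWhiteSpace(def.Query))
            throw new ArgumentException("query is required.");
        if (string.IsNullOrWhiteSpace(def.OutputPath))
            throw new ArgumentException("outputPath is required.");
        if (def.CompressionLevel is < 1 or > 19)
            throw new ArgumentException("compressionLevel must be between 1 and 19.");

        return def;
    }
}
```

Calling `ExportDefinition.Parse(File.ReadAllText(path))` would then either return a fully populated definition or fail with a message naming the offending field.
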
## Core Workflow

### Export Flow
1. Parse definition file (JSON)
2. Validate fields (provider type, connection string, query)
3. Create appropriate DbConnection (SqlConnection or OracleConnection)
4. Execute query → IDataReader
5. Serialize IDataReader → protobuf stream (via protobuf-net-data)
6. Compress protobuf stream → zstd (via ZstdSharp)
7. While writing, compute SHA256 incrementally
8. Write to output file + sidecar .sha256 file
9. Print summary: row count, file size, compression ratio

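
Steps 7 to 9 can be sketched with base class library types only. This is a minimal sketch under stated assumptions, not the tool's actual `DatabaseExporter.cs`: `ChecksumWriter` and `writePayload` are hypothetical names, `writePayload` stands in for the serialize-and-compress steps, the hash is assumed to cover the bytes that land on disk, and the sidecar is assumed to be `<outputPath>.sha256` holding lowercase hex.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;

public static class ChecksumWriter
{
    // Hypothetical helper: writePayload writes the (already compressed) bytes
    // into the stream it is given; every byte is hashed on its way to the file.
    public static (string Sha256Hex, long FileSize) WriteWithChecksum(
        string outputPath, Action<Stream> writePayload)
    {
        using var sha256 = SHA256.Create();

        using (var file = File.Create(outputPath))
        using (var hashing = new CryptoStream(file, sha256, CryptoStreamMode.Write))
        {
            // A hash algorithm used as the transform passes data through unchanged.
            writePayload(hashing);
        } // Disposal finalizes the hash and flushes and closes the file.

        string hex = Convert.ToHexString(sha256.Hash!).ToLowerInvariant();

        // Assumed sidecar convention: "<output>.sha256" containing the hex digest.
        File.WriteAllText(outputPath + ".sha256", hex);

        // Read the size only after the streams are disposed, so buffered
        // bytes are already on disk and the reported size is accurate.
        long size = new FileInfo(outputPath).Length;
        return (hex, size);
    }
}
```

In the real pipeline the delegate would wrap the given stream in the zstd compression stream and serialize the `IDataReader` into it, so the checksum is produced in the same pass as the export.
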
### Verify Flow (--verify)
1. Open output file
2. Decompress zstd → protobuf stream
3. Deserialize protobuf → IDataReader
4. Loop through all rows, count them (streaming)
5. Extract schema (column names + types)
6. Print: ✓ row count, schema

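
Steps 4 and 5 only need the standard `IDataReader` interface, so they can be sketched independently of the decompression and deserialization layers. The `ReaderSummary` class and its return shape are hypothetical names chosen for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

public static class ReaderSummary
{
    // Hypothetical helper: collects column names and types, then counts rows
    // by advancing the reader without materializing any row data.
    public static (long RowCount, List<(string Name, Type Type)> Schema) Summarize(IDataReader reader)
    {
        var schema = new List<(string Name, Type Type)>();
        for (int i = 0; i < reader.FieldCount; i++)
        {
            schema.Add((reader.GetName(i), reader.GetFieldType(i)));
        }

        long rows = 0;
        while (reader.Read())
        {
            rows++;
        }

        return (rows, schema);
    }
}
```

Because the loop never buffers rows, memory use stays flat regardless of export size, which matches the streaming intent of the flow.
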
### Verify-Full Flow (--verify-full)
1. Open output file
2. Decompress zstd → stream protobuf data
3. While streaming: count rows, extract schema, compute SHA256 incrementally
4. Compare computed SHA256 to stored sidecar file
5. Print: ✓ row count, schema, checksum match/mismatch

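
One way to hash in the same pass as the read is a pass-through stream that sits between the file and the decompressor. This is a sketch under stated assumptions, not the tool's actual `Verifier.cs`: `HashingReadStream`, `ChecksumCheck`, and `consume` are hypothetical names, the hash is assumed to cover the on-disk bytes, and the sidecar is assumed to be `<outputPath>.sha256` holding a hex digest.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;

// Hypothetical pass-through stream: hashes every byte a consumer reads from it.
public sealed class HashingReadStream : Stream
{
    private readonly Stream _inner;
    private readonly IncrementalHash _hash = IncrementalHash.CreateHash(HashAlgorithmName.SHA256);

    public HashingReadStream(Stream inner) => _inner = inner;

    public override int Read(byte[] buffer, int offset, int count)
    {
        int read = _inner.Read(buffer, offset, count);
        if (read > 0) _hash.AppendData(buffer, offset, read);
        return read;
    }

    // Call once, after the stream has been read to the end.
    public string GetHashHex() => Convert.ToHexString(_hash.GetHashAndReset()).ToLowerInvariant();

    public override bool CanRead => true;
    public override bool CanSeek => false;
    public override bool CanWrite => false;
    public override long Length => throw new NotSupportedException();
    public override long Position
    {
        get => throw new NotSupportedException();
        set => throw new NotSupportedException();
    }
    public override void Flush() { }
    public override long Seek(long offset, SeekOrigin origin) => throw new NotSupportedException();
    public override void SetLength(long value) => throw new NotSupportedException();
    public override void Write(byte[] buffer, int offset, int count) => throw new NotSupportedException();

    protected override void Dispose(bool disposing)
    {
        if (disposing) { _hash.Dispose(); _inner.Dispose(); }
        base.Dispose(disposing);
    }
}

public static class ChecksumCheck
{
    // consume stands in for "decompress, deserialize, count rows, read schema".
    public static bool MatchesSidecar(string outputPath, Action<Stream> consume)
    {
        string expected = File.ReadAllText(outputPath + ".sha256").Trim();

        using var hashing = new HashingReadStream(File.OpenRead(outputPath));
        consume(hashing);
        hashing.CopyTo(Stream.Null); // drain anything the consumer left unread

        return string.Equals(expected, hashing.GetHashHex(), StringComparison.OrdinalIgnoreCase);
    }
}
```

The final `CopyTo` matters: if the consumer stops before the end of the file, the digest would otherwise cover only a prefix and report a false mismatch.
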
## Project Structure

```
Tools/DbExporter/
├── DbExporter.csproj
├── Program.cs              # CLI entry point, argument parsing
├── ExportDefinition.cs     # JSON model for definition file
├── DatabaseExporter.cs     # Core export logic
└── Verifier.cs             # Verify and verify-full logic
```

## Dependencies

| Package | Purpose |
|---------|---------|
| `protobuf-net-data` | Serialize IDataReader to protobuf |
| `ZstdSharp.Port` | Zstd compression |
| `Microsoft.Data.SqlClient` | SQL Server connectivity |
| `Oracle.ManagedDataAccess.Core` | Oracle connectivity |
| `System.Text.Json` | Parse definition files |

**Target Framework:** `net10.0`

## Testing with ScadaBridge

**Connection:**
```
Server=10.100.0.35;Database=ScadaBridge_Test;User Id=sa;Password=ScadaBridge2024;TrustServerCertificate=true;
```

Definition files will be created in `Tools/DbExporter/definitions/` for ScadaBridge tables.

**Test approach:**
1. Build the tool
2. Run export for each definition file
3. Run `--verify` to confirm row counts and schemas
4. Run `--verify-full` on at least one export to confirm the computed checksum matches the sidecar file