DbExporter Tool Design
Purpose
A command-line tool that queries databases (SQL Server or Oracle) and exports results to compressed protobuf files using protobuf-net-data and zstd compression.
CLI Interface
Usage: DbExporter <definition-file> [options]
Arguments:
  definition-file    Path to JSON definition file
Options:
  --verify           Verify output (row count + schema)
  --verify-full      Verify output with SHA256 checksum
  --help             Show help
Examples:
  # Export data
  dotnet run -- ./definitions/scada-clients.json
  # Export and verify
  dotnet run -- ./definitions/scada-clients.json --verify
  # Full verification with checksum
  dotnet run -- ./definitions/scada-clients.json --verify-full
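The argument surface above is small enough to sketch in a few lines. The following is a language-neutral illustration in Python, not the tool's actual Program.cs (which targets .NET). The function name `build_parser` is hypothetical, and whether `--verify` and `--verify-full` may be combined is not specified here, so the sketch accepts both.

```python
import argparse


def build_parser():
    """Build a parser mirroring the usage text: one positional, two flags."""
    # allow_abbrev=False keeps "--verify" from ever being read as a prefix
    # of "--verify-full".
    parser = argparse.ArgumentParser(prog="DbExporter", allow_abbrev=False)
    parser.add_argument(
        "definition_file",
        metavar="definition-file",
        help="Path to JSON definition file",
    )
    parser.add_argument(
        "--verify",
        action="store_true",
        help="Verify output (row count + schema)",
    )
    parser.add_argument(
        "--verify-full",
        dest="verify_full",
        action="store_true",
        help="Verify output with SHA256 checksum",
    )
    # argparse adds --help automatically.
    return parser
```

Calling `build_parser().parse_args(["./definitions/scada-clients.json", "--verify"])` yields an object whose `verify` attribute is true and whose `verify_full` attribute is false.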
Definition File Format (JSON)
{
  "providerType": "SqlServer",
  "connectionString": "Server=...;Database=...;User Id=...;Password=...;",
  "query": "SELECT * FROM MyTable",
  "outputPath": "./output/mytable.pb.zstd",
  "compressionLevel": 10
}
| Field | Required | Default | Description |
|---|---|---|---|
| providerType | Yes | - | "SqlServer" or "Oracle" |
| connectionString | Yes | - | ADO.NET connection string |
| query | Yes | - | SQL query to execute |
| outputPath | Yes | - | Output file path (.pb.zstd) |
| compressionLevel | No | 10 | Zstd level 1-19 (higher = smaller, slower) |
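The rules in the table can be expressed as a small validation routine. This is a minimal sketch in Python for illustration only (the tool itself targets .NET). The function name `load_definition` is hypothetical, and two behaviours are assumptions rather than stated requirements: provider names are matched case-sensitively, and blank strings count as missing.

```python
import json

REQUIRED_FIELDS = ("providerType", "connectionString", "query", "outputPath")
PROVIDER_TYPES = ("SqlServer", "Oracle")
DEFAULT_COMPRESSION_LEVEL = 10


def load_definition(text):
    """Parse a definition document and apply the rules from the field table."""
    definition = json.loads(text)
    if not isinstance(definition, dict):
        raise ValueError("definition must be a JSON object")

    # Assumption: a required field must be a non-blank string.
    for name in REQUIRED_FIELDS:
        value = definition.get(name)
        if not isinstance(value, str) or not value.strip():
            raise ValueError(f"missing or empty required field: {name}")

    # Assumption: provider names are compared case-sensitively.
    if definition["providerType"] not in PROVIDER_TYPES:
        raise ValueError(f"unsupported providerType: {definition['providerType']}")

    # compressionLevel is optional; fall back to the documented default.
    level = definition.setdefault("compressionLevel", DEFAULT_COMPRESSION_LEVEL)
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(level, bool) or not isinstance(level, int) or not 1 <= level <= 19:
        raise ValueError("compressionLevel must be an integer from 1 to 19")

    return definition
```

Failing fast here means a bad definition is rejected before any database connection is opened.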
Core Workflow
Export Flow
- Parse definition file (JSON)
- Validate required fields (provider type, connection string, query, output path)
- Create appropriate DbConnection (SqlConnection or OracleConnection)
- Execute query → IDataReader
- Serialize IDataReader → protobuf stream (via protobuf-net-data)
- Compress protobuf stream → zstd (via ZstdSharp)
- While writing, compute SHA256 incrementally
- Write to output file + sidecar .sha256 file
- Print summary: row count, file size, compression ratio
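The write path in the steps above (compress, hash while writing, emit a sidecar, then report sizes) can be sketched as follows. This is an illustration in Python, not the tool's .NET code, and it rests on several labeled assumptions: `zlib` stands in for zstd purely so the sketch runs on the standard library, the hash covers the compressed bytes written to the file, the sidecar is named by appending `.sha256` to the output path and holds the hex digest, and `export_payload` is a hypothetical name.

```python
import hashlib
import os
import zlib


def export_payload(chunks, output_path, level=6):
    """Compress chunks to output_path and write a .sha256 sidecar next to it.

    Returns (raw_bytes, compressed_bytes, hex_digest).
    """
    digest = hashlib.sha256()
    # Stand-in for the zstd compressor (assumption: used only for the sketch).
    compressor = zlib.compressobj(level)
    raw_bytes = 0

    with open(output_path, "wb") as out:
        for chunk in chunks:
            raw_bytes += len(chunk)
            block = compressor.compress(chunk)
            digest.update(block)  # hash exactly the bytes that reach the file
            out.write(block)
        tail = compressor.flush()  # emit whatever the compressor still buffers
        digest.update(tail)
        out.write(tail)

    # Read the size only after the file is closed, so flushed bytes are counted.
    compressed_bytes = os.path.getsize(output_path)

    with open(output_path + ".sha256", "w", encoding="ascii") as sidecar:
        sidecar.write(digest.hexdigest() + "\n")

    return raw_bytes, compressed_bytes, digest.hexdigest()
```

A compression ratio for the summary could then be reported as `raw_bytes / compressed_bytes`, assuming the ratio is defined as uncompressed size over compressed size. Reading the file size before the streams are closed would under-report it, because the compressor's final block has not yet been written.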
Verify Flow (--verify)
- Open output file
- Decompress zstd → protobuf stream
- Deserialize protobuf → IDataReader
- Loop through all rows, count them (streaming)
- Extract schema (column names + types)
- Print: ✓ row count, schema
Verify-Full Flow (--verify-full)
- Open output file
- Decompress zstd → stream protobuf data
- While streaming: count rows, extract schema, compute SHA256 incrementally
- Compare computed SHA256 to stored sidecar file
- Print: ✓ row count, schema, checksum match/mismatch
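The checksum comparison in this flow can be sketched as below. It is a Python illustration rather than the tool's Verifier.cs, and it assumes that the sidecar sits at the output path plus `.sha256`, that its first whitespace-separated token is the hex digest, and that the digest covers the output file's bytes as stored on disk. The names `file_sha256` and `checksum_matches` are hypothetical.

```python
import hashlib


def file_sha256(path, chunk_size=1 << 20):
    """Hash a file in fixed-size chunks so memory use stays constant."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # iter() with a sentinel keeps reading until read() returns b"".
        for block in iter(lambda: f.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()


def checksum_matches(output_path):
    """Compare the recomputed digest with the one stored in the sidecar."""
    with open(output_path + ".sha256", "r", encoding="ascii") as sidecar:
        tokens = sidecar.read().split()
    if not tokens:
        return False  # an empty sidecar can never match
    # hexdigest() is lowercase, so normalise the stored value before comparing.
    return file_sha256(output_path) == tokens[0].lower()
```

Hashing in chunks keeps verification streaming, which matters for exports too large to hold in memory.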
Project Structure
Tools/DbExporter/
├── DbExporter.csproj
├── Program.cs # CLI entry point, argument parsing
├── ExportDefinition.cs # JSON model for definition file
├── DatabaseExporter.cs # Core export logic
└── Verifier.cs # Verify and verify-full logic
Dependencies
| Package | Purpose |
|---|---|
| protobuf-net-data | Serialize IDataReader to protobuf |
| ZstdSharp.Port | Zstd compression |
| Microsoft.Data.SqlClient | SQL Server connectivity |
| Oracle.ManagedDataAccess.Core | Oracle connectivity |
| System.Text.Json | Parse definition files |
Target Framework: net10.0
Testing with ScadaBridge
Connection:
Server=10.100.0.35;Database=ScadaBridge_Test;User Id=sa;Password=ScadaBridge2024;TrustServerCertificate=true;
Definition files will be created in Tools/DbExporter/definitions/ for ScadaBridge tables.
Test approach:
- Build the tool
- Run export for each definition file
- Run --verify to confirm row counts and schemas
- Run --verify-full on at least one to confirm checksum works