fix(DbExporter): fix compressed size calculation and clean up
- Move file size read after streams are disposed to get accurate compressed size
- Clean up definition files to use working example queries
- Add .gitignore for output directory
@@ -0,0 +1,117 @@
# DbExporter Tool Design

## Purpose

A command-line tool that queries databases (SQL Server or Oracle) and exports results to compressed protobuf files using `protobuf-net-data` and zstd compression.

## CLI Interface

```
Usage: DbExporter <definition-file> [options]

Arguments:
  definition-file    Path to JSON definition file

Options:
  --verify           Verify output (row count + schema)
  --verify-full      Verify output with SHA256 checksum
  --help             Show help
```

**Examples:**
```bash
# Export data
dotnet run -- ./definitions/scada-clients.json

# Export and verify
dotnet run -- ./definitions/scada-clients.json --verify

# Full verification with checksum
dotnet run -- ./definitions/scada-clients.json --verify-full
```
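The flag handling in the usage text can be sketched as follows. This is an illustration in Python only (the tool itself targets .NET); `build_parser` and `verification_mode` are hypothetical names, and the rule that `--verify-full` wins when both flags are passed is an assumption, since the design does not say how the two combine.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser that mirrors the usage text (hypothetical helper)."""
    parser = argparse.ArgumentParser(prog="DbExporter", allow_abbrev=False)
    parser.add_argument("definition_file", metavar="definition-file",
                        help="Path to JSON definition file")
    parser.add_argument("--verify", action="store_true",
                        help="Verify output (row count + schema)")
    parser.add_argument("--verify-full", action="store_true",
                        help="Verify output with SHA256 checksum")
    return parser  # argparse adds --help automatically


def verification_mode(args: argparse.Namespace) -> str:
    """Assumption: --verify-full takes precedence when both flags are given."""
    if args.verify_full:
        return "full"
    if args.verify:
        return "basic"
    return "none"
```

For example, `verification_mode(build_parser().parse_args(["./definitions/scada-clients.json", "--verify"]))` yields `"basic"`.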

## Definition File Format (JSON)

```json
{
  "providerType": "SqlServer",
  "connectionString": "Server=...;Database=...;User Id=...;Password=...;",
  "query": "SELECT * FROM MyTable",
  "outputPath": "./output/mytable.pb.zstd",
  "compressionLevel": 10
}
```

| Field | Required | Default | Description |
|-------|----------|---------|-------------|
| `providerType` | Yes | - | `"SqlServer"` or `"Oracle"` |
| `connectionString` | Yes | - | ADO.NET connection string |
| `query` | Yes | - | SQL query to execute |
| `outputPath` | Yes | - | Output file path (.pb.zstd) |
| `compressionLevel` | No | `10` | Zstd level 1-19 (higher = smaller, slower) |
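
The rules in the field table can be expressed as a small validation routine. The sketch below is illustrative Python, not the tool's .NET code; `load_definition` is a hypothetical name, and two details are assumptions the design does not state: `providerType` is matched case-sensitively, and required fields must be non-empty strings.

```python
import json

REQUIRED_FIELDS = ("providerType", "connectionString", "query", "outputPath")
PROVIDER_TYPES = ("SqlServer", "Oracle")
DEFAULT_COMPRESSION_LEVEL = 10


def load_definition(text: str) -> dict:
    """Parse a definition file and apply the rules from the field table."""
    definition = json.loads(text)
    if not isinstance(definition, dict):
        raise ValueError("definition must be a JSON object")

    # Assumption: every required field must be a non-empty string.
    for field in REQUIRED_FIELDS:
        value = definition.get(field)
        if not isinstance(value, str) or not value.strip():
            raise ValueError(f"'{field}' is required")

    # Assumption: provider names are compared case-sensitively.
    if definition["providerType"] not in PROVIDER_TYPES:
        raise ValueError("'providerType' must be \"SqlServer\" or \"Oracle\"")

    # Optional field: fall back to the documented default of 10.
    level = definition.setdefault("compressionLevel", DEFAULT_COMPRESSION_LEVEL)
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(level, bool) or not isinstance(level, int) or not 1 <= level <= 19:
        raise ValueError("'compressionLevel' must be an integer from 1 to 19")
    return definition
```

Rejecting a bad definition before opening a database connection keeps failures cheap and the error message specific to the offending field.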

## Core Workflow

### Export Flow
1. Parse definition file (JSON)
2. Validate fields (provider type, connection string, query)
3. Create appropriate DbConnection (SqlConnection or OracleConnection)
4. Execute query → IDataReader
5. Serialize IDataReader → protobuf stream (via protobuf-net-data)
6. Compress protobuf stream → zstd (via ZstdSharp)
7. While writing, compute SHA256 incrementally
8. Write to output file + sidecar .sha256 file
9. Print summary: row count, file size, compression ratio
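
Steps 6 to 9 can be sketched as below, including the ordering this commit fixes: the compressed size is read only after the output stream is closed. This is an illustrative Python sketch, not the tool's .NET code. Several things are assumptions: `zlib` stands in for zstd, `chunks` stands in for the protobuf stream, the checksum is taken over the compressed bytes on disk (the design does not say which layer is hashed), the sidecar is named `<output>.sha256`, and the ratio is uncompressed size divided by compressed size.

```python
import hashlib
import os
import zlib
from pathlib import Path


def write_compressed(chunks, output_path, level=6):
    """Compress byte chunks to output_path and write a .sha256 sidecar.

    zlib stands in for zstd; `chunks` stands in for the protobuf stream.
    """
    output_path = Path(output_path)
    output_path.parent.mkdir(parents=True, exist_ok=True)
    compressor = zlib.compressobj(level)
    digest = hashlib.sha256()
    raw_size = 0

    with open(output_path, "wb") as out:
        for chunk in chunks:
            raw_size += len(chunk)
            block = compressor.compress(chunk)
            digest.update(block)   # hash exactly the bytes that reach the file
            out.write(block)
        tail = compressor.flush()  # the compressor holds data back until flushed
        digest.update(tail)
        out.write(tail)

    # Read the size only after the file is closed. Before that, buffered
    # bytes may not be on disk yet and the reported size would be too small.
    compressed_size = os.path.getsize(output_path)

    # Assumed sidecar layout: the hex digest alone, next to the output file.
    sidecar = output_path.with_name(output_path.name + ".sha256")
    sidecar.write_text(digest.hexdigest() + "\n", encoding="ascii")

    ratio = raw_size / compressed_size if compressed_size else 0.0
    return {
        "raw_size": raw_size,
        "compressed_size": compressed_size,
        "ratio": ratio,
        "sha256": digest.hexdigest(),
    }
```

Hashing inside the write loop avoids a second pass over the output file, which matters for large exports.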

### Verify Flow (--verify)
1. Open output file
2. Decompress zstd → protobuf stream
3. Deserialize protobuf → IDataReader
4. Loop through all rows, count them (streaming)
5. Extract schema (column names + types)
6. Print: ✓ row count, schema

### Verify-Full Flow (--verify-full)
1. Open output file
2. Decompress zstd → stream protobuf data
3. While streaming: count rows, extract schema, compute SHA256 incrementally
4. Compare computed SHA256 to stored sidecar file
5. Print: ✓ row count, schema, checksum match/mismatch
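
The checksum comparison in step 4 can be sketched as follows. This is illustrative Python rather than the tool's .NET code, with hypothetical function names. It assumes the sidecar is named `<output>.sha256`, holds a hex digest as its first token, and covers the bytes of the output file as stored on disk; the design does not specify any of these. The flow above hashes in the same pass as row counting, whereas the sketch does it separately for clarity.

```python
import hashlib
from pathlib import Path


def file_sha256(path, chunk_size=1024 * 1024):
    """Hash a file in fixed-size blocks so memory use stays constant."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for block in iter(lambda: handle.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()


def checksum_matches(output_path):
    """Compare the file's SHA256 with the digest stored in its sidecar."""
    output_path = Path(output_path)
    sidecar = output_path.with_name(output_path.name + ".sha256")
    tokens = sidecar.read_text(encoding="ascii").split()
    if not tokens:
        return False  # an empty sidecar cannot match anything
    # Taking the first token accepts both a bare digest and the
    # "<hex>  <filename>" layout that sha256sum produces.
    return file_sha256(output_path) == tokens[0].lower()
```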

## Project Structure

```
Tools/DbExporter/
├── DbExporter.csproj
├── Program.cs              # CLI entry point, argument parsing
├── ExportDefinition.cs     # JSON model for definition file
├── DatabaseExporter.cs     # Core export logic
└── Verifier.cs             # Verify and verify-full logic
```

## Dependencies

| Package | Purpose |
|---------|---------|
| `protobuf-net-data` | Serialize IDataReader to protobuf |
| `ZstdSharp.Port` | Zstd compression |
| `Microsoft.Data.SqlClient` | SQL Server connectivity |
| `Oracle.ManagedDataAccess.Core` | Oracle connectivity |
| `System.Text.Json` | Parse definition files |

**Target Framework:** `net10.0`

## Testing with ScadaBridge

**Connection:**
```
Server=10.100.0.35;Database=ScadaBridge_Test;User Id=sa;Password=ScadaBridge2024;TrustServerCertificate=true;
```

Definition files will be created in `Tools/DbExporter/definitions/` for ScadaBridge tables.

**Test approach:**
1. Build the tool
2. Run export for each definition file
3. Run `--verify` to confirm row counts and schemas
4. Run `--verify-full` on at least one to confirm checksum works
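
Steps 2 and 3 can be scripted. The sketch below is a hypothetical Python helper, not part of the tool. It assumes it is run from the project directory so that `dotnet run -- <file>` works as in the CLI examples, that definition files end in `.json`, and that the tool exits with a non-zero code on failure.

```python
import subprocess
from pathlib import Path


def commands_for(definitions_dir, flag=None):
    """Return one `dotnet run` argument list per *.json definition file."""
    commands = []
    for path in sorted(Path(definitions_dir).glob("*.json")):
        command = ["dotnet", "run", "--", str(path)]
        if flag:
            command.append(flag)  # e.g. "--verify" or "--verify-full"
        commands.append(command)
    return commands


def run_all(definitions_dir, flag=None):
    """Run each command in turn, stopping at the first non-zero exit code."""
    for command in commands_for(definitions_dir, flag):
        subprocess.run(command, check=True)
```

Calling `run_all("./definitions")` followed by `run_all("./definitions", "--verify")` would cover steps 2 and 3.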