# DbExporter Tool Design

## Purpose

A command-line tool that queries databases (SQL Server or Oracle) and exports results to compressed protobuf files using `protobuf-net-data` and zstd compression.

## CLI Interface

```
Usage: DbExporter [options]

Arguments:
  definition-file    Path to JSON definition file

Options:
  --verify           Verify output (row count + schema)
  --verify-full      Verify output with SHA256 checksum
  --help             Show help
```

**Examples:**

```bash
# Export data
dotnet run -- ./definitions/scada-clients.json

# Export and verify
dotnet run -- ./definitions/scada-clients.json --verify

# Full verification with checksum
dotnet run -- ./definitions/scada-clients.json --verify-full
```

## Definition File Format (JSON)

```json
{
  "providerType": "SqlServer",
  "connectionString": "Server=...;Database=...;User Id=...;Password=...;",
  "query": "SELECT * FROM MyTable",
  "outputPath": "./output/mytable.pb.zstd",
  "compressionLevel": 10
}
```

| Field | Required | Default | Description |
|-------|----------|---------|-------------|
| `providerType` | Yes | - | `"SqlServer"` or `"Oracle"` |
| `connectionString` | Yes | - | ADO.NET connection string |
| `query` | Yes | - | SQL query to execute |
| `outputPath` | Yes | - | Output file path (.pb.zstd) |
| `compressionLevel` | No | `10` | Zstd level 1-19 (higher = smaller, slower) |

## Core Workflow

### Export Flow

1. Parse definition file (JSON)
2. Validate fields (provider type, connection string, query)
3. Create appropriate DbConnection (SqlConnection or OracleConnection)
4. Execute query → IDataReader
5. Serialize IDataReader → protobuf stream (via protobuf-net-data)
6. Compress protobuf stream → zstd (via ZstdSharp)
7. While writing, compute SHA256 incrementally
8. Write to output file + sidecar .sha256 file
9. Print summary: row count, file size, compression ratio

### Verify Flow (--verify)

1. Open output file
2. Decompress zstd → protobuf stream
3. Deserialize protobuf → IDataReader
4. Loop through all rows, count them (streaming)
5. Extract schema (column names + types)
6. Print: ✓ row count, schema

### Verify-Full Flow (--verify-full)

1. Open output file
2. Decompress zstd → stream protobuf data
3. While streaming: count rows, extract schema, compute SHA256 incrementally
4. Compare computed SHA256 to stored sidecar file
5. Print: ✓ row count, schema, checksum match/mismatch

## Project Structure

```
Tools/DbExporter/
├── DbExporter.csproj
├── Program.cs              # CLI entry point, argument parsing
├── ExportDefinition.cs     # JSON model for definition file
├── DatabaseExporter.cs     # Core export logic
└── Verifier.cs             # Verify and verify-full logic
```

## Dependencies

| Package | Purpose |
|---------|---------|
| `protobuf-net-data` | Serialize IDataReader to protobuf |
| `ZstdSharp.Port` | Zstd compression |
| `Microsoft.Data.SqlClient` | SQL Server connectivity |
| `Oracle.ManagedDataAccess.Core` | Oracle connectivity |
| `System.Text.Json` | Parse definition files |

**Target Framework:** `net10.0`

## Testing with ScadaBridge

**Connection:**

```
Server=10.100.0.35;Database=ScadaBridge_Test;User Id=sa;Password=ScadaBridge2024;TrustServerCertificate=true;
```

Definition files will be created in `Tools/DbExporter/definitions/` for ScadaBridge tables.

**Test approach:**

1. Build the tool
2. Run export for each definition file
3. Run `--verify` to confirm row counts and schemas
4. Run `--verify-full` on at least one to confirm checksum works
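
The test steps above could be scripted roughly as follows. This is only a sketch, not part of the tool: the `run_all` function and the `DBEXPORTER` override variable are hypothetical names introduced here. It assumes the script is run from `Tools/DbExporter/`, that the tool is invoked as in the Examples section, and that the tool exits with a non-zero status on failure.

```bash
# Hypothetical helper (not part of DbExporter): walks the test steps for
# every definition file in a directory (default: ./definitions).
run_all() {
  local dir="${1:-./definitions}"
  # DBEXPORTER is an assumed override so the loop can be dry-run with `echo`.
  local cmd="${DBEXPORTER:-dotnet run --}"
  local first="" def

  for def in "$dir"/*.json; do
    # An unmatched glob stays literal, so bail out if no file exists.
    [ -e "$def" ] || { echo "no definition files in $dir" >&2; return 1; }
    [ -n "$first" ] || first="$def"

    # $cmd is intentionally unquoted so "dotnet run --" splits into words.
    $cmd "$def" || return 1            # step 2: export
    $cmd "$def" --verify || return 1   # step 3: row count + schema
  done

  # Step 4: checksum verification on at least one file (here, the first).
  $cmd "$first" --verify-full
}
```

Calling `run_all` with no arguments processes `./definitions/*.json`, and `DBEXPORTER=echo run_all` prints the commands instead of running them. Because `dotnet run` builds the project before executing it, step 1 does not need a separate command in this sketch.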