# JdeScoping.DataSync.Dev

Development-only ETL tooling for loading cached protobuf data into SQL Server. This project enables developers to work with production-like data locally without connecting to live JDE/CMS systems.

## Purpose

This project provides a way to load pre-cached data snapshots (in protobuf format with zstd compression) into the local SQL Server database. It is intended **only for development and testing** - production data sync uses the `JdeScoping.DataSync` project with live connections to enterprise systems.

## Prerequisites

1. **Cache Directory**: A folder containing protobuf data files (`.pb.zstd` format)
2. **SQL Server Database**: Local SQL Server instance with the JDE Scoping schema
3. **Connection String**: Valid SQL Server connection configured in `appsettings.json`

## Configuration

Pipeline configurations are stored in `Pipelines/dev-pipelines.json`. This file defines:

- **Size categories**: Tables are categorized as small, medium, large, or veryLarge
- **Pipeline definitions**: Source file mappings to destination tables

### Size Categories

| Category | Tables | Parallelization |
|----------|--------|-----------------|
| Small | Branch, OrgHierarchy, WorkCenter, ProfitCenter | Parallel |
| Medium | JdeUser, FunctionCode, Item, RouteMaster, MisData_Curr | Parallel |
| Large | Lot, MisData_Hist, WorkOrder_Curr/Hist, LotUsage_Hist, WorkOrderComponent_Hist | Parallel |
| VeryLarge | WorkOrderStep_*, WorkOrderComponent_Curr, WorkOrderRouting, LotUsage_Curr, WorkOrderTime_* | Sequential |

Very large tables run sequentially at the end to avoid I/O contention.
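The ordering described above (a bounded parallel phase for small, medium, and large tables, followed by a one-at-a-time phase for very large tables) can be sketched as follows. This is a minimal illustration, not the project's implementation: `TwoPhaseScheduler`, its parameters, and the `loadTable` delegate are hypothetical names introduced here, and the real `DevEtlRegistry` may schedule work differently.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper (not part of JdeScoping.DataSync.Dev): shows only the
// two-phase ordering, with the actual table load supplied by the caller.
public static class TwoPhaseScheduler
{
    public static async Task RunAsync(
        IReadOnlyList<string> parallelTables,
        IReadOnlyList<string> sequentialTables,
        Func<string, CancellationToken, Task> loadTable,
        int maxDegreeOfParallelism,
        CancellationToken cancellationToken = default)
    {
        if (maxDegreeOfParallelism < 1)
            throw new ArgumentOutOfRangeException(nameof(maxDegreeOfParallelism));

        // Phase 1: small/medium/large tables, with at most
        // maxDegreeOfParallelism loads in flight at any time.
        using var gate = new SemaphoreSlim(maxDegreeOfParallelism);
        var phaseOne = parallelTables.Select(async table =>
        {
            await gate.WaitAsync(cancellationToken);
            try
            {
                await loadTable(table, cancellationToken);
            }
            finally
            {
                gate.Release();
            }
        }).ToList(); // ToList starts every task eagerly

        await Task.WhenAll(phaseOne);

        // Phase 2: very large tables, strictly one at a time, so that
        // they do not compete with each other for disk I/O.
        foreach (var table in sequentialTables)
        {
            cancellationToken.ThrowIfCancellationRequested();
            await loadTable(table, cancellationToken);
        }
    }
}
```

A caller would pass the small, medium, and large table names as `parallelTables`, the very large ones as `sequentialTables`, and a delegate that performs a single table load. Using a semaphore rather than fixed batches keeps all slots busy even when load times vary widely between tables.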
## Folder Structure

```
JdeScoping.DataSync.Dev/
├── Configuration/     # DTOs for JSON config deserialization
├── Contracts/         # Interface definitions (IDevEtlPipelineFactory)
├── Options/           # Options pattern classes
├── Services/          # Implementation (DevEtlPipelineFactory)
├── Sources/           # IImportSource implementations (ProtobufZstdFileSource)
├── Pipelines/         # JSON configuration files
└── DevEtlRegistry.cs  # Main orchestrator class
```

## Usage

### Basic Usage

```csharp
// Create the registry
var factory = new DevEtlPipelineFactory(options, connectionString, logger);
var registry = new DevEtlRegistry(factory, cacheDirectory, logger);

// List available tables
foreach (var table in registry.GetAvailableTables())
{
    Console.WriteLine(table);
}
```

### Run Single Table

```csharp
var result = await registry.RunAsync("Branch");
if (result.Success)
{
    Console.WriteLine($"Loaded {result.TotalRows} rows in {result.Elapsed}");
}
```

### Run All Tables Sequentially

```csharp
var results = await registry.RunAllAsync(cancellationToken);
```

### Run All Tables with Parallelization

```csharp
// Run small/medium/large tables in parallel (max 4 concurrent)
// Very large tables run sequentially at the end
var results = await registry.RunAllParallelAsync(
    maxDegreeOfParallelism: 4,
    cancellationToken);
```

## Data Flow

1. **Source**: `ProtobufZstdFileSource` reads `.pb.zstd` files using protobuf-net-data
2. **Transform**: Data passes through as `IDataReader` (no transformation)
3. **Destination**: Uses `JdeScoping.DataSync` bulk import/merge destinations

## Dependencies

- **JdeScoping.DataSync**: Core ETL pipeline infrastructure
- **protobuf-net-data**: Protobuf serialization with IDataReader support

## Testing

The project supports unit testing via `InternalsVisibleTo`:

- `JdeScoping.DataSync.Dev.Tests`
- `DynamicProxyGenAssembly2` (for Moq)
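For reference, grants like the two listed above are commonly declared with assembly-level attributes, as in the sketch below. The file name is an assumption, and the project may instead declare them as `<InternalsVisibleTo>` items in its `.csproj`; a strong-named assembly would also need the public key in each attribute.

```csharp
// AssemblyInfo.cs (hypothetical file name; any source file in the project works)
using System.Runtime.CompilerServices;

// Lets the test project reference internal types and members directly.
[assembly: InternalsVisibleTo("JdeScoping.DataSync.Dev.Tests")]

// Lets Castle DynamicProxy, which Moq uses to build mocks at runtime,
// generate proxies for internal interfaces and classes.
[assembly: InternalsVisibleTo("DynamicProxyGenAssembly2")]
```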