JdeScoping.DataSync.Dev
Development-only ETL tooling for loading cached protobuf data into SQL Server. This project enables developers to work with production-like data locally without connecting to live JDE/CMS systems.
Purpose
This project loads pre-cached data snapshots (protobuf format with zstd compression) into the local SQL Server database. It is intended only for development and testing; production data sync uses the JdeScoping.DataSync project with live connections to enterprise systems.
Prerequisites
- Cache Directory: A folder containing protobuf data files (.pb.zstd format)
- SQL Server Database: Local SQL Server instance with the JDE Scoping schema
- Connection String: Valid SQL Server connection configured in appsettings.json
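Before running a load, it can help to confirm that the cache directory actually contains snapshot files. The following is a minimal sketch, not part of this project: CacheCheck and FindSnapshots are hypothetical names, and it assumes snapshots are identified purely by the .pb.zstd suffix.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Hypothetical helper for illustration; not part of JdeScoping.DataSync.Dev.
public static class CacheCheck
{
    // Returns the full paths of snapshot files found directly in the cache directory,
    // sorted by path. Assumes snapshots are recognised by the .pb.zstd suffix.
    public static IReadOnlyList<string> FindSnapshots(string cacheDirectory)
    {
        if (!Directory.Exists(cacheDirectory))
            throw new DirectoryNotFoundException($"Cache directory not found: {cacheDirectory}");

        return Directory.EnumerateFiles(cacheDirectory, "*.pb.zstd")
            .OrderBy(path => path, StringComparer.OrdinalIgnoreCase)
            .ToList();
    }
}
```

An empty result suggests the cache directory path is wrong or the snapshots have not been copied yet.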
Configuration
Pipeline configurations are stored in Pipelines/dev-pipelines.json. This file defines:
- Size categories: Tables are categorized as small, medium, large, or veryLarge
- Pipeline definitions: Source file mappings to destination tables
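To make the two parts of the configuration concrete, here is a rough sketch of how such a file could be bound to DTOs with System.Text.Json. The real schema of dev-pipelines.json is not shown in this README, so the class names and every property name below (SourceFile, DestinationTable, SizeCategory, Pipelines) are assumptions for illustration only.

```csharp
using System.Collections.Generic;
using System.Text.Json;

// Hypothetical DTOs: the actual shape of dev-pipelines.json may differ.
public sealed class DevPipelineDefinition
{
    public string SourceFile { get; set; } = "";        // assumed: a .pb.zstd file in the cache directory
    public string DestinationTable { get; set; } = "";  // assumed: target SQL Server table
    public string SizeCategory { get; set; } = "small"; // small | medium | large | veryLarge
}

public sealed class DevPipelinesConfig
{
    public List<DevPipelineDefinition> Pipelines { get; set; } = new();
}

public static class DevPipelinesConfigLoader
{
    private static readonly JsonSerializerOptions Options = new()
    {
        // Lets camelCase JSON keys bind to PascalCase properties.
        PropertyNameCaseInsensitive = true,
    };

    // Parses the JSON text and fails loudly if the document is the literal null.
    public static DevPipelinesConfig Parse(string json) =>
        JsonSerializer.Deserialize<DevPipelinesConfig>(json, Options)
        ?? throw new JsonException("Pipeline configuration was empty.");
}
```

Under these assumptions, a single entry mapping a Branch snapshot to the Branch table would carry a source file name, the destination table name, and the size category small.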
Size Categories
| Category | Tables | Parallelization |
|---|---|---|
| Small | Branch, OrgHierarchy, WorkCenter, ProfitCenter | Parallel |
| Medium | JdeUser, FunctionCode, Item, RouteMaster, MisData_Curr | Parallel |
| Large | Lot, MisData_Hist, WorkOrder_Curr/Hist, LotUsage_Hist, WorkOrderComponent_Hist | Parallel |
| VeryLarge | WorkOrderStep_, WorkOrderComponent_Curr, WorkOrderRouting, LotUsage_Curr, WorkOrderTime_ | Sequential |
Very large tables run sequentially at the end to avoid I/O contention.
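The scheduling rule above can be sketched as two phases: a bounded-concurrency phase for small, medium, and large tables, followed by a strictly sequential phase for very large ones. This is an illustration under assumptions, not the project's implementation; SizeAwareScheduler, its parameters, and the loadTable delegate are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical scheduler for illustration; not the project's actual API.
public static class SizeAwareScheduler
{
    public static async Task RunAsync(
        IEnumerable<(string Table, string Category)> tables,
        Func<string, CancellationToken, Task> loadTable,
        int maxDegreeOfParallelism,
        CancellationToken cancellationToken = default)
    {
        if (maxDegreeOfParallelism < 1)
            throw new ArgumentOutOfRangeException(nameof(maxDegreeOfParallelism));

        var all = tables.ToList();
        var parallel = all.Where(t => !IsVeryLarge(t.Category)).ToList();
        var sequential = all.Where(t => IsVeryLarge(t.Category)).ToList();

        // Phase 1: small/medium/large tables, at most maxDegreeOfParallelism at a time.
        using var gate = new SemaphoreSlim(maxDegreeOfParallelism);
        var running = parallel.Select(async t =>
        {
            await gate.WaitAsync(cancellationToken);
            try { await loadTable(t.Table, cancellationToken); }
            finally { gate.Release(); }
        }).ToList();
        await Task.WhenAll(running);

        // Phase 2: very large tables, one at a time, only after phase 1 has finished.
        foreach (var t in sequential)
        {
            cancellationToken.ThrowIfCancellationRequested();
            await loadTable(t.Table, cancellationToken);
        }
    }

    private static bool IsVeryLarge(string category) =>
        string.Equals(category, "veryLarge", StringComparison.OrdinalIgnoreCase);
}
```

Deferring the very large tables until the parallel phase has drained means each of them gets the disk and the SQL Server log to itself, which is the I/O-contention argument made above.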
Folder Structure
JdeScoping.DataSync.Dev/
├── Configuration/ # DTOs for JSON config deserialization
├── Contracts/ # Interface definitions (IDevEtlPipelineFactory)
├── Options/ # Options pattern classes
├── Services/ # Implementation (DevEtlPipelineFactory)
├── Sources/ # IImportSource implementations (ProtobufZstdFileSource)
├── Pipelines/ # JSON configuration files
└── DevEtlRegistry.cs # Main orchestrator class
Usage
Basic Usage
// Create the registry
var factory = new DevEtlPipelineFactory(options, connectionString, logger);
var registry = new DevEtlRegistry(factory, cacheDirectory, logger);
// List available tables
foreach (var table in registry.GetAvailableTables())
{
    Console.WriteLine(table);
}
Run Single Table
var result = await registry.RunAsync("Branch");
if (result.Success)
{
    Console.WriteLine($"Loaded {result.TotalRows} rows in {result.Elapsed}");
}
Run All Tables Sequentially
var results = await registry.RunAllAsync(cancellationToken);
Run All Tables with Parallelization
// Run small/medium/large tables in parallel (max 4 concurrent)
// Very large tables run sequentially at the end
var results = await registry.RunAllParallelAsync(
    maxDegreeOfParallelism: 4,
    cancellationToken);
Data Flow
- Source: ProtobufZstdFileSource reads .pb.zstd files using protobuf-net-data
- Transform: Data passes through as IDataReader (no transformation)
- Destination: Uses JdeScoping.DataSync bulk import/merge destinations
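The pass-through shape of this flow can be illustrated with stand-in types from the base class library. The sketch below is not the project's code: ISketchSource, ISketchDestination, DataTableSource, CountingDestination, and PassThroughPipeline are hypothetical, and the real IImportSource and destination contracts may look different. An in-memory DataTable stands in for the decompressed protobuf file, and a row counter stands in for the bulk import.

```csharp
using System;
using System.Data;

// Stand-in contracts for illustration only; not the project's real interfaces.
public interface ISketchSource
{
    IDataReader OpenReader();
}

public interface ISketchDestination
{
    long Write(IDataReader reader);
}

// In-memory stand-in for a file-backed source such as ProtobufZstdFileSource.
public sealed class DataTableSource : ISketchSource
{
    private readonly DataTable _table;
    public DataTableSource(DataTable table) => _table = table;
    public IDataReader OpenReader() => _table.CreateDataReader();
}

// Stand-in for a bulk import destination: it only counts the rows it receives.
public sealed class CountingDestination : ISketchDestination
{
    public long Write(IDataReader reader)
    {
        long rows = 0;
        while (reader.Read()) rows++;
        return rows;
    }
}

public static class PassThroughPipeline
{
    // No transform step: the reader opened by the source goes straight to the destination.
    public static long Run(ISketchSource source, ISketchDestination destination)
    {
        using IDataReader reader = source.OpenReader();
        return destination.Write(reader);
    }
}
```

Because the destination only ever sees an IDataReader, rows can be streamed rather than materialized in memory. A SQL Server destination could, for example, hand the reader to SqlBulkCopy.WriteToServer, which accepts an IDataReader directly.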
Dependencies
- JdeScoping.DataSync: Core ETL pipeline infrastructure
- protobuf-net-data: Protobuf serialization with IDataReader support
Testing
The project supports unit testing via InternalsVisibleTo:
- JdeScoping.DataSync.Dev.Tests
- DynamicProxyGenAssembly2 (for Moq)
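For reference, internals are typically exposed to these two assemblies with assembly-level attributes like the ones below. This is a sketch of the standard mechanism, not a copy of this project's source; the project may declare the same thing in its .csproj instead, and strong-named assemblies would also need the public key in each attribute.

```csharp
using System.Runtime.CompilerServices;

// Lets the test project see internal types and members.
[assembly: InternalsVisibleTo("JdeScoping.DataSync.Dev.Tests")]

// Lets Moq's runtime-generated proxies (via Castle DynamicProxy) mock internal interfaces.
[assembly: InternalsVisibleTo("DynamicProxyGenAssembly2")]
```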