
Joseph Doherty 604bfe919c refactor: address code review findings across all projects

Apply comprehensive fixes from code reviews including:
- Extract shared utilities (SqlFormatHelper, CellValueConverter, DbDestinationBase)
- Add interface abstractions (IAuthenticationService, IDatabaseMigrator, IMisQueryBuilder)
- Implement SecureStore for encrypted secrets storage
- Fix error handling with proper HTTP status codes and logging
- Optimize double enumeration in DevEtlRegistry
- Add DataSync.Dev README for developer onboarding
- Extract filter panel base classes to reduce duplication
- Update code review docs to mark all issues as fixed

2026-01-19 11:05:36 -05:00


JdeScoping.DataSync.Dev

Development-only ETL tooling for loading cached protobuf data into SQL Server. This project enables developers to work with production-like data locally without connecting to live JDE/CMS systems.

Purpose

This project loads pre-cached data snapshots (protobuf format, zstd-compressed) into the local SQL Server database. It is intended only for development and testing; production data sync uses the JdeScoping.DataSync project with live connections to enterprise systems.

Prerequisites

Cache Directory: A folder containing protobuf data files (.pb.zstd format)
SQL Server Database: Local SQL Server instance with the JDE Scoping schema
Connection String: Valid SQL Server connection configured in appsettings.json
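
As a rough illustration of the connection string prerequisite, an appsettings.json fragment might look like the following. The key name "JdeScoping", the server, and the database name are assumptions for a typical local setup; use whatever name the project actually reads.

```json
{
  "ConnectionStrings": {
    "JdeScoping": "Server=localhost;Database=JdeScoping;Trusted_Connection=True;TrustServerCertificate=True"
  }
}
```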

Configuration

Pipeline configurations are stored in Pipelines/dev-pipelines.json. This file defines:

Size categories: Tables are categorized as small, medium, large, or veryLarge
Pipeline definitions: Source file mappings to destination tables
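
The sketch below shows one way such a file could be bound to DTOs with System.Text.Json. It is not the project's actual schema: the type names (SizeCategory, PipelineDefinition, PipelineConfig, PipelineConfigLoader) and the property names are hypothetical, chosen only to mirror the two concepts listed above.

```csharp
using System.Collections.Generic;
using System.Text.Json;
using System.Text.Json.Serialization;

// Hypothetical DTOs: property names and JSON layout are assumptions,
// not the actual schema of Pipelines/dev-pipelines.json.
public enum SizeCategory { Small, Medium, Large, VeryLarge }

public sealed record PipelineDefinition(
    string SourceFile,
    string DestinationTable,
    SizeCategory Size);

public sealed record PipelineConfig(List<PipelineDefinition> Pipelines);

public static class PipelineConfigLoader
{
    private static readonly JsonSerializerOptions Options = new()
    {
        // Accept camelCase JSON property names for PascalCase members.
        PropertyNameCaseInsensitive = true,
        // Bind enum values from strings such as "veryLarge".
        Converters = { new JsonStringEnumConverter() }
    };

    // Parses the JSON text and fails loudly if the document is the literal null.
    public static PipelineConfig Parse(string json) =>
        JsonSerializer.Deserialize<PipelineConfig>(json, Options)
        ?? throw new JsonException("Pipeline configuration was empty.");
}
```

Calling PipelineConfigLoader.Parse with the file's text would yield a list of definitions that can then be grouped by Size before scheduling.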

Size Categories

Category	Tables	Parallelization
Small	Branch, OrgHierarchy, WorkCenter, ProfitCenter	Parallel
Medium	JdeUser, FunctionCode, Item, RouteMaster, MisData_Curr	Parallel
Large	Lot, MisData_Hist, WorkOrder_Curr/Hist, LotUsage_Hist, WorkOrderComponent_Hist	Parallel
VeryLarge	WorkOrderStep_*, WorkOrderComponent_Curr, WorkOrderRouting, LotUsage_Curr, WorkOrderTime_*	Sequential

Very large tables run sequentially at the end to avoid I/O contention.
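
The two-phase schedule described above (a bounded parallel phase followed by a sequential phase) can be sketched as follows. This is an illustration of the general pattern, not the DevEtlRegistry implementation: TwoPhaseScheduler and the loadTable delegate are hypothetical names, and the real code may throttle differently.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical illustration; not the DevEtlRegistry implementation.
public static class TwoPhaseScheduler
{
    // Runs parallelTables with at most maxDegreeOfParallelism loads in flight,
    // then runs sequentialTables one at a time, in the order given.
    // Returns table names in completion order.
    public static async Task<IReadOnlyList<string>> RunAsync(
        IEnumerable<string> parallelTables,
        IEnumerable<string> sequentialTables,
        Func<string, CancellationToken, Task> loadTable,
        int maxDegreeOfParallelism,
        CancellationToken cancellationToken = default)
    {
        var completed = new List<string>();
        var gate = new object();

        using var throttle = new SemaphoreSlim(maxDegreeOfParallelism);

        // ToList() starts every task; the semaphore bounds how many run at once.
        var phaseOne = parallelTables.Select(async table =>
        {
            await throttle.WaitAsync(cancellationToken);
            try
            {
                await loadTable(table, cancellationToken);
                lock (gate) { completed.Add(table); }
            }
            finally
            {
                throttle.Release();
            }
        }).ToList();

        await Task.WhenAll(phaseOne);

        // Phase two: the largest tables run alone, so they do not compete for I/O.
        foreach (var table in sequentialTables)
        {
            await loadTable(table, cancellationToken);
            completed.Add(table);
        }

        return completed;
    }
}
```

A caller would pass the small, medium, and large tables as parallelTables and the very large tables as sequentialTables, with loadTable wrapping whatever actually performs the load.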

Folder Structure

JdeScoping.DataSync.Dev/
├── Configuration/      # DTOs for JSON config deserialization
├── Contracts/          # Interface definitions (IDevEtlPipelineFactory)
├── Options/            # Options pattern classes
├── Services/           # Implementation (DevEtlPipelineFactory)
├── Sources/            # IImportSource implementations (ProtobufZstdFileSource)
├── Pipelines/          # JSON configuration files
└── DevEtlRegistry.cs   # Main orchestrator class

Usage

Basic Usage

// Create the registry
var factory = new DevEtlPipelineFactory(options, connectionString, logger);
var registry = new DevEtlRegistry(factory, cacheDirectory, logger);

// List available tables
foreach (var table in registry.GetAvailableTables())
{
    Console.WriteLine(table);
}

Run Single Table

var result = await registry.RunAsync("Branch");
if (result.Success)
{
    Console.WriteLine($"Loaded {result.TotalRows} rows in {result.Elapsed}");
}

Run All Tables Sequentially

var results = await registry.RunAllAsync(cancellationToken);

Run All Tables with Parallelization

// Run small/medium/large tables in parallel (max 4 concurrent)
// Very large tables run sequentially at the end
var results = await registry.RunAllParallelAsync(
    maxDegreeOfParallelism: 4,
    cancellationToken);

Data Flow

Source: ProtobufZstdFileSource reads .pb.zstd files using protobuf-net-data
Transform: Data passes through as IDataReader (no transformation)
Destination: Uses JdeScoping.DataSync bulk import/merge destinations
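
To make the pass-through shape concrete, here is a minimal in-memory sketch in which a source hands an IDataReader straight to a destination with no transform step. The interfaces and classes (IRowSource, IRowDestination, InMemorySource, InMemoryDestination, PassThroughPipeline) are hypothetical stand-ins built on System.Data; they are not the project's IImportSource, ProtobufZstdFileSource, or bulk import destinations.

```csharp
using System.Data;

// Hypothetical stand-ins for the real source and destination abstractions.
public interface IRowSource { IDataReader Open(); }
public interface IRowDestination { int Write(IDataReader reader); }

// Stand-in for a file-backed source: exposes an existing DataTable as a reader.
public sealed class InMemorySource : IRowSource
{
    private readonly DataTable _table;
    public InMemorySource(DataTable table) => _table = table;
    public IDataReader Open() => _table.CreateDataReader();
}

// Stand-in for a bulk destination: copies every row from the reader into Target.
public sealed class InMemoryDestination : IRowDestination
{
    public DataTable Target { get; } = new DataTable();

    public int Write(IDataReader reader)
    {
        // Load infers the schema from the reader when Target has no columns.
        Target.Load(reader);
        return Target.Rows.Count;
    }
}

public static class PassThroughPipeline
{
    // No transform step: the reader goes from source to destination unchanged.
    public static int Run(IRowSource source, IRowDestination destination)
    {
        using var reader = source.Open();
        return destination.Write(reader);
    }
}
```

Because both ends only agree on IDataReader, a source that decompresses and decodes a file can replace the in-memory one without the destination changing.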

Dependencies

JdeScoping.DataSync: Core ETL pipeline infrastructure
protobuf-net-data: Protobuf serialization with IDataReader support

Testing

The project exposes its internal types to the following assemblies via InternalsVisibleTo, so they can be unit tested and mocked:

JdeScoping.DataSync.Dev.Tests
DynamicProxyGenAssembly2 (for Moq)
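
For reference, granting that access is usually done with assembly-level attributes like the ones below. The assembly names come from the list above; where the declarations live (a dedicated source file versus project file items) is an assumption. DynamicProxyGenAssembly2 is the dynamic assembly Castle DynamicProxy generates at runtime, which Moq relies on to mock internal types.

```csharp
using System.Runtime.CompilerServices;

// Assumed placement: any source file in JdeScoping.DataSync.Dev, e.g. AssemblyInfo.cs.
[assembly: InternalsVisibleTo("JdeScoping.DataSync.Dev.Tests")]
[assembly: InternalsVisibleTo("DynamicProxyGenAssembly2")]
```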
