# Old ETL Removal Implementation Plan

> **For Claude:** REQUIRED SUB-SKILL: Use superpowers:executing-plans to implement this plan task-by-task.

**Goal:** Remove the legacy ETL implementation and wire orchestration to use the new EtlPipeline with JSON config.

**Architecture:** Three-phase migration: build the new infrastructure first, wire it up, then clean up the old code.

**Tech Stack:** .NET 10, System.Text.Json, EtlPipeline

**Working Directory:** All paths are relative to the `NEW/` folder. Run `cd /Users/dohertj2/Desktop/JdeScopingTool/NEW` before starting.

---

## Phase 1: Build New Infrastructure

### Task 1: Create Pipeline Configuration Models

**Files:**
- Create: `src/JdeScoping.DataSync/Configuration/PipelinesRoot.cs`
- Create: `src/JdeScoping.DataSync/Configuration/PipelineConfig.cs`
- Create: `src/JdeScoping.DataSync/Options/PipelineOptions.cs`

**Step 1: Create PipelinesRoot.cs**

```csharp
namespace JdeScoping.DataSync.Configuration;

public record PipelinesRoot(
    PipelineSettings? Settings,  // Optional - defaults applied if missing
    Dictionary<string, PipelineConfig> Pipelines)
{
    public PipelineSettings EffectiveSettings => Settings ?? new PipelineSettings();
}

public record PipelineSettings(
    string Timezone = "UTC");
```

**Step 2: Create PipelineConfig.cs**

```csharp
namespace JdeScoping.DataSync.Configuration;

public record PipelineConfig(
    SourceConfig Source,
    Dictionary<string, SyncModeConfig> SyncModes,
    List<TransformerConfig>? Transformers,
    DestinationConfig Destination,
    List<string>? PreScripts,
    List<string>? PostScripts);

public record SourceConfig(
    string Connection,
    string Query,
    Dictionary<string, ParameterConfig>? Parameters);

// Value needs a default: optional parameters must follow all required ones.
public record ParameterConfig(
    string Name,
    string? Format,
    string Source = "offset",
    string? Value = null);

public record SyncModeConfig(
    string? MinDtOffset,
    bool PrePurge = false,
    bool ReIndex = false,
    string? UpdateWhen = null,
    DestinationOverride? Destination = null);

public record DestinationOverride(
    string? Type,
    List<string>? MatchColumns,
    List<string>? ExcludeFromUpdate);

public record TransformerConfig(
    string Type,
    List<string>? Columns,
    Dictionary<string, string>? Mappings);

public record DestinationConfig(
    string Table,
    List<string>? MatchColumns,
    List<string>? ExcludeFromUpdate);
```

**Step 3: Create PipelineOptions.cs**

```csharp
namespace JdeScoping.DataSync.Options;

public class PipelineOptions
{
    public const string SectionName = "Pipelines";

    public string ConfigPath { get; set; } = "Pipelines/pipelines.json";
}
```

**Step 4: Build to verify**

```bash
dotnet build src/JdeScoping.DataSync/JdeScoping.DataSync.csproj
```

**Step 5: Commit**

```bash
git add -A && git commit -m "feat(datasync): add pipeline configuration models"
```

---

### Task 2: Create ParameterFormatConverter

**Files:**
- Create: `src/JdeScoping.DataSync/Services/ParameterFormatConverter.cs`
- Create: `tests/JdeScoping.DataSync.Tests/Services/ParameterFormatConverterTests.cs`

**Step 1: Create ParameterFormatConverter.cs**

```csharp
namespace JdeScoping.DataSync.Services;

public class ParameterFormatConverter
{
    private readonly TimeZoneInfo _timezone;

    public ParameterFormatConverter(string timezone)
    {
        // "UTC" and "LOCAL" are shortcuts; anything else is a system time zone ID.
        _timezone = timezone.ToUpperInvariant() switch
        {
            "UTC" => TimeZoneInfo.Utc,
            "LOCAL" => TimeZoneInfo.Local,
            _ => TimeZoneInfo.FindSystemTimeZoneById(timezone)
        };
    }

    public object Convert(DateTime value, string? format)
    {
        var adjusted = TimeZoneInfo.ConvertTime(value, _timezone);

        return format?.ToLowerInvariant() switch
        {
            "jdejulian" => ToJdeJulianDate(adjusted),
            "jdetime" => ToJdeTime(adjusted),
            null => adjusted,
            _ => throw new ArgumentException($"Unknown format: {format}")
        };
    }

    // CYYDDD: century flag (0 = 19xx, 1 = 20xx), two-digit year, day of year.
    public static int ToJdeJulianDate(DateTime date)
    {
        int century = date.Year >= 2000 ? 1 : 0;
        int year = date.Year % 100;
        int dayOfYear = date.DayOfYear;
        return century * 100000 + year * 1000 + dayOfYear;
    }

    // HHMMSS as an integer.
    public static int ToJdeTime(DateTime time)
    {
        return time.Hour * 10000 + time.Minute * 100 + time.Second;
    }
}
```

**Step 2: Create tests**

```csharp
using JdeScoping.DataSync.Services;
using Shouldly;
using Xunit;

namespace JdeScoping.DataSync.Tests.Services;

public class ParameterFormatConverterTests
{
    [Fact]
    public void ToJdeJulianDate_Year2024Day100_Returns124100()
    {
        var date = new DateTime(2024, 4, 9); // Day 100
        var result = ParameterFormatConverter.ToJdeJulianDate(date);
        result.ShouldBe(124100);
    }

    [Fact]
    public void ToJdeJulianDate_Year1999Day365_Returns99365()
    {
        var date = new DateTime(1999, 12, 31);
        var result = ParameterFormatConverter.ToJdeJulianDate(date);
        result.ShouldBe(99365);
    }

    [Fact]
    public void ToJdeTime_143025_Returns143025()
    {
        var time = new DateTime(2024, 1, 1, 14, 30, 25);
        var result = ParameterFormatConverter.ToJdeTime(time);
        result.ShouldBe(143025);
    }

    [Fact]
    public void Convert_WithUtcTimezone_UsesUtc()
    {
        var converter = new ParameterFormatConverter("UTC");
        var utcTime = DateTime.SpecifyKind(new DateTime(2024, 4, 9, 12, 0, 0), DateTimeKind.Utc);
        var result = converter.Convert(utcTime, "jdeJulian");
        result.ShouldBe(124100);
    }
}
```

**Step 3: Run tests**

```bash
dotnet test tests/JdeScoping.DataSync.Tests --filter "ParameterFormatConverterTests"
```

**Step 4: Commit**

```bash
git add -A && git commit -m "feat(datasync): add ParameterFormatConverter with JDE date/time support"
```

---

### Task 3: Extend DbQuerySource for Multiple Connections

**Files:**
- Modify: `src/JdeScoping.DataSync/Etl/Sources/DbQuerySource.cs`
- Modify: `tests/JdeScoping.DataSync.Tests/Etl/Sources/DbQuerySourceTests.cs`

**Note:** DbQuerySource already exists but only supports LotFinder. Extend it to support JDE and CMS connections.

**Step 1: Update DbQuerySource.cs**

```csharp
using System.Data;
using System.Data.Common;
using JdeScoping.DataAccess.Interfaces;
using JdeScoping.DataSync.Etl.Contracts;

namespace JdeScoping.DataSync.Etl.Sources;

public class DbQuerySource : IImportSource
{
    private readonly IDbConnectionFactory _connectionFactory;
    private readonly string _connectionType;
    private readonly string _query;
    private readonly Dictionary<string, object?> _parameters;
    private DbConnection? _connection;

    public string SourceName => $"DbQuery:{_connectionType}";

    public DbQuerySource(
        IDbConnectionFactory connectionFactory,
        string connectionType,
        string query,
        Dictionary<string, object?>? parameters = null)
    {
        _connectionFactory = connectionFactory ?? throw new ArgumentNullException(nameof(connectionFactory));
        _connectionType = connectionType?.ToLowerInvariant() ?? throw new ArgumentNullException(nameof(connectionType));
        _query = query ?? throw new ArgumentNullException(nameof(query));
        _parameters = parameters ?? new Dictionary<string, object?>();

        // Fail fast on unsupported connection types.
        if (_connectionType is not ("jde" or "cms" or "lotfinder"))
            throw new ArgumentException($"Unknown connection type: {connectionType}");
    }

    public async Task<IDataReader> ReadDataAsync(CancellationToken cancellationToken = default)
    {
        _connection = _connectionType switch
        {
            "jde" => await _connectionFactory.CreateJdeConnectionAsync(),
            "cms" => await _connectionFactory.CreateCmsConnectionAsync(),
            "lotfinder" => await _connectionFactory.CreateLotFinderConnectionAsync(),
            _ => throw new InvalidOperationException($"Unknown connection type: {_connectionType}")
        };

        var command = _connection.CreateCommand();
        command.CommandText = _query;

        foreach (var (name, value) in _parameters)
        {
            var param = command.CreateParameter();
            param.ParameterName = name;
            param.Value = value ?? DBNull.Value; // ADO.NET expects DBNull, not null
            command.Parameters.Add(param);
        }

        // CloseConnection ties the connection lifetime to the returned reader.
        return await command.ExecuteReaderAsync(CommandBehavior.CloseConnection, cancellationToken);
    }

    public async ValueTask DisposeAsync()
    {
        if (_connection != null)
        {
            await _connection.DisposeAsync();
            _connection = null;
        }
    }
}
```

**Step 2: Create basic tests**

```csharp
using JdeScoping.DataAccess.Interfaces;
using JdeScoping.DataSync.Etl.Sources;
using NSubstitute;
using Shouldly;
using Xunit;

namespace JdeScoping.DataSync.Tests.Etl.Sources;

public class DbQuerySourceTests
{
    [Theory]
    [InlineData("jde")]
    [InlineData("cms")]
    [InlineData("lotfinder")]
    public void Constructor_ValidConnectionType_Succeeds(string connectionType)
    {
        var factory = Substitute.For<IDbConnectionFactory>();
        var source = new DbQuerySource(factory, connectionType, "SELECT 1");
        source.SourceName.ShouldBe($"DbQuery:{connectionType}");
    }

    [Fact]
    public void Constructor_InvalidConnectionType_Throws()
    {
        var factory = Substitute.For<IDbConnectionFactory>();
        Should.Throw<ArgumentException>(() => new DbQuerySource(factory, "invalid", "SELECT 1"));
    }

    [Fact]
    public void Constructor_NullQuery_Throws()
    {
        var factory = Substitute.For<IDbConnectionFactory>();
        Should.Throw<ArgumentNullException>(() => new DbQuerySource(factory, "jde", null!));
    }
}
```

**Step 3: Run tests**

```bash
dotnet test tests/JdeScoping.DataSync.Tests --filter "DbQuerySourceTests"
```

**Step 4: Commit**

```bash
git add -A && git commit -m "feat(datasync): add generic DbQuerySource for JDE/CMS/LotFinder"
```

---

### Task 4: Extend DbBulkMergeDestination

**Files:**
- Modify: `src/JdeScoping.DataSync/Etl/Destinations/DbBulkMergeDestination.cs`
- Create/Modify: `tests/JdeScoping.DataSync.Tests/Etl/Destinations/DbBulkMergeDestinationTests.cs`

**Step 1: Add excludeFromUpdate and updateCondition parameters**

Add to constructor:

```csharp
public DbBulkMergeDestination(
    IDbConnectionFactory connectionFactory,
    string tableName,
    string[] matchColumns,
    string[]? excludeFromUpdate = null,
    string? updateCondition = null)
```

**Step 2: Modify MERGE SQL generation to use new parameters**

Update the WHEN MATCHED clause so that it applies `updateCondition` (when set) and omits the `excludeFromUpdate` columns from the UPDATE SET list.

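The WHEN MATCHED change in Step 2 could be sketched roughly as follows. This is a minimal illustration, not the existing `DbBulkMergeDestination` code: the `MergeClauseSketch` class, its `BuildWhenMatchedClause` method, the `t`/`s` aliases for target and source, and the sample column names are all hypothetical, and it assumes SQL Server-style `MERGE` syntax with the full column list supplied by the caller.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Usage: CreatedDt is preserved on update; rows update only when the source is newer.
var clause = MergeClauseSketch.BuildWhenMatchedClause(
    new[] { "Id", "Name", "CreatedDt", "UpdatedDt" },
    new[] { "Id" },
    new[] { "CreatedDt" },
    "s.UpdatedDt > t.UpdatedDt");
Console.WriteLine(clause);
// prints: WHEN MATCHED AND (s.UpdatedDt > t.UpdatedDt) THEN UPDATE SET t.[Name] = s.[Name], t.[UpdatedDt] = s.[UpdatedDt]

// Hypothetical helper (assumption): not part of the real DbBulkMergeDestination API.
public static class MergeClauseSketch
{
    public static string BuildWhenMatchedClause(
        IEnumerable<string> allColumns,
        IEnumerable<string> matchColumns,
        IEnumerable<string>? excludeFromUpdate = null,
        string? updateCondition = null)
    {
        // Match columns identify the row and excluded columns are preserved,
        // so neither group appears in the SET list.
        var skipped = new HashSet<string>(matchColumns, StringComparer.OrdinalIgnoreCase);
        skipped.UnionWith(excludeFromUpdate ?? Enumerable.Empty<string>());

        var assignments = allColumns
            .Where(c => !skipped.Contains(c))
            .Select(c => $"t.[{c}] = s.[{c}]")
            .ToList();

        // With nothing left to update, emit no WHEN MATCHED clause at all.
        if (assignments.Count == 0)
            return string.Empty;

        // The optional condition narrows which matched rows are updated.
        var condition = string.IsNullOrWhiteSpace(updateCondition)
            ? string.Empty
            : $" AND ({updateCondition})";

        return $"WHEN MATCHED{condition} THEN UPDATE SET {string.Join(", ", assignments)}";
    }
}
```

Returning an empty string when every column is skipped is one possible design choice; the real destination might instead treat that configuration as an error during validation.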
**Step 3: Add tests for new functionality**

**Step 4: Run tests**

```bash
dotnet test tests/JdeScoping.DataSync.Tests --filter "DbBulkMergeDestinationTests"
```

**Step 5: Commit**

```bash
git add -A && git commit -m "feat(datasync): extend DbBulkMergeDestination with excludeFromUpdate and updateCondition"
```

---

### Task 5: Create IEtlPipelineFactory and Contracts

**Files:**
- Create: `src/JdeScoping.DataSync/Contracts/IEtlPipelineFactory.cs`
- Create: `src/JdeScoping.DataSync/Contracts/SyncMode.cs`

**Step 1: Create IEtlPipelineFactory.cs**

```csharp
namespace JdeScoping.DataSync.Contracts;

public interface IEtlPipelineFactory
{
    IEtlPipelineBuilder ForTable(string tableName);
}

public interface IEtlPipelineBuilder
{
    IEtlPipelineBuilder WithMode(SyncMode mode);
    IEtlPipelineBuilder WithMinimumDate(DateTime? minDt);
    EtlPipeline Build();
}
```

**Step 2: Create SyncMode.cs**

```csharp
namespace JdeScoping.DataSync.Contracts;

public enum SyncMode
{
    Mass,
    Incremental
}
```

**Step 3: Build to verify**

```bash
dotnet build src/JdeScoping.DataSync/JdeScoping.DataSync.csproj
```

**Step 4: Commit**

```bash
git add -A && git commit -m "feat(datasync): add IEtlPipelineFactory and SyncMode contracts"
```

---

### Task 6: Create EtlPipelineFactory

**Files:**
- Create: `src/JdeScoping.DataSync/Services/EtlPipelineFactory.cs`
- Create: `tests/JdeScoping.DataSync.Tests/Services/EtlPipelineFactoryTests.cs`

**Step 1: Create EtlPipelineFactory.cs**

Implement the factory with:
- Config loading with validation
- PipelineBuilder inner class
- Source/destination/transformer creation methods

**Step 2: Add tests for config loading and validation**

**Step 3: Run tests**

```bash
dotnet test tests/JdeScoping.DataSync.Tests --filter "EtlPipelineFactoryTests"
```

**Step 4: Commit**

```bash
git add -A && git commit -m "feat(datasync): add EtlPipelineFactory with JSON config support"
```

---

### Task 7: Create pipelines.json Config File

**Files:**
- Create: `src/JdeScoping.DataSync/Pipelines/pipelines.json`
- Modify: `src/JdeScoping.DataSync/JdeScoping.DataSync.csproj`

**Step 1: Extract data from existing merge configurations**

Read the existing merge configs and extract, for each table:
- `MatchOn` → `matchColumns`
- `UpdateColumns` / `InsertColumns` → derive `excludeFromUpdate`
- `UpdateWhen` → `updateCondition`

Files to reference:
- `src/JdeScoping.DataSync/Configuration/MergeConfigurations/*.cs`
- `src/JdeScoping.DataSync/Fetchers/Jde/*.cs` (for queries)
- `src/JdeScoping.DataSync/Fetchers/Cms/*.cs` (for CMS query)

**Step 2: Create Pipelines directory and pipelines.json**

Create config for all 9 tables:
- WorkOrder_Curr
- Lot
- LotUsage
- Item
- WorkCenter
- ProfitCenter
- JdeUser
- Branch
- MisData

**Important:** For MisData, add the post-processing SQL as a postScript:

```json
"postScripts": [
  "UPDATE MisData SET ProcessedFlag = 1 WHERE ProcessedFlag IS NULL"
]
```

This replaces the MisDataPostProcessor class.

**Step 3: Add Content item to csproj**

```xml
<ItemGroup>
  <Content Include="Pipelines/pipelines.json">
    <CopyToOutputDirectory>PreserveNewest</CopyToOutputDirectory>
  </Content>
</ItemGroup>
```

**Step 4: Build to verify config copies**

```bash
dotnet build src/JdeScoping.DataSync/JdeScoping.DataSync.csproj
ls src/JdeScoping.DataSync/bin/Debug/net10.0/Pipelines/
```

**Step 5: Commit**

```bash
git add -A && git commit -m "feat(datasync): add pipelines.json config for all sync tables"
```

---

## Phase 2: Wire Up

### Task 8: Update DependencyInjection.cs

**Files:**
- Modify: `src/JdeScoping.DataSync/DependencyInjection.cs`

**Step 1: Add new registrations (alongside old for now)**

```csharp
// Add pipeline factory
services.AddOptions<PipelineOptions>()
    .Bind(configuration.GetSection(PipelineOptions.SectionName));
services.AddSingleton<IEtlPipelineFactory, EtlPipelineFactory>();
```

**Step 2: Build to verify**

```bash
dotnet build src/JdeScoping.DataSync/JdeScoping.DataSync.csproj
```

**Step 3: Commit**

```bash
git add -A && git commit -m "feat(datasync): register EtlPipelineFactory in DI"
```

---

### Task 9: Update TableSyncOperation

**Files:**
- Modify: `src/JdeScoping.DataSync/Services/TableSyncOperation.cs`

**Step 1: Inject IEtlPipelineFactory**

**Step 2: Replace old sync logic with pipeline execution**

```csharp
var pipeline = _pipelineFactory
    .ForTable(config.TableName)
    .WithMode(updateTask.IsMassUpdate ? SyncMode.Mass : SyncMode.Incremental)
    .WithMinimumDate(updateTask.MinimumDt)
    .Build();

var result = await pipeline.ExecuteAsync(cancellationToken);

if (!result.Success)
    throw new InvalidOperationException($"Pipeline failed for {config.TableName}: {result.ErrorMessage}");

// Important: Pass row count to DataUpdateRepository for metrics
var recordCount = result.TotalRows;  // Use this for DataUpdate record
```

**Step 3: Build to verify**

```bash
dotnet build src/JdeScoping.DataSync/JdeScoping.DataSync.csproj
```

**Step 4: Commit**

```bash
git add -A && git commit -m "feat(datasync): wire TableSyncOperation to use EtlPipelineFactory"
```

---

## Phase 3: Clean Up

**Important:** Tasks in Phase 3 must be executed in order. DataSourceConfig changes come AFTER test and appsettings updates to avoid broken builds.

### Task 10: Remove Old DI Registrations

**Files:**
- Modify: `src/JdeScoping.DataSync/DependencyInjection.cs`

**Step 1: Remove old registrations**

- Remove `using JdeScoping.DataSync.Generated;`
- Remove all `IDataFetcher` registrations
- Remove all `IMergeConfiguration` registrations
- Remove `IBulkMergeHelper`, `IDataReaderFactory`, `ISchemaValidator`
- Remove `IMergeConfigurationRegistry`
- Remove `IPostProcessor`, `MisDataPostProcessor`
- Remove named fetcher registrations

**Step 2: Build to verify**

```bash
dotnet build src/JdeScoping.DataSync/JdeScoping.DataSync.csproj
```

**Step 3: Commit**

```bash
git add -A && git commit -m "refactor(datasync): remove old ETL DI registrations"
```

---

### Task 11: Delete Old Source Files

**Files to delete:**
- `BulkCopyTypeRegistry.cs`
- `Contracts/IBulkMergeHelper.cs`
- `Contracts/IDataFetcher.cs`
- `Contracts/IDataReaderFactory.cs`
- `Contracts/IMergeConfiguration.cs`
- `Contracts/IMergeConfigurationRegistry.cs`
- `Contracts/IPostProcessor.cs`
- `Contracts/ISchemaValidator.cs`
- `Configuration/MergeConfigurations/` (all 9 files)
- `Exceptions/BulkMergeException.cs`
- `Fetchers/` (all files)
- `Models/ColumnSchema.cs`
- `Models/MergeResult.cs`
- `Services/BulkMergeHelper.cs`
- `Services/ExpressionParser.cs`
- `Services/MergeConfigurationRegistry.cs`
- `Services/MergeSqlBuilder.cs`
- `Services/MisDataPostProcessor.cs`
- `Services/SchemaValidator.cs`

**Step 1: Delete files**

```bash
rm src/JdeScoping.DataSync/BulkCopyTypeRegistry.cs
rm -rf src/JdeScoping.DataSync/Contracts/IBulkMergeHelper.cs
# ... etc
```

**Step 2: Build to verify**

```bash
dotnet build src/JdeScoping.DataSync/JdeScoping.DataSync.csproj
```

**Step 3: Commit**

```bash
git add -A && git commit -m "refactor(datasync): delete old ETL source files"
```

---

### Task 12: Delete Integration Tests Project

**Files:**
- Delete: `tests/JdeScoping.DataSync.IntegrationTests/` (entire project)
- Modify: `JdeScoping.slnx`

**Note:** Must delete integration tests BEFORE removing SourceGenerator, as integration tests reference generated code.

**Step 1: Remove project from solution**

```bash
dotnet sln JdeScoping.slnx remove tests/JdeScoping.DataSync.IntegrationTests/JdeScoping.DataSync.IntegrationTests.csproj
```

**Step 2: Delete project folder**

```bash
rm -rf tests/JdeScoping.DataSync.IntegrationTests
```

**Step 3: Build to verify**

```bash
dotnet build JdeScoping.slnx
```

**Step 4: Commit**

```bash
git add -A && git commit -m "refactor(datasync): remove obsolete integration tests project"
```

---

### Task 13: Delete SourceGenerator Project

**Files:**
- Delete: `src/JdeScoping.DataSync.SourceGenerators/` (entire project)
- Modify: `JdeScoping.slnx`
- Modify: `src/JdeScoping.DataSync/JdeScoping.DataSync.csproj`

**Step 1: Remove project reference from DataSync.csproj**

**Step 2: Remove project from solution**

```bash
dotnet sln JdeScoping.slnx remove src/JdeScoping.DataSync.SourceGenerators/JdeScoping.DataSync.SourceGenerators.csproj
```

**Step 3: Delete project folder**

```bash
rm -rf src/JdeScoping.DataSync.SourceGenerators
```

**Step 4: Build to verify**

```bash
dotnet build JdeScoping.slnx
```

**Step 5: Commit**

```bash
git add -A && git commit -m "refactor(datasync): remove SourceGenerator project"
```

---

### Task 14: Delete Old Unit Test Files

**Files to delete:**
- `tests/JdeScoping.DataSync.Tests/Services/BulkMergeHelperTests.cs`
- `tests/JdeScoping.DataSync.Tests/Services/ExpressionParserTests.cs`
- `tests/JdeScoping.DataSync.Tests/Services/MergeConfigurationRegistryTests.cs`
- `tests/JdeScoping.DataSync.Tests/Services/MergeSqlBuilderTests.cs`
- `tests/JdeScoping.DataSync.Tests/Services/SchemaValidatorTests.cs`
- `tests/JdeScoping.DataSync.Tests/TableSyncOperationTests.cs`

**Step 1: Delete test files**

**Step 2: Run remaining tests**

```bash
dotnet test tests/JdeScoping.DataSync.Tests
```

**Step 3: Commit**

```bash
git add -A && git commit -m "refactor(datasync): delete obsolete test files"
```

---

### Task 15: Update Remaining Tests

**Files:**
- Modify: `tests/JdeScoping.DataSync.Tests/ScheduleCheckerTests.cs`
- Modify: `tests/JdeScoping.DataSync.Tests/SyncOrchestratorTests.cs`

**Step 1: Remove FetcherTypeName references from test fixtures**

**Step 2: Run tests**

```bash
dotnet test tests/JdeScoping.DataSync.Tests
```

**Step 3: Commit**

```bash
git add -A && git commit -m "refactor(datasync): update tests to remove FetcherTypeName"
```

---

### Task 16: Update appsettings Files

**Files:**
- Modify: `src/JdeScoping.Host/appsettings.json`
- Modify: `src/JdeScoping.Host/appsettings.Development.json`

**Step 1: Remove obsolete properties from DataSources config**

- FetcherTypeName
- PostProcessorTypeName
- PrepurgeData
- ReIndexData

**Step 2: Build and run to verify**

```bash
dotnet build src/JdeScoping.Host/JdeScoping.Host.csproj
```

**Step 3: Commit**

```bash
git add -A && git commit -m "refactor(datasync): remove obsolete appsettings properties"
```

---

### Task 17: Update DataSourceConfig

**Files:**
- Modify: `src/JdeScoping.DataSync/Options/DataSourceConfig.cs`

**Note:** This task comes AFTER test and appsettings updates to avoid broken builds.

**Step 1: Remove obsolete properties**

- FetcherTypeName
- PostProcessorTypeName
- PrepurgeData
- ReIndexData

**Step 2: Build to verify**

```bash
dotnet build JdeScoping.slnx
```

**Step 3: Commit**

```bash
git add -A && git commit -m "refactor(datasync): remove obsolete DataSourceConfig properties"
```

---

### Task 18: Final Verification

**Step 1: Full build**

```bash
dotnet build JdeScoping.slnx
```

**Step 2: Run all tests**

```bash
dotnet test JdeScoping.slnx
```

**Step 3: Commit any fixes**

**Step 4: Create summary commit**

```bash
git add -A && git commit -m "feat(datasync): complete migration to JSON-configured ETL pipelines

- Remove legacy Fetchers, MergeConfigurations, BulkMerge services
- Remove SourceGenerator project
- Add EtlPipelineFactory with JSON config
- Add DbQuerySource for JDE/CMS/LotFinder connections
- Extend DbBulkMergeDestination with excludeFromUpdate and updateCondition
- Wire TableSyncOperation to use new pipeline factory
- Update all tests and configuration"
```