# Bulk Merge Helper Implementation Plan

**Date:** 2026-01-01
**Design:** [2026-01-01-bulk-merge-helper-design.md](./2026-01-01-bulk-merge-helper-design.md)
**Status:** Draft - Pending Review

## Prerequisites

- .NET 10 SDK installed
- SQL Server running (Docker container for tests)
- Existing DataSync project compiles

## Phase 1: Source Generator Project Setup

### Task 1.1: Create Source Generator Project

**Location:** `NEW/src/JdeScoping.DataSync.SourceGenerators/`

**Files to create:**

1. `JdeScoping.DataSync.SourceGenerators.csproj`
   ```xml
   netstandard2.0 latest true true
   ```
2. `DataReaderGenerator.cs` - Main incremental source generator

**Verification:** Project compiles with `dotnet build`

### Task 1.2: Add Generator Reference to DataSync

**File:** `NEW/src/JdeScoping.DataSync/JdeScoping.DataSync.csproj`

**Add:**

```xml
```

**Verification:** DataSync project compiles

---

## Phase 2: Core Interfaces and Contracts

### Task 2.1: Create IDataReaderFactory Interface

**File:** `NEW/src/JdeScoping.DataSync/Contracts/IDataReaderFactory.cs`

```csharp
using System.Data;

namespace JdeScoping.DataSync.Contracts;

public interface IDataReaderFactory
{
    IDataReader CreateReader<T>(IAsyncEnumerable<T> source);
    IReadOnlyList<string> GetColumnNames<T>();
}
```

**Verification:** Compiles

### Task 2.2: Create IBulkMergeHelper Interface

**File:** `NEW/src/JdeScoping.DataSync/Contracts/IBulkMergeHelper.cs`

```csharp
using System.Linq.Expressions;
using JdeScoping.DataSync.Models;

namespace JdeScoping.DataSync.Contracts;

public interface IBulkMergeHelper
{
    Task<MergeResult> MergeAsync<T>(
        IAsyncEnumerable<T> data,
        string destinationTable,
        Expression<Func<T, object>> matchOn,
        Expression<Func<T, object>>? updateColumns = null,
        Expression<Func<T, T, bool>>? updateWhen = null,
        Expression<Func<T, object>>? insertColumns = null,
        string? tempTableName = null,
        int batchSize = 0,
        bool validateBeforeCopy = false,
        CancellationToken cancellationToken = default);
}
```

**Verification:** Compiles

### Task 2.3: Create MergeResult Record

**File:** `NEW/src/JdeScoping.DataSync/Models/MergeResult.cs`

```csharp
namespace JdeScoping.DataSync.Models;

public record MergeResult(
    int TotalRowsProcessed,
    int RowsInserted,
    int RowsUpdated,
    int BatchCount,
    TimeSpan Elapsed);
```

**Verification:** Compiles

### Task 2.4: Create Exception Classes

**File:** `NEW/src/JdeScoping.DataSync/Exceptions/BulkMergeException.cs`

```csharp
namespace JdeScoping.DataSync.Exceptions;

public class BulkMergeException : Exception
{
    public string TableName { get; init; } = string.Empty;
    public int BatchNumber { get; init; }
    public int RowsInBatch { get; init; }
    public string? SqlStatement { get; init; }

    // constructors...
}

public class BulkMergeValidationException : BulkMergeException
{
    public IReadOnlyList<ValidationError> Errors { get; init; } = [];
}

public record ValidationError(
    int RowIndex,
    string ColumnName,
    object? Value,
    string Message);
```

**Verification:** Compiles

---

## Phase 3: Type Registry and Generator Implementation

### Task 3.1: Create BulkCopyTypeRegistry

**File:** `NEW/src/JdeScoping.DataSync/BulkCopyTypeRegistry.cs`

```csharp
namespace JdeScoping.DataSync;

public static class BulkCopyTypeRegistry
{
    public static readonly Type[] Types =
    [
        typeof(WorkOrder),
        typeof(Lot),
        typeof(LotUsage),
        typeof(Item),
        typeof(WorkCenter),
        typeof(ProfitCenter),
        typeof(JdeUser),
        typeof(Branch),
        typeof(MisData),
    ];
}
```

**Verification:** Compiles with correct type references

### Task 3.2: Implement DataReaderGenerator

**File:** `NEW/src/JdeScoping.DataSync.SourceGenerators/DataReaderGenerator.cs`

Generator must:

1. Find `BulkCopyTypeRegistry.Types` array in compilation
2. For each type, generate a `{TypeName}DataReader : IDataReader` class
3. Generate `DataReaderFactory` implementation
4. Generate `AddBulkCopyConverters()` extension method

**Key implementation details:**

- Use incremental generator (`IIncrementalGenerator`) for performance
- Handle nullable properties correctly (use `DBNull.Value` for null)
- Skip properties with private setters
- Order columns alphabetically for consistency

**Verification:**

- Generator compiles
- DataSync builds and generated code appears in `obj/Generated/`

### Task 3.3: Write Generator Unit Tests

**File:** `NEW/tests/JdeScoping.DataSync.Tests/SourceGenerators/DataReaderGeneratorTests.cs`

Test scenarios:

- Generates reader for simple type
- Generates factory with all registered types
- Handles nullable properties
- Skips private properties
- Generates correct column ordinal mapping

**Verification:** All generator tests pass

---

## Phase 4: Expression Parsing

### Task 4.1: Create ExpressionParser

**File:** `NEW/src/JdeScoping.DataSync/Services/ExpressionParser.cs`

```csharp
using System.Linq.Expressions;

namespace JdeScoping.DataSync.Services;

// Signatures only; method bodies are written as part of this task.
internal static class ExpressionParser
{
    public static IReadOnlyList<string> GetColumnNames<T>(
        Expression<Func<T, object>> expression);

    public static string BuildUpdateWhenSql<T>(
        Expression<Func<T, T, bool>>? expression,
        string sourceAlias,
        string targetAlias);
}
```

**Handles:**

- Single property: `x => x.Id` → `["Id"]`
- Anonymous type: `x => new { x.A, x.B }` → `["A", "B"]`
- Comparison expressions for `updateWhen`

**Verification:** Compiles

### Task 4.2: Write ExpressionParser Unit Tests

**File:** `NEW/tests/JdeScoping.DataSync.Tests/Services/ExpressionParserTests.cs`

Test scenarios:

- Single property extraction
- Multiple properties via anonymous type
- Nested property access throws helpful error
- Comparison expression SQL generation
- Complex boolean expressions (AND, OR)

**Verification:** All tests pass

---

## Phase 5: SQL Builder

### Task 5.1: Create MergeSqlBuilder

**File:** `NEW/src/JdeScoping.DataSync/Services/MergeSqlBuilder.cs`

```csharp
namespace JdeScoping.DataSync.Services;

// Signatures only; method bodies are written as part of this task.
internal static class MergeSqlBuilder
{
    public static string BuildCreateTempTable(
        string tempTableName,
        string sourceTableName);

    public static string BuildMerge(
        string destinationTable,
        string tempTableName,
        IReadOnlyList<string> matchColumns,
        IReadOnlyList<string> updateColumns,
        string? updateWhenClause,
        IReadOnlyList<string> insertColumns);
}
```

**Verification:** Compiles

### Task 5.2: Write MergeSqlBuilder Unit Tests

**File:** `NEW/tests/JdeScoping.DataSync.Tests/Services/MergeSqlBuilderTests.cs`

Test scenarios:

- Creates temp table with SELECT INTO
- MERGE with single match column
- MERGE with composite key
- MERGE with updateWhen condition
- MERGE with subset of update columns
- MERGE with all columns for insert
- Proper SQL escaping of column names

**Verification:** All tests pass

---

## Phase 6: Schema Validation

### Task 6.1: Create SchemaValidator

**File:** `NEW/src/JdeScoping.DataSync/Services/SchemaValidator.cs`

```csharp
namespace JdeScoping.DataSync.Services;

// Signatures only; method bodies are written as part of this task.
internal sealed class SchemaValidator
{
    public async Task<TableSchema> LoadSchemaAsync(
        SqlConnection connection,
        string tableName);

    public IReadOnlyList<ValidationError> Validate<T>(
        IEnumerable<T> rows,
        TableSchema schema,
        IReadOnlyList<string> columnNames);
}

internal record TableSchema(
    IReadOnlyDictionary<string, ColumnSchema> Columns);

internal record ColumnSchema(
    string Name,
    Type ClrType,
    bool IsNullable,
    int? MaxLength,
    byte? Precision,
    byte? Scale);
```

**Verification:** Compiles

### Task 6.2: Write SchemaValidator Unit Tests

**File:** `NEW/tests/JdeScoping.DataSync.Tests/Services/SchemaValidatorTests.cs`

Test scenarios:

- Detects string exceeding max length
- Detects null in non-nullable column
- Detects decimal precision overflow
- Returns multiple errors for row
- Includes row index in errors

**Verification:** All tests pass

---

## Phase 7: BulkMergeHelper Implementation

### Task 7.1: Implement BulkMergeHelper

**File:** `NEW/src/JdeScoping.DataSync/Services/BulkMergeHelper.cs`

```csharp
namespace JdeScoping.DataSync.Services;

public sealed class BulkMergeHelper : IBulkMergeHelper
{
    private readonly IDataReaderFactory _readerFactory;
    private readonly IDbConnectionFactory _connectionFactory;
    private readonly ILogger<BulkMergeHelper> _logger;
    private readonly DataSyncOptions _options;

    public async Task<MergeResult> MergeAsync<T>(...)
    {
        ...
    }
}
```

**Implementation flow:**

1. Parse expressions
2. Open connection
3. Create temp table
4. Loop: batch → validate? → bulk copy → merge → truncate
5. Finally: drop temp table
6. Return result

**Verification:** Compiles

### Task 7.2: Write BulkMergeHelper Unit Tests

**File:** `NEW/tests/JdeScoping.DataSync.Tests/Services/BulkMergeHelperTests.cs`

Test scenarios (use mocks):

- Calls factory to create reader
- Builds correct SQL from expressions
- Handles empty data source
- Respects batch size
- Wraps SqlException with context
- Invokes validation when flag set
- Drops temp table on failure

**Verification:** All tests pass

---

## Phase 8: DI Registration

### Task 8.1: Update ServiceCollectionExtensions

**File:** `NEW/src/JdeScoping.DataSync/DependencyInjection/ServiceCollectionExtensions.cs`

**Add to existing method:**

```csharp
// Add bulk copy converters (generated)
services.AddBulkCopyConverters();

// Add bulk merge helper
services.AddScoped<IBulkMergeHelper, BulkMergeHelper>();
```

**Verification:** Compiles, DI container builds correctly

---

## Phase 9: Integration Tests

### Task 9.1: Create BulkMergeHelper Integration Tests

**File:** `NEW/tests/JdeScoping.DataSync.IntegrationTests/Services/BulkMergeHelperIntegrationTests.cs`

Test scenarios:

- Inserts new records into an empty table
- Updates existing records
- Conditional update respects updateWhen
- Composite primary key matching works
- Handles 10k+ records
- Temp table cleaned up on success
- Temp table cleaned up on failure

**Verification:** All integration tests pass

### Task 9.2: Create Batching Integration Tests

**File:** `NEW/tests/JdeScoping.DataSync.IntegrationTests/Services/BatchingIntegrationTests.cs`

Test scenarios:

- Processes 50k records in batches of 10k
- Each batch commits independently
- Partial failure leaves earlier batches committed
- Result contains correct batch count

**Verification:** All tests pass

### Task 9.3: Create Validation Integration Tests

**File:** `NEW/tests/JdeScoping.DataSync.IntegrationTests/Services/ValidationIntegrationTests.cs`

Test scenarios:

- Validation catches string truncation
- Validation catches null violation
- Validation error includes row details
- Without validation, the caller gets a SqlException

**Verification:** All tests pass

---

## Phase 10: Migration - Update Existing Code

### Task 10.1: Update TableSyncOperation

**File:** `NEW/src/JdeScoping.DataSync/Services/TableSyncOperation.cs`

**Changes:**

- Inject `IBulkMergeHelper` instead of `IStagingTableManager`
- Replace staging table calls with single `MergeAsync` call
- Update mass update path to use `MergeAsync` with `batchSize: 0`
- Keep post-processor invocation

**Verification:** Compiles

### Task 10.2: Update DataSourceConfig for Expressions

**File:** `NEW/src/JdeScoping.DataSync/Configuration/DataSourceConfig.cs`

**Consider:** How to store/configure match/update expressions per table. Options:

1. Each fetcher returns its merge config
2. Convention: use primary key for match, all columns for update
3. Attribute on model classes (rejected - Core stays clean)

**Recommended:** Convention with optional override in fetcher.
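
A minimal sketch of how the recommended option (convention with optional override in the fetcher) could be shaped. Every name here (`MergeConfig<T>`, `IMergeConfigProvider<T>`, `MergeConfigResolver`, `SampleRow`, `SampleFetcher`) is hypothetical and not defined by this plan or the design document; how the conventional default derives the primary key is deliberately left to the caller.

```csharp
using System;
using System.Linq.Expressions;

// Hypothetical per-table merge settings; a null UpdateColumns means "update all columns".
public sealed record MergeConfig<T>(
    Expression<Func<T, object?>> MatchOn,
    Expression<Func<T, object?>>? UpdateColumns = null,
    Expression<Func<T, T, bool>>? UpdateWhen = null);

// Hypothetical opt-in interface: a fetcher implements it only when it needs to override the convention.
public interface IMergeConfigProvider<T>
{
    MergeConfig<T> GetMergeConfig();
}

public static class MergeConfigResolver
{
    // Returns the fetcher's override if it provides one, otherwise the conventional default.
    public static MergeConfig<T> Resolve<T>(object fetcher, MergeConfig<T> conventionDefault) =>
        fetcher is IMergeConfigProvider<T> provider
            ? provider.GetMergeConfig()
            : conventionDefault;
}

// Illustrative model and fetcher, used only to show the override path.
public sealed record SampleRow(int Id, string Name);

public sealed class SampleFetcher : IMergeConfigProvider<SampleRow>
{
    public MergeConfig<SampleRow> GetMergeConfig() =>
        new(MatchOn: r => r.Id,
            UpdateWhen: (source, target) => source.Name != target.Name);
}
```

With this shape, `TableSyncOperation` would resolve the config once per table and pass `MatchOn`, `UpdateColumns`, and `UpdateWhen` straight through to `MergeAsync`, so fetchers that are happy with the convention need no extra code.
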

**Verification:** Compiles

### Task 10.3: Update TableSyncOperation Tests

**File:** `NEW/tests/JdeScoping.DataSync.Tests/Services/TableSyncOperationTests.cs`

**Changes:**

- Mock `IBulkMergeHelper` instead of `IStagingTableManager`
- Update assertions for new call patterns

**Verification:** All tests pass

---

## Phase 11: Cleanup

### Task 11.1: Remove Old Bulk Merge Code

**Files to delete:**

- `NEW/src/JdeScoping.DataSync/Contracts/IStagingTableManager.cs`
- `NEW/src/JdeScoping.DataSync/Services/StagingTableManager.cs`
- `NEW/tests/JdeScoping.DataSync.Tests/Services/StagingTableManagerTests.cs`
- `NEW/tests/JdeScoping.DataSync.IntegrationTests/Services/StagingTableManagerTests.cs`

**Files to update:**

- `NEW/src/JdeScoping.DataSync/DependencyInjection/ServiceCollectionExtensions.cs`
  - Remove `IStagingTableManager` registration
- `NEW/src/JdeScoping.Data/Repositories/LotFinderRepository.DataSync.cs`
  - Remove unused bulk methods if any

**Verification:**

- Solution compiles
- All tests pass
- No references to deleted types

### Task 11.2: Final Verification

**Commands:**

```bash
dotnet build
dotnet test
```

**Verification:**

- Zero build warnings related to new code
- All tests pass
- Integration tests pass against SQL Server

---

## Phase 12: Codex Review

### Task 12.1: Consult Codex for Gaps

Use Codex MCP to review:

- Generated code efficiency
- Missing edge cases
- Performance considerations for large datasets
- Error handling completeness
- Thread safety concerns

**Verification:** Address any issues found

---

## Summary Checklist

| Phase | Tasks | Status |
|-------|-------|--------|
| 1. Generator Project | 1.1-1.2 | Pending |
| 2. Contracts | 2.1-2.4 | Pending |
| 3. Type Registry & Generator | 3.1-3.3 | Pending |
| 4. Expression Parsing | 4.1-4.2 | Pending |
| 5. SQL Builder | 5.1-5.2 | Pending |
| 6. Schema Validation | 6.1-6.2 | Pending |
| 7. BulkMergeHelper | 7.1-7.2 | Pending |
| 8. DI Registration | 8.1 | Pending |
| 9. Integration Tests | 9.1-9.3 | Pending |
| 10. Migration | 10.1-10.3 | Pending |
| 11. Cleanup | 11.1-11.2 | Pending |
| 12. Codex Review | 12.1 | Pending |

**Estimated total tasks:** 27
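
As a starting point for Task 4.1, the two column-selector forms listed there (single property and anonymous type) could be handled roughly as follows. This is a sketch under assumptions, not the planned implementation: `ColumnNameSketch` and `DemoRow` are hypothetical names, the selector is assumed to be an `Expression<Func<T, object?>>`, and `updateWhen` SQL generation is not covered.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

// Illustrative model only.
public sealed record DemoRow(int Id, string Name);

public static class ColumnNameSketch
{
    public static IReadOnlyList<string> GetColumnNames<T>(Expression<Func<T, object?>> expression)
    {
        var body = Unbox(expression.Body);
        return body switch
        {
            // x => x.Id
            MemberExpression member => new[] { DirectMemberName(member) },
            // x => new { x.A, x.B }: read the source property behind each constructor argument
            NewExpression anon => anon.Arguments.Select(a => DirectMemberName(Unbox(a))).ToArray(),
            _ => throw new ArgumentException(
                $"Unsupported column selector: {expression}", nameof(expression)),
        };
    }

    // Value-type members are boxed to object, which wraps them in a Convert node.
    private static Expression Unbox(Expression e) =>
        e is UnaryExpression { NodeType: ExpressionType.Convert } convert ? convert.Operand : e;

    // Accepts only properties read directly off the lambda parameter, so nested access is rejected.
    private static string DirectMemberName(Expression e) =>
        e is MemberExpression { Expression: ParameterExpression } member
            ? member.Member.Name
            : throw new ArgumentException($"Only direct property access is supported, got: {e}");
}
```

Reading the name from each constructor argument rather than from the anonymous type's member means `x => new { Key = x.Id }` would still map to the `Id` column, which seems the safer default when the names feed into generated SQL.
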