jdescopingtool/PLANS/2026-01-01-bulk-merge-helper-implementation.md
Joseph Doherty 26ff8d9b4f Initial commit: JDE Scoping Tool migration project
Set up repository with legacy .NET Framework 4.8 source (OLD/),
new .NET 10 Blazor solution (NEW/), OpenSpec specifications,
documentation, and project configuration.
2026-01-02 07:43:29 -05:00

Bulk Merge Helper Implementation Plan

Date: 2026-01-01
Design: 2026-01-01-bulk-merge-helper-design.md
Status: Draft - Pending Review

Prerequisites

  • .NET 10 SDK installed
  • SQL Server running (Docker container for tests)
  • Existing DataSync project compiles

Phase 1: Source Generator Project Setup

Task 1.1: Create Source Generator Project

Location: NEW/src/JdeScoping.DataSync.SourceGenerators/

Files to create:

  1. JdeScoping.DataSync.SourceGenerators.csproj
<Project Sdk="Microsoft.NET.Sdk">
  <PropertyGroup>
    <TargetFramework>netstandard2.0</TargetFramework>
    <LangVersion>latest</LangVersion>
    <EnforceExtendedAnalyzerRules>true</EnforceExtendedAnalyzerRules>
    <IsRoslynComponent>true</IsRoslynComponent>
  </PropertyGroup>
  <ItemGroup>
    <PackageReference Include="Microsoft.CodeAnalysis.Analyzers" Version="3.3.4" PrivateAssets="all" />
    <PackageReference Include="Microsoft.CodeAnalysis.CSharp" Version="4.8.0" PrivateAssets="all" />
  </ItemGroup>
</Project>
  2. DataReaderGenerator.cs - Main incremental source generator

Verification: Project compiles with dotnet build

Task 1.2: Add Generator Reference to DataSync

File: NEW/src/JdeScoping.DataSync/JdeScoping.DataSync.csproj

Add:

<ItemGroup>
  <ProjectReference Include="..\JdeScoping.DataSync.SourceGenerators\JdeScoping.DataSync.SourceGenerators.csproj"
                    OutputItemType="Analyzer"
                    ReferenceOutputAssembly="false" />
</ItemGroup>

Verification: DataSync project compiles


Phase 2: Core Interfaces and Contracts

Task 2.1: Create IDataReaderFactory Interface

File: NEW/src/JdeScoping.DataSync/Contracts/IDataReaderFactory.cs

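using System.Data;

namespace JdeScoping.DataSync.Contracts;

public interface IDataReaderFactory
{
    // Wraps the async stream in an IDataReader (generated per registered type).
    IDataReader CreateReader<T>(IAsyncEnumerable<T> source);

    // Column names exposed by the reader for T.
    IReadOnlyList<string> GetColumnNames<T>();
}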

Verification: Compiles

Task 2.2: Create IBulkMergeHelper Interface

File: NEW/src/JdeScoping.DataSync/Contracts/IBulkMergeHelper.cs

using System.Linq.Expressions;
using JdeScoping.DataSync.Models;

namespace JdeScoping.DataSync.Contracts;

public interface IBulkMergeHelper
{
    Task<MergeResult> MergeAsync<T>(
        IAsyncEnumerable<T> data,
        string destinationTable,
        Expression<Func<T, object>> matchOn,
        Expression<Func<T, object>>? updateColumns = null,
        Expression<Func<T, T, bool>>? updateWhen = null,
        Expression<Func<T, object>>? insertColumns = null,
        string? tempTableName = null,
        int batchSize = 0,
        bool validateBeforeCopy = false,
        CancellationToken cancellationToken = default);
}

Verification: Compiles

Task 2.3: Create MergeResult Record

File: NEW/src/JdeScoping.DataSync/Models/MergeResult.cs

namespace JdeScoping.DataSync.Models;

public record MergeResult(
    int TotalRowsProcessed,
    int RowsInserted,
    int RowsUpdated,
    int BatchCount,
    TimeSpan Elapsed);

Verification: Compiles

Task 2.4: Create Exception Classes

File: NEW/src/JdeScoping.DataSync/Exceptions/BulkMergeException.cs

namespace JdeScoping.DataSync.Exceptions;

public class BulkMergeException : Exception
{
    public string TableName { get; init; } = string.Empty;
    public int BatchNumber { get; init; }
    public int RowsInBatch { get; init; }
    public string? SqlStatement { get; init; }

    // constructors...
}

public class BulkMergeValidationException : BulkMergeException
{
    public IReadOnlyList<ValidationError> Errors { get; init; } = [];
}

public record ValidationError(
    int RowIndex,
    string ColumnName,
    object? Value,
    string Message);
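
A possible shape for the elided constructors and for the "wrap with context" behaviour tested in Task 7.2. This is a sketch only: the constructor signature and the `WrapSketch.RunBatch` helper are assumptions, and a general `Exception` stands in for `SqlException` to keep the example free of a SqlClient dependency.

```csharp
#nullable enable
using System;

public class BulkMergeException : Exception
{
    // Assumed constructor; the plan leaves the constructors unspecified.
    public BulkMergeException(string message, Exception? inner = null)
        : base(message, inner) { }

    public string TableName { get; init; } = string.Empty;
    public int BatchNumber { get; init; }
    public int RowsInBatch { get; init; }
    public string? SqlStatement { get; init; }
}

public static class WrapSketch
{
    // Hypothetical helper: runs one batch and attaches context to any failure.
    public static void RunBatch(string table, int batchNumber, int rows, Action work)
    {
        try
        {
            work();
        }
        catch (Exception ex) when (ex is not BulkMergeException)
        {
            throw new BulkMergeException(
                $"Merge into {table} failed on batch {batchNumber} ({rows} rows).", ex)
            {
                TableName = table,
                BatchNumber = batchNumber,
                RowsInBatch = rows,
            };
        }
    }
}
```

The exception filter avoids double-wrapping when an inner step has already thrown a `BulkMergeException`.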

Verification: Compiles


Phase 3: Type Registry and Generator Implementation

Task 3.1: Create BulkCopyTypeRegistry

File: NEW/src/JdeScoping.DataSync/BulkCopyTypeRegistry.cs

namespace JdeScoping.DataSync;

public static class BulkCopyTypeRegistry
{
    public static readonly Type[] Types =
    [
        typeof(WorkOrder),
        typeof(Lot),
        typeof(LotUsage),
        typeof(Item),
        typeof(WorkCenter),
        typeof(ProfitCenter),
        typeof(JdeUser),
        typeof(Branch),
        typeof(MisData),
    ];
}

Verification: Compiles with correct type references

Task 3.2: Implement DataReaderGenerator

File: NEW/src/JdeScoping.DataSync.SourceGenerators/DataReaderGenerator.cs

Generator must:

  1. Find BulkCopyTypeRegistry.Types array in compilation
  2. For each type, generate a {TypeName}DataReader : IDataReader class
  3. Generate DataReaderFactory implementation
  4. Generate AddBulkCopyConverters() extension method

Key implementation details:

  • Use incremental generator (IIncrementalGenerator) for performance
  • Handle nullable properties correctly (use DBNull.Value for null)
  • Skip properties with private setters
  • Order columns alphabetically for consistency
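
To make the nullable and ordering rules concrete, here is a hand-written approximation of what the generator could emit for one type. `SampleLot`, `SampleLotDataReader`, and `AsyncSeq` are hypothetical names used only for illustration, and blocking on `MoveNextAsync` inside `Read()` is one possible way to bridge `IAsyncEnumerable<T>` to the synchronous `IDataReader` contract, not necessarily the design's choice.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Threading.Tasks;

public sealed class SampleLot   // hypothetical model
{
    public string LotNumber { get; set; } = "";
    public DateTime? ExpirationDate { get; set; }
    public decimal Quantity { get; set; }
}

public sealed class SampleLotDataReader : IDataReader
{
    // Columns ordered alphabetically so ordinals are stable.
    private static readonly string[] Names = { "ExpirationDate", "LotNumber", "Quantity" };
    private static readonly Type[] Types = { typeof(DateTime), typeof(string), typeof(decimal) };
    private readonly IAsyncEnumerator<SampleLot> _rows;
    private bool _closed;

    public SampleLotDataReader(IAsyncEnumerable<SampleLot> source) => _rows = source.GetAsyncEnumerator();

    public int FieldCount => Names.Length;
    public bool Read() => _rows.MoveNextAsync().AsTask().GetAwaiter().GetResult();

    public object GetValue(int i)
    {
        var row = _rows.Current;
        switch (i)
        {
            // Null maps to DBNull.Value, as the plan requires.
            case 0: return row.ExpirationDate.HasValue ? row.ExpirationDate.Value : (object)DBNull.Value;
            case 1: return row.LotNumber;
            case 2: return row.Quantity;
            default: throw new IndexOutOfRangeException();
        }
    }

    public string GetName(int i) => Names[i];
    public int GetOrdinal(string name) => Array.IndexOf(Names, name);
    public Type GetFieldType(int i) => Types[i];
    public string GetDataTypeName(int i) => Types[i].Name;
    public bool IsDBNull(int i) => GetValue(i) is DBNull;
    public object this[int i] => GetValue(i);
    public object this[string name] => GetValue(GetOrdinal(name));
    public int GetValues(object[] values)
    {
        int n = Math.Min(values.Length, FieldCount);
        for (int i = 0; i < n; i++) values[i] = GetValue(i);
        return n;
    }

    // Typed getters unbox the value returned by GetValue.
    public bool GetBoolean(int i) => (bool)GetValue(i);
    public byte GetByte(int i) => (byte)GetValue(i);
    public char GetChar(int i) => (char)GetValue(i);
    public DateTime GetDateTime(int i) => (DateTime)GetValue(i);
    public decimal GetDecimal(int i) => (decimal)GetValue(i);
    public double GetDouble(int i) => (double)GetValue(i);
    public float GetFloat(int i) => (float)GetValue(i);
    public Guid GetGuid(int i) => (Guid)GetValue(i);
    public short GetInt16(int i) => (short)GetValue(i);
    public int GetInt32(int i) => (int)GetValue(i);
    public long GetInt64(int i) => (long)GetValue(i);
    public string GetString(int i) => (string)GetValue(i);
    public long GetBytes(int i, long fieldOffset, byte[] buffer, int bufferOffset, int length) => throw new NotSupportedException();
    public long GetChars(int i, long fieldOffset, char[] buffer, int bufferOffset, int length) => throw new NotSupportedException();
    public IDataReader GetData(int i) => throw new NotSupportedException();

    public int Depth => 0;
    public bool IsClosed => _closed;
    public int RecordsAffected => -1;
    public DataTable GetSchemaTable() => throw new NotSupportedException();
    public bool NextResult() => false;
    public void Close() => _closed = true;
    public void Dispose() { Close(); _rows.DisposeAsync().AsTask().GetAwaiter().GetResult(); }
}

public static class AsyncSeq   // hypothetical helper for building a test stream
{
    public static async IAsyncEnumerable<T> From<T>(params T[] items)
    {
        foreach (var item in items) { await Task.Yield(); yield return item; }
    }
}
```

Generating the `switch` per type avoids reflection at read time, which is presumably the motivation for using a source generator here.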

Verification:

  • Generator compiles
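  • DataSync builds and generated code appears in obj/Generated/ (generated sources are not written to disk by default; this requires EmitCompilerGeneratedFiles=true and CompilerGeneratedFilesOutputPath=obj/Generated in the DataSync project)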

Task 3.3: Write Generator Unit Tests

File: NEW/tests/JdeScoping.DataSync.Tests/SourceGenerators/DataReaderGeneratorTests.cs

Test scenarios:

  • Generates reader for simple type
  • Generates factory with all registered types
  • Handles nullable properties
  • Skips private properties
  • Generates correct column ordinal mapping

Verification: All generator tests pass


Phase 4: Expression Parsing

Task 4.1: Create ExpressionParser

File: NEW/src/JdeScoping.DataSync/Services/ExpressionParser.cs

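using System.Linq.Expressions;

namespace JdeScoping.DataSync.Services;

internal static class ExpressionParser
{
    // Returns the column names selected by a matchOn/updateColumns/insertColumns expression.
    public static IReadOnlyList<string> GetColumnNames<T>(
        Expression<Func<T, object>> expression)
        => throw new NotImplementedException(); // stub so the signature compiles

    // Translates the updateWhen predicate into a SQL condition using the given aliases.
    public static string BuildUpdateWhenSql<T>(
        Expression<Func<T, T, bool>>? expression,
        string sourceAlias,
        string targetAlias)
        => throw new NotImplementedException(); // stub so the signature compiles
}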

Handles:

  • Single property: x => x.Id → ["Id"]
  • Anonymous type: x => new { x.A, x.B } → ["A", "B"]
  • Comparison expressions for updateWhen
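
The two column-selector shapes could be handled roughly as follows. This is a minimal sketch, not the planned implementation: `ColumnNameSketch` and `SampleRow` are hypothetical names, and rejecting anything other than direct property access is an assumption based on the "nested property access throws" test scenario.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

public sealed class SampleRow   // hypothetical model
{
    public int Id { get; set; }
    public string Name { get; set; } = "";
    public SampleRow Parent { get; set; } = null!;
}

public static class ColumnNameSketch
{
    public static IReadOnlyList<string> GetColumnNames<T>(Expression<Func<T, object>> expression)
    {
        Expression body = Unwrap(expression.Body);
        switch (body)
        {
            case MemberExpression member:
                // x => x.Id
                return new[] { DirectMember(member) };
            case NewExpression anonymous:
                // x => new { x.A, x.B }
                return anonymous.Arguments.Select(a => DirectMember(Unwrap(a))).ToArray();
            default:
                throw new NotSupportedException($"Unsupported column selector: {expression}");
        }
    }

    // Value-typed members are boxed to object through a Convert node; strip it.
    private static Expression Unwrap(Expression e) =>
        e is UnaryExpression { NodeType: ExpressionType.Convert or ExpressionType.ConvertChecked } u
            ? u.Operand
            : e;

    // Accepts only parameter.Property; nested access such as x.Parent.Name is rejected.
    private static string DirectMember(Expression e)
    {
        if (e is MemberExpression { Expression: ParameterExpression } m)
            return m.Member.Name;
        throw new NotSupportedException($"Only direct property access is supported, got: {e}");
    }
}
```

For example, `ColumnNameSketch.GetColumnNames<SampleRow>(x => new { x.Id, x.Name })` yields `Id` and `Name`, in that order.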

Verification: Compiles

Task 4.2: Write ExpressionParser Unit Tests

File: NEW/tests/JdeScoping.DataSync.Tests/Services/ExpressionParserTests.cs

Test scenarios:

  • Single property extraction
  • Multiple properties via anonymous type
  • Nested property access throws helpful error
  • Comparison expression SQL generation
  • Complex boolean expressions (AND, OR)
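
For the comparison and boolean scenarios, one way to translate an updateWhen predicate is a small recursive walk over the expression tree. The sketch below assumes the first lambda parameter is the source row and the second is the target row; that ordering, the bracket quoting, and the names `UpdateWhenSketch` and `SampleRevision` are illustrative assumptions. Constants and method calls are deliberately not handled.

```csharp
using System;
using System.Collections.Generic;
using System.Linq.Expressions;

public sealed class SampleRevision   // hypothetical model
{
    public int Revision { get; set; }
    public string Status { get; set; } = "";
}

public static class UpdateWhenSketch
{
    private static readonly Dictionary<ExpressionType, string> Operators = new()
    {
        [ExpressionType.Equal] = "=",
        [ExpressionType.NotEqual] = "<>",
        [ExpressionType.GreaterThan] = ">",
        [ExpressionType.GreaterThanOrEqual] = ">=",
        [ExpressionType.LessThan] = "<",
        [ExpressionType.LessThanOrEqual] = "<=",
        [ExpressionType.AndAlso] = "AND",
        [ExpressionType.OrElse] = "OR",
    };

    public static string BuildUpdateWhenSql<T>(
        Expression<Func<T, T, bool>> expression, string sourceAlias, string targetAlias)
    {
        // Assumption: parameter 0 is the source row, parameter 1 the target row.
        var aliases = new Dictionary<ParameterExpression, string>
        {
            [expression.Parameters[0]] = sourceAlias,
            [expression.Parameters[1]] = targetAlias,
        };
        return Visit(expression.Body, aliases);
    }

    private static string Visit(Expression e, Dictionary<ParameterExpression, string> aliases)
    {
        switch (e)
        {
            case UnaryExpression { NodeType: ExpressionType.Convert } convert:
                return Visit(convert.Operand, aliases);
            case MemberExpression { Expression: ParameterExpression p } member:
                return $"{aliases[p]}.[{member.Member.Name}]";
            case BinaryExpression binary when Operators.TryGetValue(binary.NodeType, out var op):
                return $"({Visit(binary.Left, aliases)} {op} {Visit(binary.Right, aliases)})";
            default:
                throw new NotSupportedException($"Unsupported updateWhen expression: {e}");
        }
    }
}
```

With aliases `source` and `target`, the predicate `(s, t) => s.Revision > t.Revision && s.Status != t.Status` becomes `((source.[Revision] > target.[Revision]) AND (source.[Status] <> target.[Status]))`. SQL comparisons involving NULL evaluate to unknown, so nullable columns may need explicit handling beyond this sketch.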

Verification: All tests pass


Phase 5: SQL Builder

Task 5.1: Create MergeSqlBuilder

File: NEW/src/JdeScoping.DataSync/Services/MergeSqlBuilder.cs

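namespace JdeScoping.DataSync.Services;

internal static class MergeSqlBuilder
{
    // Builds the statement that creates an empty temp table shaped like the source table.
    public static string BuildCreateTempTable(
        string tempTableName,
        string sourceTableName)
        => throw new NotImplementedException(); // stub so the signature compiles

    // Builds the MERGE statement from the temp table into the destination table.
    public static string BuildMerge(
        string destinationTable,
        string tempTableName,
        IReadOnlyList<string> matchColumns,
        IReadOnlyList<string> updateColumns,
        string? updateWhenClause,
        IReadOnlyList<string> insertColumns)
        => throw new NotImplementedException(); // stub so the signature compiles
}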

Verification: Compiles

Task 5.2: Write MergeSqlBuilder Unit Tests

File: NEW/tests/JdeScoping.DataSync.Tests/Services/MergeSqlBuilderTests.cs

Test scenarios:

  • Creates temp table with SELECT INTO
  • MERGE with single match column
  • MERGE with composite key
  • MERGE with updateWhen condition
  • MERGE with subset of update columns
  • MERGE with all columns for insert
  • Proper SQL escaping of column names
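
The scenarios above suggest output along these lines. This is a sketch under assumptions: the `t`/`s` aliases, bracket quoting with `]` doubled, splitting schema-qualified names on `.`, and the `SELECT TOP (0) ... INTO` form for the temp table are illustrative choices, and `MergeSqlSketch` is a hypothetical name.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq;

public static class MergeSqlSketch
{
    // Bracket-quotes an identifier, doubling any closing bracket.
    public static string Quote(string identifier) => "[" + identifier.Replace("]", "]]") + "]";

    // Quotes each part of a possibly schema-qualified name such as dbo.Lot.
    public static string QuoteTable(string name) => string.Join(".", name.Split('.').Select(Quote));

    // Copies the column layout of the source table into an empty temp table.
    public static string BuildCreateTempTable(string tempTableName, string sourceTableName) =>
        $"SELECT TOP (0) * INTO {Quote(tempTableName)} FROM {QuoteTable(sourceTableName)};";

    public static string BuildMerge(
        string destinationTable,
        string tempTableName,
        IReadOnlyList<string> matchColumns,
        IReadOnlyList<string> updateColumns,
        string? updateWhenClause,
        IReadOnlyList<string> insertColumns)
    {
        string on = string.Join(" AND ", matchColumns.Select(c => $"t.{Quote(c)} = s.{Quote(c)}"));
        string set = string.Join(", ", updateColumns.Select(c => $"t.{Quote(c)} = s.{Quote(c)}"));
        string columns = string.Join(", ", insertColumns.Select(Quote));
        string values = string.Join(", ", insertColumns.Select(c => $"s.{Quote(c)}"));
        string when = string.IsNullOrWhiteSpace(updateWhenClause) ? "" : $" AND {updateWhenClause}";

        var lines = new List<string>
        {
            $"MERGE INTO {QuoteTable(destinationTable)} AS t",
            $"USING {Quote(tempTableName)} AS s",
            $"ON {on}",
        };
        // Skip the update branch entirely when there is nothing to update.
        if (updateColumns.Count > 0)
            lines.Add($"WHEN MATCHED{when} THEN UPDATE SET {set}");
        lines.Add($"WHEN NOT MATCHED BY TARGET THEN INSERT ({columns}) VALUES ({values});");
        return string.Join("\n", lines);
    }
}
```

Because the aliases are fixed here, any updateWhen clause passed in would have to be generated with the same `s` and `t` aliases.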

Verification: All tests pass


Phase 6: Schema Validation

Task 6.1: Create SchemaValidator

File: NEW/src/JdeScoping.DataSync/Services/SchemaValidator.cs

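namespace JdeScoping.DataSync.Services;

internal sealed class SchemaValidator
{
    // Loads column metadata for the destination table.
    public Task<TableSchema> LoadSchemaAsync(
        SqlConnection connection,
        string tableName)
        => throw new NotImplementedException(); // stub so the signature compiles

    // Checks each row against the schema and returns every violation found.
    public IReadOnlyList<ValidationError> Validate<T>(
        IEnumerable<T> rows,
        TableSchema schema,
        IReadOnlyList<string> columnNames)
        => throw new NotImplementedException(); // stub so the signature compiles
}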

internal record TableSchema(
    IReadOnlyDictionary<string, ColumnSchema> Columns);

internal record ColumnSchema(
    string Name,
    Type ClrType,
    bool IsNullable,
    int? MaxLength,
    byte? Precision,
    byte? Scale);

Verification: Compiles

Task 6.2: Write SchemaValidator Unit Tests

File: NEW/tests/JdeScoping.DataSync.Tests/Services/SchemaValidatorTests.cs

Test scenarios:

  • Detects string exceeding max length
  • Detects null in non-nullable column
  • Detects decimal precision overflow
  • Returns multiple errors for row
  • Includes row index in errors
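
The checks in these scenarios could look roughly like the sketch below. It re-declares `ColumnSchema` and `ValidationError` so that it stands alone; `ValidationSketch` and the `getValue` accessor delegate are hypothetical, and treating a negative `MaxLength` as "unbounded" is an assumption about how max-length columns are reported.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

public record ColumnSchema(string Name, Type ClrType, bool IsNullable, int? MaxLength, byte? Precision, byte? Scale);
public record ValidationError(int RowIndex, string ColumnName, object? Value, string Message);

public static class ValidationSketch
{
    // Returns a message describing the problem, or null when the value fits the column.
    public static string? Check(object? value, ColumnSchema column)
    {
        if (value is null || value is DBNull)
            return column.IsNullable ? null : "NULL is not allowed";

        if (value is string s && column.MaxLength is int max && max >= 0 && s.Length > max)
            return $"Length {s.Length} exceeds maximum {max}";

        if (value is decimal d && column.Precision is byte p && column.Scale is byte scale)
        {
            // decimal(p, s) leaves p - s digits for the integer part.
            int integerDigits = p - scale;
            // 10^28 is the largest power of ten a System.Decimal can hold.
            if (integerDigits <= 28 && Math.Abs(decimal.Truncate(d)) >= Pow10(integerDigits))
                return $"Value does not fit decimal({p},{scale})";
        }
        return null;
    }

    // Collects every violation, so one row can contribute several errors.
    public static IReadOnlyList<ValidationError> Validate<T>(
        IEnumerable<T> rows,
        IReadOnlyList<ColumnSchema> columns,
        Func<T, string, object?> getValue)
    {
        var errors = new List<ValidationError>();
        int rowIndex = 0;
        foreach (var row in rows)
        {
            foreach (var column in columns)
            {
                object? value = getValue(row, column.Name);
                string? problem = Check(value, column);
                if (problem is not null)
                    errors.Add(new ValidationError(rowIndex, column.Name, value, problem));
            }
            rowIndex++;
        }
        return errors;
    }

    private static decimal Pow10(int exponent)
    {
        decimal result = 1m;
        for (int i = 0; i < exponent; i++) result *= 10m;
        return result;
    }
}
```

Only integer-digit overflow is checked for decimals here; how excess fractional digits should be treated is left open.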

Verification: All tests pass


Phase 7: BulkMergeHelper Implementation

Task 7.1: Implement BulkMergeHelper

File: NEW/src/JdeScoping.DataSync/Services/BulkMergeHelper.cs

namespace JdeScoping.DataSync.Services;

public sealed class BulkMergeHelper : IBulkMergeHelper
{
    private readonly IDataReaderFactory _readerFactory;
    private readonly IDbConnectionFactory _connectionFactory;
    private readonly ILogger<BulkMergeHelper> _logger;
    private readonly DataSyncOptions _options;

    public async Task<MergeResult> MergeAsync<T>(...) { ... }
}

Implementation flow:

  1. Parse expressions
  2. Open connection
  3. Create temp table
  4. Loop: batch → validate? → bulk copy → merge → truncate
  5. Finally: drop temp table
  6. Return result
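
Steps 4 and 5 of this flow can be sketched without any database code by hiding the per-batch work behind delegates. The names `BatchingSketch`, `BatchAsync`, `RunAsync`, and `Range` are hypothetical, and treating `batchSize <= 0` as "one batch containing everything" is an assumption about the meaning of the default value.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.CompilerServices;
using System.Threading;
using System.Threading.Tasks;

public static class BatchingSketch
{
    // Splits an async stream into lists of at most batchSize items.
    // Assumption: batchSize <= 0 means a single batch with all rows.
    public static async IAsyncEnumerable<IReadOnlyList<T>> BatchAsync<T>(
        IAsyncEnumerable<T> source,
        int batchSize,
        [EnumeratorCancellation] CancellationToken cancellationToken = default)
    {
        var batch = new List<T>();
        await foreach (var item in source.WithCancellation(cancellationToken))
        {
            batch.Add(item);
            if (batchSize > 0 && batch.Count == batchSize)
            {
                yield return batch;
                batch = new List<T>();
            }
        }
        if (batch.Count > 0) yield return batch;
    }

    // Runs the per-batch work and always drops the temp table afterwards.
    public static async Task<int> RunAsync<T>(
        IAsyncEnumerable<T> data,
        int batchSize,
        Func<IReadOnlyList<T>, Task> copyMergeTruncate,
        Func<Task> dropTempTable,
        CancellationToken cancellationToken = default)
    {
        int batchCount = 0;
        try
        {
            await foreach (var batch in BatchAsync(data, batchSize, cancellationToken))
            {
                await copyMergeTruncate(batch);
                batchCount++;
            }
        }
        finally
        {
            // Executes on success and on failure alike.
            await dropTempTable();
        }
        return batchCount;
    }

    // Test stream producing 0..count-1.
    public static async IAsyncEnumerable<int> Range(int count)
    {
        for (int i = 0; i < count; i++) { await Task.Yield(); yield return i; }
    }
}
```

An empty source produces zero batches, so the per-batch delegate is never called but the cleanup still runs.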

Verification: Compiles

Task 7.2: Write BulkMergeHelper Unit Tests

File: NEW/tests/JdeScoping.DataSync.Tests/Services/BulkMergeHelperTests.cs

Test scenarios (use mocks):

  • Calls factory to create reader
  • Builds correct SQL from expressions
  • Handles empty data source
  • Respects batch size
  • Wraps SqlException with context
  • Invokes validation when flag set
  • Drops temp table on failure

Verification: All tests pass


Phase 8: DI Registration

Task 8.1: Update ServiceCollectionExtensions

File: NEW/src/JdeScoping.DataSync/DependencyInjection/ServiceCollectionExtensions.cs

Add to existing method:

// Add bulk copy converters (generated)
services.AddBulkCopyConverters();

// Add bulk merge helper
services.AddScoped<IBulkMergeHelper, BulkMergeHelper>();

Verification: Compiles, DI container builds correctly


Phase 9: Integration Tests

Task 9.1: Create BulkMergeHelper Integration Tests

File: NEW/tests/JdeScoping.DataSync.IntegrationTests/Services/BulkMergeHelperIntegrationTests.cs

Test scenarios:

  • Inserts new records to empty table
  • Updates existing records
  • Conditional update respects updateWhen
  • Composite primary key matching works
  • Handles 10k+ records
  • Temp table cleaned up on success
  • Temp table cleaned up on failure

Verification: All integration tests pass

Task 9.2: Create Batching Integration Tests

File: NEW/tests/JdeScoping.DataSync.IntegrationTests/Services/BatchingIntegrationTests.cs

Test scenarios:

  • Processes 50k records in batches of 10k
  • Each batch commits independently
  • Partial failure leaves earlier batches committed
  • Result contains correct batch count

Verification: All tests pass

Task 9.3: Create Validation Integration Tests

File: NEW/tests/JdeScoping.DataSync.IntegrationTests/Services/ValidationIntegrationTests.cs

Test scenarios:

  • Validation catches string truncation
  • Validation catches null violation
  • Validation error includes row details
  • Without validation, gets SqlException

Verification: All tests pass


Phase 10: Migration - Update Existing Code

Task 10.1: Update TableSyncOperation

File: NEW/src/JdeScoping.DataSync/Services/TableSyncOperation.cs

Changes:

  • Inject IBulkMergeHelper instead of IStagingTableManager
  • Replace staging table calls with single MergeAsync call
  • Update mass update path to use MergeAsync with batchSize: 0
  • Keep post-processor invocation

Verification: Compiles

Task 10.2: Update DataSourceConfig for Expressions

File: NEW/src/JdeScoping.DataSync/Configuration/DataSourceConfig.cs

Consider: How to store/configure match/update expressions per table.

Options:

  1. Each fetcher returns its merge config
  2. Convention: use primary key for match, all columns for update
  3. Attribute on model classes (rejected - Core stays clean)

Recommended: Convention with optional override in fetcher.

Verification: Compiles

Task 10.3: Update TableSyncOperation Tests

File: NEW/tests/JdeScoping.DataSync.Tests/Services/TableSyncOperationTests.cs

Changes:

  • Mock IBulkMergeHelper instead of IStagingTableManager
  • Update assertions for new call patterns

Verification: All tests pass


Phase 11: Cleanup

Task 11.1: Remove Old Bulk Merge Code

Files to delete:

  • NEW/src/JdeScoping.DataSync/Contracts/IStagingTableManager.cs
  • NEW/src/JdeScoping.DataSync/Services/StagingTableManager.cs
  • NEW/tests/JdeScoping.DataSync.Tests/Services/StagingTableManagerTests.cs
  • NEW/tests/JdeScoping.DataSync.IntegrationTests/Services/StagingTableManagerTests.cs

Files to update:

  • NEW/src/JdeScoping.DataSync/DependencyInjection/ServiceCollectionExtensions.cs - Remove IStagingTableManager registration
  • NEW/src/JdeScoping.Data/Repositories/LotFinderRepository.DataSync.cs - Remove unused bulk methods if any

Verification:

  • Solution compiles
  • All tests pass
  • No references to deleted types

Task 11.2: Final Verification

Commands:

dotnet build
dotnet test

Verification:

  • Zero build warnings related to new code
  • All tests pass
  • Integration tests pass against SQL Server

Phase 12: Codex Review

Task 12.1: Consult Codex for Gaps

Use Codex MCP to review:

  • Generated code efficiency
  • Missing edge cases
  • Performance considerations for large datasets
  • Error handling completeness
  • Thread safety concerns

Verification: Address any issues found


Summary Checklist

| Phase | Tasks | Status |
|-------|-------|--------|
| 1. Generator Project | 1.1-1.2 | Pending |
| 2. Contracts | 2.1-2.4 | Pending |
| 3. Type Registry & Generator | 3.1-3.3 | Pending |
| 4. Expression Parsing | 4.1-4.2 | Pending |
| 5. SQL Builder | 5.1-5.2 | Pending |
| 6. Schema Validation | 6.1-6.2 | Pending |
| 7. BulkMergeHelper | 7.1-7.2 | Pending |
| 8. DI Registration | 8.1 | Pending |
| 9. Integration Tests | 9.1-9.3 | Pending |
| 10. Migration | 10.1-10.3 | Pending |
| 11. Cleanup | 11.1-11.2 | Pending |
| 12. Codex Review | 12.1 | Pending |

Estimated total tasks: 27