# Bulk Merge Helper Implementation Plan
**Date:** 2026-01-01
**Design:** [2026-01-01-bulk-merge-helper-design.md](./2026-01-01-bulk-merge-helper-design.md)
**Status:** Draft - Pending Review
## Prerequisites
- .NET 10 SDK installed
- SQL Server running (Docker container for tests)
- Existing DataSync project compiles
## Phase 1: Source Generator Project Setup
### Task 1.1: Create Source Generator Project
**Location:** `NEW/src/JdeScoping.DataSync.SourceGenerators/`
**Files to create:**
1. `JdeScoping.DataSync.SourceGenerators.csproj`
```xml
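<Project Sdk="Microsoft.NET.Sdk">
  <PropertyGroup>
    <TargetFramework>netstandard2.0</TargetFramework>
    <LangVersion>latest</LangVersion>
    <EnforceExtendedAnalyzerRules>true</EnforceExtendedAnalyzerRules> <!-- assumed property name -->
    <IsRoslynComponent>true</IsRoslynComponent> <!-- assumed property name -->
  </PropertyGroup>
</Project>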
```
2. `DataReaderGenerator.cs` - Main incremental source generator
**Verification:** Project compiles with `dotnet build`
### Task 1.2: Add Generator Reference to DataSync
**File:** `NEW/src/JdeScoping.DataSync/JdeScoping.DataSync.csproj`
**Add:**
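```xml
<ItemGroup>
  <!-- Standard analyzer-style reference: the generator runs at build time and is not a runtime dependency. -->
  <ProjectReference Include="..\JdeScoping.DataSync.SourceGenerators\JdeScoping.DataSync.SourceGenerators.csproj"
                    OutputItemType="Analyzer"
                    ReferenceOutputAssembly="false" />
</ItemGroup>
```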
**Verification:** DataSync project compiles
---
## Phase 2: Core Interfaces and Contracts
### Task 2.1: Create IDataReaderFactory Interface
**File:** `NEW/src/JdeScoping.DataSync/Contracts/IDataReaderFactory.cs`
```csharp
using System.Collections.Generic;
using System.Data;

namespace JdeScoping.DataSync.Contracts;

public interface IDataReaderFactory
{
    IDataReader CreateReader<T>(IAsyncEnumerable<T> source);
    IReadOnlyList<string> GetColumnNames<T>();
}
```
**Verification:** Compiles
### Task 2.2: Create IBulkMergeHelper Interface
**File:** `NEW/src/JdeScoping.DataSync/Contracts/IBulkMergeHelper.cs`
```csharp
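using System.Linq.Expressions;
using JdeScoping.DataSync.Models;

namespace JdeScoping.DataSync.Contracts;

public interface IBulkMergeHelper
{
    Task<MergeResult> MergeAsync<T>(
        IAsyncEnumerable<T> data,
        string destinationTable,
        Expression<Func<T, object>> matchOn,
        Expression<Func<T, object>>? updateColumns = null,
        // Assumed shape: (source, target) => bool, one parameter per alias used by BuildUpdateWhenSql.
        Expression<Func<T, T, bool>>? updateWhen = null,
        Expression<Func<T, object>>? insertColumns = null,
        string? tempTableName = null,
        int batchSize = 0,
        bool validateBeforeCopy = false,
        CancellationToken cancellationToken = default);
}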
```
**Verification:** Compiles
### Task 2.3: Create MergeResult Record
**File:** `NEW/src/JdeScoping.DataSync/Models/MergeResult.cs`
```csharp
namespace JdeScoping.DataSync.Models;

public record MergeResult(
    int TotalRowsProcessed,
    int RowsInserted,
    int RowsUpdated,
    int BatchCount,
    TimeSpan Elapsed);
```
**Verification:** Compiles
### Task 2.4: Create Exception Classes
**File:** `NEW/src/JdeScoping.DataSync/Exceptions/BulkMergeException.cs`
```csharp
namespace JdeScoping.DataSync.Exceptions;

public class BulkMergeException : Exception
{
    public string TableName { get; init; } = string.Empty;
    public int BatchNumber { get; init; }
    public int RowsInBatch { get; init; }
    public string? SqlStatement { get; init; }
    // constructors...
}

public class BulkMergeValidationException : BulkMergeException
{
    public IReadOnlyList<ValidationError> Errors { get; init; } = [];
}

public record ValidationError(
    int RowIndex,
    string ColumnName,
    object? Value,
    string Message);
```
**Verification:** Compiles
---
## Phase 3: Type Registry and Generator Implementation
### Task 3.1: Create BulkCopyTypeRegistry
**File:** `NEW/src/JdeScoping.DataSync/BulkCopyTypeRegistry.cs`
```csharp
namespace JdeScoping.DataSync;

public static class BulkCopyTypeRegistry
{
    public static readonly Type[] Types =
    [
        typeof(WorkOrder),
        typeof(Lot),
        typeof(LotUsage),
        typeof(Item),
        typeof(WorkCenter),
        typeof(ProfitCenter),
        typeof(JdeUser),
        typeof(Branch),
        typeof(MisData),
    ];
}
```
**Verification:** Compiles with correct type references
### Task 3.2: Implement DataReaderGenerator
**File:** `NEW/src/JdeScoping.DataSync.SourceGenerators/DataReaderGenerator.cs`
Generator must:
1. Find `BulkCopyTypeRegistry.Types` array in compilation
2. For each type, generate a `{TypeName}DataReader : IDataReader` class
3. Generate `DataReaderFactory` implementation
4. Generate `AddBulkCopyConverters()` extension method
**Key implementation details:**
- Use incremental generator (`IIncrementalGenerator`) for performance
- Handle nullable properties correctly (use `DBNull.Value` for null)
- Skip properties with private setters
- Order columns alphabetically for consistency
**Verification:**
- Generator compiles
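- DataSync builds and the generated code appears in `obj/Generated/` (generated files are only written to disk when `EmitCompilerGeneratedFiles` is `true`; this path also assumes `CompilerGeneratedFilesOutputPath` points there)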
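The property rules listed under the key implementation details can be illustrated with a small reflection-based stand-in. This is not the generator itself, which would inspect Roslyn symbols at compile time; `SampleLot` and `ColumnRules` are hypothetical names used only for this sketch, and "skip private setters" is read here as "keep only properties with a public getter and a public setter".

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;

// Hypothetical model used only for this illustration.
public sealed class SampleLot
{
    public string LotNumber { get; set; } = "";
    public int? Quantity { get; set; }
    public string Internal { get; private set; } = "x"; // private setter: skipped
}

public static class ColumnRules
{
    // Public instance properties with a public getter and a public setter,
    // ordered alphabetically (ordinal) so column ordinals stay stable between builds.
    public static IReadOnlyList<PropertyInfo> SelectColumns(Type type) =>
        type.GetProperties(BindingFlags.Public | BindingFlags.Instance)
            .Where(p => p.GetMethod is { IsPublic: true } && p.SetMethod is { IsPublic: true })
            .OrderBy(p => p.Name, StringComparer.Ordinal)
            .ToArray();

    // A data reader reports missing values as DBNull.Value, never as null.
    public static object GetValue(PropertyInfo column, object row) =>
        column.GetValue(row) ?? DBNull.Value;
}
```

The generated readers would hard-code the same ordering and null handling per type instead of using reflection at runtime.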
### Task 3.3: Write Generator Unit Tests
**File:** `NEW/tests/JdeScoping.DataSync.Tests/SourceGenerators/DataReaderGeneratorTests.cs`
Test scenarios:
- Generates reader for simple type
- Generates factory with all registered types
- Handles nullable properties
- Skips properties with private setters
- Generates correct column ordinal mapping
**Verification:** All generator tests pass
---
## Phase 4: Expression Parsing
### Task 4.1: Create ExpressionParser
**File:** `NEW/src/JdeScoping.DataSync/Services/ExpressionParser.cs`
```csharp
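using System.Linq.Expressions;

namespace JdeScoping.DataSync.Services;

internal static class ExpressionParser
{
    public static IReadOnlyList<string> GetColumnNames<T>(
        Expression<Func<T, object>> expression);

    // Assumed shape: (source, target) => bool, one parameter per alias.
    public static string BuildUpdateWhenSql<T>(
        Expression<Func<T, T, bool>>? expression,
        string sourceAlias,
        string targetAlias);
}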
```
**Handles:**
- Single property: `x => x.Id` → `["Id"]`
- Anonymous type: `x => new { x.A, x.B }` → `["A", "B"]`
- Comparison expressions for `updateWhen`
**Verification:** Compiles
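A minimal sketch of the column-name extraction described above, assuming selectors have the shape `Expression<Func<T, object>>`. `ColumnNameSketch` and `SampleRow` are hypothetical names; the real `ExpressionParser` may differ.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

// Hypothetical model used only for this illustration.
public sealed class SampleRow
{
    public int Id { get; set; }
    public string Name { get; set; } = "";
}

public static class ColumnNameSketch
{
    public static IReadOnlyList<string> GetColumnNames<T>(Expression<Func<T, object>> expression)
    {
        Expression body = expression.Body;
        ParameterExpression row = expression.Parameters[0];

        // Value-type members are boxed: x => x.Id becomes Convert(x.Id, object).
        if (body is UnaryExpression { NodeType: ExpressionType.Convert } convert)
            body = convert.Operand;

        return body switch
        {
            // x => x.Id
            MemberExpression member => new[] { DirectMemberName(member, row) },
            // x => new { x.A, x.B }
            NewExpression created => created.Arguments
                .Select(argument => DirectMemberName(argument, row))
                .ToArray(),
            _ => throw new NotSupportedException($"Unsupported column selector: {expression}")
        };
    }

    // Accepts only members read directly from the lambda parameter, so nested
    // access such as x => x.Parent.Id fails with a descriptive message.
    private static string DirectMemberName(Expression node, ParameterExpression row)
    {
        if (node is MemberExpression { Expression: var owner } member && owner == row)
            return member.Member.Name;

        throw new NotSupportedException(
            $"Only direct property access on the row is supported, got: {node}");
    }
}
```

Called as `ColumnNameSketch.GetColumnNames<SampleRow>(x => new { x.Id, x.Name })`, this yields the two property names in declaration order of the anonymous type.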
### Task 4.2: Write ExpressionParser Unit Tests
**File:** `NEW/tests/JdeScoping.DataSync.Tests/Services/ExpressionParserTests.cs`
Test scenarios:
- Single property extraction
- Multiple properties via anonymous type
- Nested property access throws helpful error
- Comparison expression SQL generation
- Complex boolean expressions (AND, OR)
**Verification:** All tests pass
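The comparison and boolean scenarios can be pictured with the following sketch. It assumes `updateWhen` is a two-parameter lambda `(source, target) => bool`, which is an inference from the two alias parameters of `BuildUpdateWhenSql` rather than something this plan states. `UpdateWhenSketch` and `SampleRecord` are hypothetical, and constants are deliberately rejected because they would need SQL parameters.

```csharp
using System;
using System.Linq.Expressions;

// Hypothetical model used only for this illustration.
public sealed class SampleRecord
{
    public DateTime LastUpdated { get; set; }
    public int Quantity { get; set; }
}

public static class UpdateWhenSketch
{
    public static string BuildUpdateWhenSql<T>(
        Expression<Func<T, T, bool>> expression,
        string sourceAlias,
        string targetAlias)
    {
        ParameterExpression source = expression.Parameters[0];
        ParameterExpression target = expression.Parameters[1];
        return Visit(expression.Body);

        // Recursively translate the supported subset of expression nodes.
        string Visit(Expression node) => node switch
        {
            BinaryExpression { NodeType: ExpressionType.AndAlso } b =>
                $"({Visit(b.Left)} AND {Visit(b.Right)})",
            BinaryExpression { NodeType: ExpressionType.OrElse } b =>
                $"({Visit(b.Left)} OR {Visit(b.Right)})",
            BinaryExpression b =>
                $"{Visit(b.Left)} {Operator(b.NodeType)} {Visit(b.Right)}",
            MemberExpression m when m.Expression == source =>
                $"{sourceAlias}.[{m.Member.Name}]",
            MemberExpression m when m.Expression == target =>
                $"{targetAlias}.[{m.Member.Name}]",
            _ => throw new NotSupportedException($"Unsupported expression: {node}")
        };
    }

    private static string Operator(ExpressionType type) => type switch
    {
        ExpressionType.Equal => "=",
        ExpressionType.NotEqual => "<>",
        ExpressionType.GreaterThan => ">",
        ExpressionType.GreaterThanOrEqual => ">=",
        ExpressionType.LessThan => "<",
        ExpressionType.LessThanOrEqual => "<=",
        _ => throw new NotSupportedException($"Unsupported operator: {type}")
    };
}
```

Note that `=` and `<>` follow SQL NULL semantics, so a comparison involving a NULL column never evaluates to true; tests for nullable columns should cover that case explicitly.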
---
## Phase 5: SQL Builder
### Task 5.1: Create MergeSqlBuilder
**File:** `NEW/src/JdeScoping.DataSync/Services/MergeSqlBuilder.cs`
```csharp
namespace JdeScoping.DataSync.Services;

internal static class MergeSqlBuilder
{
    public static string BuildCreateTempTable(
        string tempTableName,
        string sourceTableName);

    public static string BuildMerge(
        string destinationTable,
        string tempTableName,
        IReadOnlyList<string> matchColumns,
        IReadOnlyList<string> updateColumns,
        string? updateWhenClause,
        IReadOnlyList<string> insertColumns);
}
```
**Verification:** Compiles
### Task 5.2: Write MergeSqlBuilder Unit Tests
**File:** `NEW/tests/JdeScoping.DataSync.Tests/Services/MergeSqlBuilderTests.cs`
Test scenarios:
- Creates temp table with SELECT INTO
- MERGE with single match column
- MERGE with composite key
- MERGE with updateWhen condition
- MERGE with subset of update columns
- MERGE with all columns for insert
- Proper SQL escaping of column names
**Verification:** All tests pass
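As a rough picture of the SQL these tests would assert on, here is a minimal sketch of both builder methods. `MergeSqlSketch` and the `src`/`tgt` aliases are assumptions, and table names are assumed to be single-part (no schema prefix) so they can be bracket-quoted as a whole.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class MergeSqlSketch
{
    // Bracket-quote an identifier, doubling any closing bracket inside it.
    public static string Quote(string identifier) =>
        "[" + identifier.Replace("]", "]]") + "]";

    // SELECT TOP (0) ... INTO copies the column definitions without copying rows.
    public static string BuildCreateTempTable(string tempTableName, string sourceTableName) =>
        $"SELECT TOP (0) * INTO {Quote(tempTableName)} FROM {Quote(sourceTableName)};";

    public static string BuildMerge(
        string destinationTable,
        string tempTableName,
        IReadOnlyList<string> matchColumns,
        IReadOnlyList<string> updateColumns,
        string? updateWhenClause,
        IReadOnlyList<string> insertColumns)
    {
        var lines = new List<string>
        {
            $"MERGE INTO {Quote(destinationTable)} AS tgt",
            $"USING {Quote(tempTableName)} AS src",
            "ON " + string.Join(" AND ",
                matchColumns.Select(c => $"tgt.{Quote(c)} = src.{Quote(c)}")),
        };

        // Omit the WHEN MATCHED branch entirely when there is nothing to update.
        if (updateColumns.Count > 0)
        {
            string condition = updateWhenClause is null ? "" : $" AND ({updateWhenClause})";
            string assignments = string.Join(", ",
                updateColumns.Select(c => $"tgt.{Quote(c)} = src.{Quote(c)}"));
            lines.Add($"WHEN MATCHED{condition} THEN UPDATE SET {assignments}");
        }

        string columns = string.Join(", ", insertColumns.Select(Quote));
        string values = string.Join(", ", insertColumns.Select(c => "src." + Quote(c)));
        lines.Add($"WHEN NOT MATCHED BY TARGET THEN INSERT ({columns}) VALUES ({values});");

        return string.Join("\n", lines);
    }
}
```

Keeping the builder a pure string function means the unit tests listed above need no database connection.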
---
## Phase 6: Schema Validation
### Task 6.1: Create SchemaValidator
**File:** `NEW/src/JdeScoping.DataSync/Services/SchemaValidator.cs`
```csharp
namespace JdeScoping.DataSync.Services;

internal sealed class SchemaValidator
{
    public async Task<TableSchema> LoadSchemaAsync(
        SqlConnection connection,
        string tableName);

    public IReadOnlyList<ValidationError> Validate<T>(
        IEnumerable<T> rows,
        TableSchema schema,
        IReadOnlyList<string> columnNames);
}

internal record TableSchema(
    IReadOnlyDictionary<string, ColumnSchema> Columns);

internal record ColumnSchema(
    string Name,
    Type ClrType,
    bool IsNullable,
    int? MaxLength,
    byte? Precision,
    byte? Scale);
```
**Verification:** Compiles
### Task 6.2: Write SchemaValidator Unit Tests
**File:** `NEW/tests/JdeScoping.DataSync.Tests/Services/SchemaValidatorTests.cs`
Test scenarios:
- Detects string exceeding max length
- Detects null in non-nullable column
- Detects decimal precision overflow
- Returns multiple errors for row
- Includes row index in errors
**Verification:** All tests pass
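The null and max-length scenarios might be checked along these lines. This is a sketch only: it reads values by reflection for brevity, omits the decimal precision check, and redeclares `ColumnSchema` and `ValidationError` so the block stands alone. `ValidationSketch` and `SampleItem` are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

public record ColumnSchema(
    string Name, Type ClrType, bool IsNullable, int? MaxLength, byte? Precision, byte? Scale);

public record ValidationError(int RowIndex, string ColumnName, object? Value, string Message);

// Hypothetical model used only for this illustration.
public sealed class SampleItem
{
    public string? Code { get; set; }
    public string? Description { get; set; }
}

public static class ValidationSketch
{
    public static IReadOnlyList<ValidationError> Validate<T>(
        IEnumerable<T> rows,
        IReadOnlyDictionary<string, ColumnSchema> columns)
    {
        var errors = new List<ValidationError>();
        int rowIndex = 0; // zero-based position in the input sequence

        foreach (T row in rows)
        {
            foreach (var (name, column) in columns)
            {
                // A property missing from T is treated like a null value here.
                object? value = typeof(T).GetProperty(name)?.GetValue(row);

                if (value is null)
                {
                    if (!column.IsNullable)
                        errors.Add(new ValidationError(rowIndex, name, null, "NULL is not allowed"));
                }
                else if (value is string text && column.MaxLength is int max && text.Length > max)
                {
                    errors.Add(new ValidationError(
                        rowIndex, name, value, $"Length {text.Length} exceeds maximum {max}"));
                }
            }

            rowIndex++;
        }

        return errors;
    }
}
```

Every violation is collected rather than stopping at the first one, which matches the "returns multiple errors for row" scenario.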
---
## Phase 7: BulkMergeHelper Implementation
### Task 7.1: Implement BulkMergeHelper
**File:** `NEW/src/JdeScoping.DataSync/Services/BulkMergeHelper.cs`
```csharp
namespace JdeScoping.DataSync.Services;

public sealed class BulkMergeHelper : IBulkMergeHelper
{
    private readonly IDataReaderFactory _readerFactory;
    private readonly IDbConnectionFactory _connectionFactory;
    private readonly ILogger<BulkMergeHelper> _logger;
    private readonly DataSyncOptions _options;

    public async Task<MergeResult> MergeAsync<T>(...) { ... }
}
```
**Implementation flow:**
1. Parse expressions
2. Open connection
3. Create temp table
4. Loop: batch → validate? → bulk copy → merge → truncate
5. Finally: drop temp table
6. Return result
**Verification:** Compiles
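The batching part of step 4 could look like the sketch below, which splits an `IAsyncEnumerable<T>` into lists of at most `batchSize` items. Treating `batchSize <= 0` as "one batch containing everything" is an assumption about what the default of `0` means, and `BatchingSketch` is a hypothetical name.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.CompilerServices;
using System.Threading;
using System.Threading.Tasks;

public static class BatchingSketch
{
    public static async IAsyncEnumerable<IReadOnlyList<T>> BatchAsync<T>(
        IAsyncEnumerable<T> source,
        int batchSize,
        [EnumeratorCancellation] CancellationToken cancellationToken = default)
    {
        var batch = new List<T>();

        await foreach (T item in source.WithCancellation(cancellationToken))
        {
            batch.Add(item);

            // With batchSize <= 0 this never fires, so all rows end up in one batch.
            if (batchSize > 0 && batch.Count == batchSize)
            {
                yield return batch;
                batch = new List<T>(); // fresh list so the caller can keep the previous one
            }
        }

        // Emit the final partial batch; an empty source yields no batches at all.
        if (batch.Count > 0)
            yield return batch;
    }
}
```

The merge loop would then run validate, bulk copy, merge, and truncate once per yielded batch, with the temp table drop in a `finally` block around the whole loop.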
### Task 7.2: Write BulkMergeHelper Unit Tests
**File:** `NEW/tests/JdeScoping.DataSync.Tests/Services/BulkMergeHelperTests.cs`
Test scenarios (use mocks):
- Calls factory to create reader
- Builds correct SQL from expressions
- Handles empty data source
- Respects batch size
- Wraps SqlException with context
- Invokes validation when flag set
- Drops temp table on failure
**Verification:** All tests pass
---
## Phase 8: DI Registration
### Task 8.1: Update ServiceCollectionExtensions
**File:** `NEW/src/JdeScoping.DataSync/DependencyInjection/ServiceCollectionExtensions.cs`
**Add to existing method:**
```csharp
// Add bulk copy converters (generated)
services.AddBulkCopyConverters();
// Add bulk merge helper
services.AddScoped<IBulkMergeHelper, BulkMergeHelper>();
```
**Verification:** Compiles, DI container builds correctly
---
## Phase 9: Integration Tests
### Task 9.1: Create BulkMergeHelper Integration Tests
**File:** `NEW/tests/JdeScoping.DataSync.IntegrationTests/Services/BulkMergeHelperIntegrationTests.cs`
Test scenarios:
- Inserts new records to empty table
- Updates existing records
- Conditional update respects updateWhen
- Composite primary key matching works
- Handles 10k+ records
- Temp table cleaned up on success
- Temp table cleaned up on failure
**Verification:** All integration tests pass
### Task 9.2: Create Batching Integration Tests
**File:** `NEW/tests/JdeScoping.DataSync.IntegrationTests/Services/BatchingIntegrationTests.cs`
Test scenarios:
- Processes 50k records in batches of 10k
- Each batch commits independently
- Partial failure leaves earlier batches committed
- Result contains correct batch count
**Verification:** All tests pass
### Task 9.3: Create Validation Integration Tests
**File:** `NEW/tests/JdeScoping.DataSync.IntegrationTests/Services/ValidationIntegrationTests.cs`
Test scenarios:
- Validation catches string truncation
- Validation catches null violation
- Validation error includes row details
- Without validation, gets SqlException
**Verification:** All tests pass
---
## Phase 10: Migration - Update Existing Code
### Task 10.1: Update TableSyncOperation
**File:** `NEW/src/JdeScoping.DataSync/Services/TableSyncOperation.cs`
**Changes:**
- Inject `IBulkMergeHelper` instead of `IStagingTableManager`
- Replace staging table calls with single `MergeAsync` call
- Update mass update path to use `MergeAsync` with `batchSize: 0`
- Keep post-processor invocation
**Verification:** Compiles
### Task 10.2: Update DataSourceConfig for Expressions
**File:** `NEW/src/JdeScoping.DataSync/Configuration/DataSourceConfig.cs`
**Consider:** How to store/configure match/update expressions per table.
Options:
1. Each fetcher returns its merge config
2. Convention: use primary key for match, all columns for update
3. Attribute on model classes (rejected - Core stays clean)
**Recommended:** Convention with optional override in fetcher.
**Verification:** Compiles
### Task 10.3: Update TableSyncOperation Tests
**File:** `NEW/tests/JdeScoping.DataSync.Tests/Services/TableSyncOperationTests.cs`
**Changes:**
- Mock `IBulkMergeHelper` instead of `IStagingTableManager`
- Update assertions for new call patterns
**Verification:** All tests pass
---
## Phase 11: Cleanup
### Task 11.1: Remove Old Bulk Merge Code
**Files to delete:**
- `NEW/src/JdeScoping.DataSync/Contracts/IStagingTableManager.cs`
- `NEW/src/JdeScoping.DataSync/Services/StagingTableManager.cs`
- `NEW/tests/JdeScoping.DataSync.Tests/Services/StagingTableManagerTests.cs`
- `NEW/tests/JdeScoping.DataSync.IntegrationTests/Services/StagingTableManagerTests.cs`
**Files to update:**
- `NEW/src/JdeScoping.DataSync/DependencyInjection/ServiceCollectionExtensions.cs` - Remove `IStagingTableManager` registration
- `NEW/src/JdeScoping.Data/Repositories/LotFinderRepository.DataSync.cs` - Remove unused bulk methods if any
**Verification:**
- Solution compiles
- All tests pass
- No references to deleted types
### Task 11.2: Final Verification
**Commands:**
```bash
dotnet build
dotnet test
```
**Verification:**
- Zero build warnings related to new code
- All tests pass
- Integration tests pass against SQL Server
---
## Phase 12: Codex Review
### Task 12.1: Consult Codex for Gaps
Use Codex MCP to review:
- Generated code efficiency
- Missing edge cases
- Performance considerations for large datasets
- Error handling completeness
- Thread safety concerns
**Verification:** Address any issues found
---
## Summary Checklist
| Phase | Tasks | Status |
|-------|-------|--------|
| 1. Generator Project | 1.1-1.2 | Pending |
| 2. Contracts | 2.1-2.4 | Pending |
| 3. Type Registry & Generator | 3.1-3.3 | Pending |
| 4. Expression Parsing | 4.1-4.2 | Pending |
| 5. SQL Builder | 5.1-5.2 | Pending |
| 6. Schema Validation | 6.1-6.2 | Pending |
| 7. BulkMergeHelper | 7.1-7.2 | Pending |
| 8. DI Registration | 8.1 | Pending |
| 9. Integration Tests | 9.1-9.3 | Pending |
| 10. Migration | 10.1-10.3 | Pending |
| 11. Cleanup | 11.1-11.2 | Pending |
| 12. Codex Review | 12.1 | Pending |
**Total tasks:** 27