# Bulk Merge Helper Design

**Date:** 2026-01-01
**Status:** Draft - Pending Review

## Overview

Replace the current `StagingTableManager` approach with a streamlined `IBulkMergeHelper` backed by source-generated `IDataReader` converters for efficient `SqlBulkCopy` operations.

## Goals

1. Simplify bulk merge operations to a single method call with expression-based configuration
2. Generate efficient `IAsyncEnumerable<T>` to `IDataReader` converters at compile time
3. Provide better error diagnostics with optional pre-validation
4. Remove manual staging table management code

## Architecture

```
┌─────────────────────────────────────────────────────────────────┐
│                       JdeScoping.DataSync                       │
│  ┌─────────────────────┐    ┌─────────────────────────────────┐ │
│  │ BulkCopyTypeRegistry│    │ IBulkMergeHelper                │ │
│  │ - Lists types to    │    │ - MergeAsync<T>(...)            │ │
│  │   generate for      │    │ - Uses IDataReaderFactory       │ │
│  └─────────────────────┘    │ - Builds MERGE SQL from exprs   │ │
│           │                 └─────────────────────────────────┘ │
│           │ (analyzed by)                    │                  │
│           ▼                                  │ (uses)           │
│  ┌─────────────────────┐                     ▼                  │
│  │ Source Generator    │    ┌─────────────────────────────────┐ │
│  │ - Generates         │───▶│ Generated Code:                 │ │
│  │   IDataReader       │    │ - WorkOrderDataReader           │ │
│  │   wrappers          │    │ - LotDataReader                 │ │
│  │ - Generates DI      │    │ - DataReaderFactory impl        │ │
│  │   registration      │    │ - AddBulkCopyConverters()       │ │
│  └─────────────────────┘    └─────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────────┘
```

## Design Decisions

| Decision | Choice | Rationale |
|----------|--------|-----------|
| Type identification | Explicit list in `BulkCopyTypeRegistry.cs` | Keeps Core project free of bulk copy concerns |
| Registry location | DataSync project | Consolidates bulk copy knowledge in one place |
| API style | Single method with expression parameters | Simple, all config visible in one place |
| Conditional updates | Explicit `updateWhen` expression | Flexible, not tied to property naming conventions |
| Error handling | Hybrid - context wrapping + optional validation | Balances performance with debuggability |
| Transactions | None - each batch independent | Matches current behavior, idempotent syncs |
| Generator project | Single `JdeScoping.DataSync.SourceGenerators` | Simple, can extract later if needed |
| DI pattern | Generic `IDataReaderFactory` | Single injection point, easy to mock |
| DELETE support | None | YAGNI, matches current behavior |
| Migration config | Convention + override | Less boilerplate, explicit when needed |

## Component Details

### 1. BulkCopyTypeRegistry

Location: `JdeScoping.DataSync/BulkCopyTypeRegistry.cs`

```csharp
namespace JdeScoping.DataSync;

public static class BulkCopyTypeRegistry
{
    public static readonly Type[] Types =
    [
        typeof(WorkOrder),
        typeof(Lot),
        typeof(LotUsage),
        typeof(Item),
        typeof(WorkCenter),
        typeof(ProfitCenter),
        typeof(JdeUser),
        typeof(Branch),
        typeof(MisData),
    ];
}
```

### 2. Source Generator

Project: `JdeScoping.DataSync.SourceGenerators`

**Generated DataReader wrapper (per type):**
```csharp
public sealed class WorkOrderDataReader : IDataReader
{
    private readonly IAsyncEnumerator<WorkOrder> _enumerator;
    private WorkOrder? _current;

    private static readonly string[] _columnNames =
        ["WorkOrderNumber", "BranchCode", "LotNumber", ...];

    public object GetValue(int i) => i switch
    {
        0 => _current!.WorkOrderNumber,
        1 => _current!.BranchCode,
        // ... generated for each property
        _ => throw new IndexOutOfRangeException($"No column at ordinal {i}")
    };

    public bool Read()
    {
        // SqlBulkCopy consumes IDataReader synchronously, so block on the async enumerator.
        bool hasRow = _enumerator.MoveNextAsync().AsTask().GetAwaiter().GetResult();

        // Keep _current in sync so GetValue reads the row that was just fetched.
        _current = hasRow ? _enumerator.Current : null;
        return hasRow;
    }

    // IDataReader implementation...
}
```

**Generated factory:**
```csharp
public sealed class DataReaderFactory : IDataReaderFactory
{
    public IDataReader CreateReader<T>(IAsyncEnumerable<T> source)
    {
        return source switch
        {
            IAsyncEnumerable<WorkOrder> wo => new WorkOrderDataReader(wo),
            IAsyncEnumerable<Lot> lot => new LotDataReader(lot),
            _ => throw new NotSupportedException($"No converter for {typeof(T).Name}")
        };
    }
}
```

**Generated DI extension:**
```csharp
public static class BulkCopyServiceCollectionExtensions
{
    public static IServiceCollection AddBulkCopyConverters(this IServiceCollection services)
    {
        services.AddSingleton<IDataReaderFactory, DataReaderFactory>();
        return services;
    }
}
```

### 3. IBulkMergeHelper Interface

Location: `JdeScoping.DataSync/Contracts/IBulkMergeHelper.cs`

```csharp
namespace JdeScoping.DataSync.Contracts;

public interface IBulkMergeHelper
{
    Task<MergeResult> MergeAsync<T>(
        IAsyncEnumerable<T> data,
        string destinationTable,
        Expression<Func<T, object>> matchOn,
        Expression<Func<T, object>>? updateColumns = null,
        Expression<Func<T, T, bool>>? updateWhen = null,
        Expression<Func<T, object>>? insertColumns = null,
        string? tempTableName = null,
        int batchSize = 0,
        bool validateBeforeCopy = false,
        CancellationToken cancellationToken = default);
}

public record MergeResult(
    int TotalRowsProcessed,
    int RowsInserted,
    int RowsUpdated,
    int BatchCount,
    TimeSpan Elapsed);
```

**Parameters:**

| Parameter | Purpose | Default |
|-----------|---------|---------|
| `data` | Source records to merge | required |
| `destinationTable` | Target SQL table name | required |
| `matchOn` | PK expression for MERGE ON clause | required |
| `updateColumns` | Columns to SET on match | null = all non-PK |
| `updateWhen` | Condition for UPDATE | null = always update |
| `insertColumns` | Columns for INSERT | null = all columns |
| `tempTableName` | Staging table name | `#TEMP_{table}` |
| `batchSize` | Rows per batch | 0 = all at once |
| `validateBeforeCopy` | Pre-validate data against schema | false |
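
The three column-selecting parameters (`matchOn`, `updateColumns`, `insertColumns`) each have to be reduced to a list of column names. The following is a minimal sketch of that reduction, assuming column names equal property names. `ColumnExpressionParser`, `GetColumnNames`, and `SampleOrder` are hypothetical names used for illustration and are not part of the design above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

// Hypothetical row type, used only to exercise the sketch.
public sealed record SampleOrder(int WorkOrderNumber, string BranchCode);

// Hypothetical helper: turns a selector such as x => x.A or
// x => new { x.A, x.B } into the list of selected property names.
public static class ColumnExpressionParser
{
    public static IReadOnlyList<string> GetColumnNames<T>(
        Expression<Func<T, object>> selector)
    {
        Expression body = selector.Body;

        // Selecting a value-type property (int, DateTime, ...) boxes it to object,
        // which shows up in the tree as a Convert node around the member access.
        if (body is UnaryExpression { NodeType: ExpressionType.Convert } boxed)
        {
            body = boxed.Operand;
        }

        return body switch
        {
            // x => x.WorkOrderNumber
            MemberExpression member => new[] { member.Member.Name },

            // x => new { x.WorkOrderNumber, x.BranchCode }
            NewExpression { Members: not null } anonymous =>
                anonymous.Members.Select(m => m.Name).ToArray(),

            _ => throw new NotSupportedException(
                $"Unsupported column selector: {selector}")
        };
    }
}
```

For an anonymous-type selector the names come from the anonymous type's members, so a renamed member such as `new { Key = x.WorkOrderNumber }` would yield `Key` rather than the property name; a real implementation would need to decide whether to allow that.
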

### 4. BulkMergeHelper Implementation

Location: `JdeScoping.DataSync/Services/BulkMergeHelper.cs`

**Processing flow:**
```
1. Parse expressions → extract column names
   matchOn: x => new { x.A, x.B } → ["A", "B"]

2. Get destination table schema (for temp table creation)
   SELECT TOP 0 * FROM WorkOrder → column types/lengths

3. Create temp table matching destination schema
   CREATE TABLE #TEMP_WorkOrder (... same columns ...)

4. If validateBeforeCopy: load schema constraints

5. Stream data in batches:
   foreach batch:
     a. Collect batchSize records from IAsyncEnumerable
     b. If validate: check each row against schema
     c. Create IDataReader via IDataReaderFactory
     d. SqlBulkCopy to temp table
     e. Execute MERGE statement
     f. TRUNCATE temp table
     g. Accumulate inserted/updated counts

6. DROP temp table (in finally block)

7. Return MergeResult with totals
```
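
Step 5a, collecting `batchSize` records from the `IAsyncEnumerable`, could be sketched as below. This is an illustration under stated assumptions, not the planned implementation: `AsyncBatching` and `InBatchesAsync` are hypothetical names, and a `batchSize` of 0 is treated as "all at once", matching the parameter table.

```csharp
using System.Collections.Generic;
using System.Runtime.CompilerServices;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper: groups an async stream into lists of at most batchSize items.
public static class AsyncBatching
{
    public static async IAsyncEnumerable<IReadOnlyList<T>> InBatchesAsync<T>(
        IAsyncEnumerable<T> source,
        int batchSize,
        [EnumeratorCancellation] CancellationToken cancellationToken = default)
    {
        var batch = new List<T>();

        await foreach (var item in source.WithCancellation(cancellationToken))
        {
            batch.Add(item);

            // batchSize <= 0 means "no limit": keep filling a single batch.
            if (batchSize > 0 && batch.Count >= batchSize)
            {
                yield return batch;
                batch = new List<T>();
            }
        }

        // Emit the final partial batch, or the single batch when batchSize is 0.
        if (batch.Count > 0)
        {
            yield return batch;
        }
    }
}
```

With `batchSize: 0` this sketch buffers the whole stream in one list, so memory use grows with the size of the dataset.
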

**Generated MERGE SQL:**
```sql
MERGE INTO [WorkOrder] AS target
USING [#TEMP_WorkOrder] AS source
ON target.[WorkOrderNumber] = source.[WorkOrderNumber]
   AND target.[BranchCode] = source.[BranchCode]

WHEN MATCHED AND source.[LastUpdateDt] > target.[LastUpdateDt] THEN
    UPDATE SET
        target.[StatusCode] = source.[StatusCode],
        target.[OrderQuantity] = source.[OrderQuantity],
        target.[LastUpdateDt] = source.[LastUpdateDt]

WHEN NOT MATCHED THEN
    INSERT ([WorkOrderNumber], [BranchCode], [StatusCode], ...)
    VALUES (source.[WorkOrderNumber], source.[BranchCode], ...);

SELECT @@ROWCOUNT;
```
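
A statement of this shape can be assembled from the column lists extracted in step 1. The sketch below is one possible way to do it, not the design's actual builder: `MergeSqlSketch.Build` and its parameters are hypothetical, and the update condition is assumed to arrive already rendered as SQL text.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical builder: produces a MERGE statement from column-name lists.
public static class MergeSqlSketch
{
    public static string Build(
        string destinationTable,
        string tempTable,
        IReadOnlyList<string> matchColumns,
        IReadOnlyList<string> updateColumns,
        IReadOnlyList<string> insertColumns,
        string? updateCondition = null)
    {
        // Bracket-quote identifiers, doubling any closing bracket inside the name.
        static string Q(string name) => "[" + name.Replace("]", "]]") + "]";

        string on = string.Join(" AND ",
            matchColumns.Select(c => $"target.{Q(c)} = source.{Q(c)}"));
        string set = string.Join(", ",
            updateColumns.Select(c => $"target.{Q(c)} = source.{Q(c)}"));
        string insertList = string.Join(", ", insertColumns.Select(c => Q(c)));
        string insertValues = string.Join(", ",
            insertColumns.Select(c => $"source.{Q(c)}"));

        // A null condition means "always update", as in the parameter table.
        string matched = updateCondition is null
            ? "WHEN MATCHED THEN"
            : $"WHEN MATCHED AND {updateCondition} THEN";

        return string.Join("\n",
            $"MERGE INTO {Q(destinationTable)} AS target",
            $"USING {Q(tempTable)} AS source",
            $"ON {on}",
            matched,
            $"    UPDATE SET {set}",
            "WHEN NOT MATCHED THEN",
            $"    INSERT ({insertList})",
            $"    VALUES ({insertValues});");
    }
}
```

One point to keep in mind: `@@ROWCOUNT` after a `MERGE` reports the total number of affected rows, inserts and updates combined. Filling `RowsInserted` and `RowsUpdated` separately would need something more, for example an `OUTPUT $action` clause, which this sketch leaves out.
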

### 5. Error Handling

**Exception hierarchy:**
```csharp
public class BulkMergeException : Exception
{
    public string TableName { get; init; }
    public int BatchNumber { get; init; }
    public int RowsInBatch { get; init; }
    public string? SqlStatement { get; init; }
}

public class BulkMergeValidationException : BulkMergeException
{
    public IReadOnlyList<ValidationError> Errors { get; init; }
}

public record ValidationError(
    int RowIndex,
    string ColumnName,
    object? Value,
    string Message);
```
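
The "context wrapping" half of the hybrid error-handling decision could look roughly like the sketch below. It is an assumption-laden illustration: the `(message, inner)` constructor, the `BatchErrorContext` class, and its `RunAsync` method are hypothetical and do not appear in the design above, which only lists the exception's properties.

```csharp
using System;
using System.Threading.Tasks;

public class BulkMergeException : Exception
{
    // Assumption: a (message, inner) constructor; the design only lists properties.
    public BulkMergeException(string message, Exception? inner = null)
        : base(message, inner) { }

    public string TableName { get; init; } = string.Empty;
    public int BatchNumber { get; init; }
    public int RowsInBatch { get; init; }
    public string? SqlStatement { get; init; }
}

// Hypothetical helper: runs one batch and attaches batch context to any failure.
public static class BatchErrorContext
{
    public static async Task RunAsync(
        string tableName,
        int batchNumber,
        int rowsInBatch,
        string? sqlStatement,
        Func<Task> batchWork)
    {
        try
        {
            await batchWork();
        }
        // Leave cancellation and already-wrapped failures untouched.
        catch (Exception ex) when (ex is not BulkMergeException and not OperationCanceledException)
        {
            throw new BulkMergeException(
                $"Bulk merge into '{tableName}' failed in batch {batchNumber} ({rowsInBatch} rows).",
                ex)
            {
                TableName = tableName,
                BatchNumber = batchNumber,
                RowsInBatch = rowsInBatch,
                SqlStatement = sqlStatement
            };
        }
    }
}
```

The original exception is kept as `InnerException`, so the underlying SQL error remains available alongside the batch context.
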

**Validation checks (when `validateBeforeCopy: true`):**

| Check | Example Error |
|-------|---------------|
| String length | `"Column 'StatusCode' value 'TOOLONG' exceeds max length 5 at row 42"` |
| Null in non-nullable | `"Column 'WorkOrderNumber' cannot be null at row 17"` |
| Type mismatch | `"Column 'OrderQuantity' expected int, got string at row 89"` |
| Decimal precision | `"Column 'Amount' value 12345.6789 exceeds precision(10,2) at row 5"` |
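
Two of these checks, string length and null in a non-nullable column, can be sketched as follows. The `ColumnRule` record and `RowValidator` class are hypothetical stand-ins for whatever schema metadata step 4 of the processing flow loads; representing a row as a dictionary is also an assumption made to keep the example self-contained.

```csharp
using System;
using System.Collections.Generic;

public record ValidationError(int RowIndex, string ColumnName, object? Value, string Message);

// Hypothetical description of one destination column.
public record ColumnRule(string Name, bool IsNullable, int? MaxLength);

// Hypothetical validator covering the null and string-length checks only.
public static class RowValidator
{
    public static List<ValidationError> Validate(
        int rowIndex,
        IReadOnlyDictionary<string, object?> row,
        IEnumerable<ColumnRule> rules)
    {
        var errors = new List<ValidationError>();

        foreach (var rule in rules)
        {
            // A missing key is treated the same as a null value.
            row.TryGetValue(rule.Name, out object? value);

            if (value is null || value is DBNull)
            {
                if (!rule.IsNullable)
                {
                    errors.Add(new ValidationError(rowIndex, rule.Name, null,
                        $"Column '{rule.Name}' cannot be null at row {rowIndex}"));
                }
                continue;
            }

            if (value is string text && rule.MaxLength is int max && text.Length > max)
            {
                errors.Add(new ValidationError(rowIndex, rule.Name, value,
                    $"Column '{rule.Name}' value '{text}' exceeds max length {max} at row {rowIndex}"));
            }
        }

        return errors;
    }
}
```

Collecting every error for a row, instead of stopping at the first, fits the `Errors` list on `BulkMergeValidationException`.
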

### 6. DI Registration

```csharp
public static class ServiceCollectionExtensions
{
    public static IServiceCollection AddDataSync(
        this IServiceCollection services,
        IConfiguration configuration)
    {
        // Existing registrations...

        // Add bulk copy converters (generated)
        services.AddBulkCopyConverters();

        // Add bulk merge helper
        services.AddScoped<IBulkMergeHelper, BulkMergeHelper>();

        return services;
    }
}
```

## Testing Strategy

### Unit Tests (JdeScoping.DataSync.Tests)

| Test Class | Coverage |
|------------|----------|
| `BulkMergeHelperTests` | Expression parsing, SQL generation, batch splitting |
| `ExpressionParserTests` | Column name extraction from expressions |
| `MergeSqlBuilderTests` | Generated MERGE SQL correctness |
| `DataReaderFactoryTests` | Factory type resolution |
| `ValidationTests` | Schema validation logic |
| `BulkMergeExceptionTests` | Exception properties and formatting |

### Integration Tests (JdeScoping.DataSync.IntegrationTests)

| Test Class | Coverage |
|------------|----------|
| `BulkMergeHelperIntegrationTests` | End-to-end merge against SQL Server |
| `BatchingIntegrationTests` | Large datasets, multiple batches |
| `ValidationIntegrationTests` | Schema validation against real table |

**Key scenarios:**
- Insert new records (WHEN NOT MATCHED)
- Update existing records (WHEN MATCHED)
- Conditional update respects `updateWhen`
- Composite primary key matching
- Batch processing (10k+ records across multiple batches)
- Temp table cleanup on success and failure
- Validation catches truncation before SQL error

## Migration Plan

### Code to Replace

| File | Change |
|------|--------|
| `StagingTableManager.cs` | Delete - replaced by `BulkMergeHelper` |
| `TableSyncOperation.cs` | Simplify to use `IBulkMergeHelper` |
| `LotFinderRepository.DataSync.cs` | Remove bulk-related methods |

### Before/After

**Before:**
```csharp
await _stagingTableManager.CreateStagingTableAsync(...);
await _stagingTableManager.BulkCopyToStagingAsync(...);
await _stagingTableManager.MergeFromStagingAsync(...);
await _stagingTableManager.DropStagingTableAsync(...);
```

**After:**
```csharp
var result = await _bulkMergeHelper.MergeAsync(
    data: fetcher.FetchAsync(lastUpdate),
    destinationTable: config.TableName,
    matchOn: config.MatchExpression,
    updateColumns: config.UpdateExpression,
    updateWhen: config.UpdateCondition,
    batchSize: _options.BatchSize);
```

### Tests to Remove

- `StagingTableManagerTests.cs` (unit)
- `StagingTableManagerTests.cs` (integration)

## Example Usage

```csharp
// Simple case - match on single PK, update all columns
var result = await _bulkMergeHelper.MergeAsync(
    data: workOrders,
    destinationTable: "WorkOrder",
    matchOn: x => x.WorkOrderNumber);

// Full configuration
var result = await _bulkMergeHelper.MergeAsync(
    data: workOrders,
    destinationTable: "WorkOrder",
    matchOn: x => new { x.WorkOrderNumber, x.BranchCode },
    updateColumns: x => new { x.StatusCode, x.OrderQuantity, x.LastUpdateDt },
    updateWhen: (src, tgt) => src.LastUpdateDt > tgt.LastUpdateDt,
    batchSize: 10000,
    validateBeforeCopy: true);
```
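
The `updateWhen` lambda in the full configuration has to end up as the `WHEN MATCHED AND ...` predicate of the MERGE statement. A minimal sketch of such a translation follows, assuming only property-to-property comparisons joined by `&&` or `||` are supported. `UpdateConditionTranslator`, `ToSql`, and `SampleRow` are hypothetical names, and the first lambda parameter is assumed to map to the `source` alias and the second to `target`.

```csharp
using System;
using System.Linq.Expressions;

// Hypothetical row type, used only to exercise the sketch.
public sealed record SampleRow(DateTime LastUpdateDt, int Revision);

// Hypothetical translator: (src, tgt) => src.X > tgt.X  becomes
// "source.[X] > target.[X]".
public static class UpdateConditionTranslator
{
    public static string ToSql<T>(Expression<Func<T, T, bool>> condition)
    {
        ParameterExpression source = condition.Parameters[0];
        return Translate(condition.Body);

        string Alias(ParameterExpression p) => p == source ? "source" : "target";

        string Translate(Expression node) => node switch
        {
            BinaryExpression { NodeType: ExpressionType.AndAlso } b =>
                $"({Translate(b.Left)} AND {Translate(b.Right)})",
            BinaryExpression { NodeType: ExpressionType.OrElse } b =>
                $"({Translate(b.Left)} OR {Translate(b.Right)})",
            BinaryExpression b =>
                $"{Translate(b.Left)} {Operator(b.NodeType)} {Translate(b.Right)}",

            // Implicit conversions (for example int to long) wrap the member access.
            UnaryExpression { NodeType: ExpressionType.Convert } u => Translate(u.Operand),

            // src.Property or tgt.Property
            MemberExpression { Expression: ParameterExpression p } m =>
                $"{Alias(p)}.[{m.Member.Name}]",

            _ => throw new NotSupportedException($"Unsupported expression: {node}")
        };

        static string Operator(ExpressionType type) => type switch
        {
            ExpressionType.Equal => "=",
            ExpressionType.NotEqual => "<>",
            ExpressionType.GreaterThan => ">",
            ExpressionType.GreaterThanOrEqual => ">=",
            ExpressionType.LessThan => "<",
            ExpressionType.LessThanOrEqual => "<=",
            _ => throw new NotSupportedException($"Unsupported operator: {type}")
        };
    }
}
```

Constants, method calls, and null comparisons (which would need `IS NULL` in SQL) are deliberately rejected here; whether the real helper supports them is left open by the design.
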