# Tasks: Implement Data Sync Service

## Phase 1: Configuration and Interfaces

- [x] Create JdeScoping.DataSync project
  - Create `NEW/src/JdeScoping.DataSync/JdeScoping.DataSync.csproj`
  - Add references to JdeScoping.Domain and JdeScoping.Database
  - Validation: Project compiles and is referenced by JdeScoping.Host
- [x] Create DataSyncOptions configuration class
  - File: `Configuration/DataSyncOptions.cs`
  - Properties: CheckInterval, MaxDegreeOfParallelism, BatchSize, BulkCopyBatchSize, LookbackMultiplier, PurgeRetentionDays, DataSources
  - Validation: Options bind from appsettings.json DataSync section
- [x] Create DataSourceConfig configuration class
  - File: `Configuration/DataSourceConfig.cs`
  - Properties: TableName, SourceSystem, FetcherTypeName, PostProcessorTypeName, IsEnabled, MassConfig, DailyConfig, HourlyConfig
  - Include ScheduleConfig nested class
  - Validation: Configuration parses correctly from JSON
- [x] Create IDataFetcher interface
  - File: `Contracts/IDataFetcher.cs`
  - Method: `IAsyncEnumerable FetchAsync(DateTime? minimumDT, CancellationToken cancellationToken)`
  - Validation: Interface compiles with correct signature
- [x] Create IPostProcessor interface
  - File: `Contracts/IPostProcessor.cs`
  - Method: `Task ProcessAsync(string tableName, CancellationToken cancellationToken)`
  - Validation: Interface compiles with correct signature
- [x] Create supporting interfaces
  - Files: `Contracts/ISyncOrchestrator.cs`, `IScheduleChecker.cs`, `ITableSyncOperation.cs`, `IStagingTableManager.cs`
  - Validation: All interfaces compile

## Phase 2: Core Service Implementation

- [x] Create DataSyncService (BackgroundService)
  - File: `DataSyncService.cs`
  - Implement ExecuteAsync with main sync loop
  - Inject IServiceScopeFactory, IOptions, ILogger
  - Call CloseOpenUpdateEntriesAsync at startup
  - Call PurgeUpdateEntriesAsync periodically
  - Respect CancellationToken throughout
  - Validation: Service starts with host and stops gracefully
- [x] Create ScheduleChecker service
  - File: `Services/ScheduleChecker.cs`
  - Implement GetPendingTasksAsync to check Mass/Daily/Hourly schedules
  - Priority order: Mass > Daily > Hourly
  - Check both IsEnabled and specific schedule Enabled flags
  - Calculate MinimumDT with lookback multiplier (Daily timestamp for Hourly)
  - Validation: Unit tests for schedule checking logic pass
- [x] Create SyncOrchestrator service
  - File: `Services/SyncOrchestrator.cs`
  - Implement ExecutePendingSyncsAsync using Parallel.ForEachAsync
  - Create IServiceScope per parallel operation
  - Pass CancellationToken to all operations
  - Validation: Multiple syncs run in parallel up to MaxDegreeOfParallelism
- [x] Create DataUpdateTask model
  - File: `Models/DataUpdateTask.cs`
  - Properties: TableName, UpdateType, SourceSystem, MinimumDT, OperationId, Config
  - Validation: Model used by ScheduleChecker and SyncOrchestrator

## Phase 3: Table Sync Operations

- [x] Create TableSyncOperation service
  - File: `Services/TableSyncOperation.cs`
  - Implement ExecuteAsync for single table sync
  - Create DataUpdate record at start (NumberRecords = -2)
  - Resolve IDataFetcher and execute FetchAsync
  - Batch records and delegate to StagingTableManager
  - Update DataUpdate record on success/failure
  - Use ILogger.BeginScope for structured logging
  - Validation: Single table sync executes end-to-end
- [x] Create StagingTableManager service
  - File: `Services/StagingTableManager.cs`
  - Create staging tables with unique suffix: `#Staging{Table}_{OperationId}`
  - Implement bulk copy with BulkCopyBatchSize
  - Implement deduplication to temp table with ROW_NUMBER
  - Generate and execute MERGE statement
  - Handle tables with/without LastUpdateDT column
  - Clean up staging and temp tables
  - Validation: MERGE correctly inserts new and updates existing records
- [x] Implement mass update with truncation
  - In StagingTableManager or separate method
  - Disable non-PK indexes before truncate
  - TRUNCATE destination table when PrepurgeData = true
  - Bulk copy directly to destination
  - Rebuild indexes if ReIndexData = true
  - Validation: Mass update truncates and reloads table
- [x] Implement batching for large datasets
  - In TableSyncOperation
  - Process records in batches of BatchSize (1,000,000)
  - Each batch creates fresh staging/temp tables with unique suffix
  - Accumulate total record count across batches
  - Validation: Large dataset processes in multiple batches

## Phase 4: Data Fetcher Implementations

- [x] Create mock/test fetcher base class
  - File: `Fetchers/MockDataFetcher.cs`
  - Returns sample data for testing without JDE/CMS connectivity
  - Validation: Tests can run without external databases
- [x] Create JDE fetcher implementations (stubs)
  - Files: `Fetchers/Jde/JdeWorkOrderFetcher.cs`, `JdeLotUsageFetcher.cs`, `JdeItemFetcher.cs`, etc.
  - Implement IDataFetcher interface
  - Initially delegate to mock or throw NotImplementedException
  - Validation: All fetchers register in DI and resolve correctly
- [x] Create CMS fetcher implementation (stub)
  - File: `Fetchers/Cms/CmsMisDataFetcher.cs`
  - Implement IDataFetcher
  - Initially delegate to mock or throw NotImplementedException
  - Validation: CMS fetcher registers in DI and resolves correctly

## Phase 5: Update Logging and Recovery

- [x] Implement update logging repository methods
  - In existing repository or new DataUpdateRepository
  - StartUpdateAsync: Insert DataUpdate with NumberRecords = -2
  - CompleteUpdateAsync: Update EndDT, WasSuccessful, NumberRecords
  - GetLastDataUpdatesAsync: Query LastDataUpdates view
  - Validation: DataUpdate records created and updated correctly
- [x] Implement CloseOpenUpdateEntries
  - Method in DataSyncService or repository
  - Update all records where NumberRecords = -2 to failed state
  - Called at service startup
  - Validation: Interrupted syncs marked as failed on restart
- [x] Implement PurgeUpdateEntries
  - Method in DataSyncService or repository
  - Delete DataUpdate records older than PurgeRetentionDays
  - Called periodically (e.g., daily)
  - Validation: Old records purged correctly

## Phase 6: Health Checks and Telemetry

- [x] Create DataSyncHealthCheck
  - File: `HealthChecks/DataSyncHealthCheck.cs`
  - Implement IHealthCheck interface
  - Return Healthy when all tables synced within interval
  - Return Degraded when tables overdue but syncs progressing
  - Return Unhealthy when repeated failures
  - Include per-table status in response data
  - Validation: Health endpoint returns correct status
- [x] Create DataSyncMetrics
  - File: `Telemetry/DataSyncMetrics.cs`
  - Create Meter named "DataSync"
  - Counters: sync.operations.started, completed, failed
  - Histograms: sync.duration.seconds, sync.records.processed
  - Include table name and update type as tags
  - Validation: Metrics emitted during sync operations
- [x] Create DataSyncActivitySource
  - File: `Telemetry/DataSyncActivitySource.cs`
  - Create ActivitySource named "DataSync"
  - Start activity for each sync operation with table/type tags
  - Complete activity with record count on success
  - Set error status on failure
  - Validation: Activities visible in distributed tracing

## Phase 7: DI Registration

- [x] Create AddDataSync extension method
  - File: `DependencyInjection/ServiceCollectionExtensions.cs`
  - Configure DataSyncOptions from configuration
  - Register DataSyncService as hosted service
  - Register all scoped services (orchestrator, checker, operation, staging)
  - Register health check
  - Register metrics singleton
  - Register all fetcher implementations
  - Add options validation
  - Validation: All services resolve correctly at startup
- [x] Update JdeScoping.Host Program.cs
  - Add `builder.Services.AddDataSync(builder.Configuration)`
  - Validation: Host starts with data sync service running
- [x] Add DataSync configuration to appsettings.json
  - Add DataSync section with options and data sources
  - Include all table configurations from spec
  - Validation: Configuration loads correctly

## Phase 8: Testing

- [x] Write unit tests for ScheduleChecker
  - Test Mass/Daily/Hourly priority
  - Test MinimumDT calculation with lookback
  - Test disabled table handling
  - Test first sync (no prior updates) scenario
  - Validation: All schedule logic tests pass
- [x] Write unit tests for StagingTableManager
  - Test staging table creation with unique suffix
  - Test MERGE with/without LastUpdateDT column
  - Test mass update truncation path
  - Validation: All staging/merge logic tests pass
- [x] Write integration tests for DataSyncService
  - Test service startup and shutdown
  - Test CloseOpenUpdateEntries at startup
  - Test parallel sync execution
  - Test cancellation handling
  - Validation: Integration tests pass with test database

## Phase 9: Validation

- [x] Run openspec validate
  - Command: `openspec validate implement-data-sync --strict`
  - Fix any validation errors
  - Validation: Validation passes
- [x] Verify all acceptance criteria met
  - DataSyncService starts and stops gracefully
  - Schedules checked and tasks queued correctly
  - Parallel execution works with proper isolation
  - DataUpdate logging complete
  - Health check reports correct status
  - Metrics emitted correctly
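
## Appendix: Example DataSync configuration (illustrative)

For the Phase 7 task that adds a DataSync section to appsettings.json, the fragment below is a rough sketch of what that section might look like. The property names come from the DataSyncOptions and DataSourceConfig tasks in Phase 1. Everything else is an assumption: that CheckInterval binds as a `TimeSpan` string, that `Enabled`, `PrepurgeData` and `ReIndexData` live on the nested ScheduleConfig, the `WorkOrder` table name, and every value except BatchSize (given as 1,000,000 in Phase 3). The actual table configurations are defined by the spec, not by this example.

```json
{
  "DataSync": {
    // Illustrative values only; see the spec for the real settings.
    "CheckInterval": "00:05:00",        // assumed TimeSpan format
    "MaxDegreeOfParallelism": 4,        // assumed
    "BatchSize": 1000000,               // from the Phase 3 batching task
    "BulkCopyBatchSize": 10000,         // assumed
    "LookbackMultiplier": 2,            // assumed
    "PurgeRetentionDays": 90,           // assumed
    "DataSources": [
      {
        "TableName": "WorkOrder",       // hypothetical table name
        "SourceSystem": "JDE",
        "FetcherTypeName": "JdeWorkOrderFetcher",
        "PostProcessorTypeName": null,
        "IsEnabled": true,
        // Placement of PrepurgeData/ReIndexData on the schedule config is assumed.
        "MassConfig": { "Enabled": true, "PrepurgeData": true, "ReIndexData": true },
        "DailyConfig": { "Enabled": true },
        "HourlyConfig": { "Enabled": false }
      }
    ]
  }
}
```

The comments are annotations for the reader; strip them if the tooling that consumes the file rejects comments in JSON.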