# Regex Transformer Implementation Plan

> **For Claude:** REQUIRED SUB-SKILL: Use superpowers:executing-plans to implement this plan task-by-task.

**Goal:** Add a RegexTransformer to the DataSync ETL pipeline that transforms string column values using regex, with a custom ConfigManager editor featuring live test/preview.

**Architecture:** The transformer extends `DataTransformerBase` and overrides `GetValue()` to apply regex transformations. It supports two modes: Find & Replace (uses `Regex.Replace`) and Match & Extract (extracts the first capture group). The ConfigManager gets a new `RegexTransformerViewModel` and `RegexEditorView` with integrated pattern testing.

**Tech Stack:** .NET 10, System.Text.RegularExpressions, Avalonia UI, xUnit + NSubstitute

**Design Doc:** `PLANS/2025-01-22-regex-transformer-design.md`

---

## Task 1: Add NonMatchBehavior Enum to PipelineModel

**Files:**
- Modify: `NEW/src/Utils/JdeScoping.ConfigManager/Models/PipelineModel.cs:181-215`

**Step 1: Add the enum and new properties**

Add the enum after line 215 (after the closing brace of the `TransformerModel` class), then add the new properties inside `TransformerModel`:

```csharp
// Add this using at top of file:
using System.Text.Json.Serialization;

// Add this enum after TransformerModel class (line ~216):
/// <summary>
/// Specifies behavior when a regex pattern does not match the input value.
/// </summary>
[JsonConverter(typeof(JsonStringEnumConverter))]
public enum NonMatchBehavior
{
    /// <summary>Keep the original value unchanged.</summary>
    KeepOriginal,

    /// <summary>Return null/DBNull.</summary>
    ReturnNull,

    /// <summary>Return an empty string.</summary>
    ReturnEmpty
}

// Add these properties inside TransformerModel class (after OutputColumn property, ~line 214):
    /// <summary>
    /// Gets or sets the column name for Regex transformer.
    /// </summary>
    public string? ColumnName { get; set; }

    /// <summary>
    /// Gets or sets the regex pattern for Regex transformer.
    /// </summary>
    public string? Pattern { get; set; }

    /// <summary>
    /// Gets or sets the replacement string for Regex transformer (null = Match &amp; Extract mode).
    /// </summary>
    public string? Replacement { get; set; }

    /// <summary>
    /// Gets or sets whether regex matching is case-insensitive.
    /// </summary>
    public bool IgnoreCase { get; set; }

    /// <summary>
    /// Gets or sets the behavior when regex pattern does not match.
    /// </summary>
    public NonMatchBehavior NonMatchBehavior { get; set; } = NonMatchBehavior.KeepOriginal;
```

**Step 2: Verify it compiles**

Run: `dotnet build NEW/src/Utils/JdeScoping.ConfigManager/JdeScoping.ConfigManager.csproj`
Expected: Build succeeded

**Step 3: Commit**

```bash
git add NEW/src/Utils/JdeScoping.ConfigManager/Models/PipelineModel.cs
git commit -m "$(cat <<'EOF'
feat(configmanager): add NonMatchBehavior enum and regex properties to TransformerModel

Add configuration model support for the new Regex transformer including:
- NonMatchBehavior enum with JSON string serialization
- ColumnName, Pattern, Replacement, IgnoreCase, NonMatchBehavior properties
EOF
)"
```

---

## Task 2: Create RegexTransformer with First Test (Find & Replace)

**Files:**
- Create: `NEW/src/JdeScoping.DataSync/Etl/Transformers/RegexTransformer.cs`
- Create: `NEW/tests/JdeScoping.DataSync.Tests/Etl/Transformers/RegexTransformerTests.cs`

**Step 1: Write the first failing test**

Create test file:

```csharp
using System.Data;
using JdeScoping.DataSync.Etl.Transformers;
using NSubstitute;

namespace JdeScoping.DataSync.Tests.Etl.Transformers;

public class RegexTransformerTests
{
    [Fact]
    public void FindReplace_RemovesPrefix()
    {
        // Arrange
        var source = CreateMockReader(
            columns: new[] { "BatchID", "Name" },
            values: new object[] { "IIS_12345", "Test" });

        var transformer = new RegexTransformer(
            columnName: "BatchID",
            pattern: "^IIS_",
            replacement: "");

        // Act
        var reader = transformer.Transform(source);
        source.Read().Returns(true);
        reader.Read();

        // Assert
        Assert.Equal("12345", reader.GetValue(0));
        Assert.Equal("Test", reader.GetValue(1)); // Other column unchanged
    }

    private static IDataReader CreateMockReader(string[] columns, object[] values)
    {
        var reader = Substitute.For<IDataReader>();
        reader.FieldCount.Returns(columns.Length);

        for (int i = 0; i < columns.Length; i++)
        {
            var index = i;
            reader.GetName(index).Returns(columns[index]);
            reader.GetOrdinal(columns[index]).Returns(index);
            reader.GetFieldType(index).Returns(values[index]?.GetType() ?? typeof(object));
            reader.GetValue(index).Returns(values[index]);
            reader.IsDBNull(index).Returns(values[index] == null || values[index] == DBNull.Value);
        }

        return reader;
    }
}
```

**Step 2: Run test to verify it fails**

Run: `dotnet test NEW/tests/JdeScoping.DataSync.Tests/JdeScoping.DataSync.Tests.csproj --filter "FullyQualifiedName~RegexTransformerTests.FindReplace_RemovesPrefix" -v n`
Expected: FAIL - "The type or namespace name 'RegexTransformer' could not be found"

**Step 3: Write minimal implementation**

Create `NEW/src/JdeScoping.DataSync/Etl/Transformers/RegexTransformer.cs`:

```csharp
using System.Data;
using System.Text.RegularExpressions;

namespace JdeScoping.DataSync.Etl.Transformers;

/// <summary>
/// Specifies behavior when a regex pattern does not match the input value.
/// </summary>
public enum NonMatchBehavior
{
    /// <summary>Keep the original value unchanged.</summary>
    KeepOriginal,

    /// <summary>Return null/DBNull.</summary>
    ReturnNull,

    /// <summary>Return an empty string.</summary>
    ReturnEmpty
}

/// <summary>
/// A data transformer that applies regex transformations to string values in a column.
/// Supports two modes: Find &amp; Replace (when replacement is provided) and Match &amp; Extract
/// (when replacement is null, extracts first capture group).
/// </summary>
public class RegexTransformer : DataTransformerBase
{
    private readonly string _columnName;
    private readonly string _pattern;
    private readonly string? _replacement;
    private readonly bool _ignoreCase;
    private readonly NonMatchBehavior _nonMatchBehavior;
    private Regex? _regex;
    private int _columnOrdinal = -1;

    /// <inheritdoc />
    public override string TransformerName => $"Regex:{_columnName}";

    /// <summary>
    /// Creates a new RegexTransformer.
    /// </summary>
    /// <param name="columnName">The column to transform.</param>
    /// <param name="pattern">The regex pattern.</param>
    /// <param name="replacement">Replacement string for Find &amp; Replace mode, or null for Match &amp; Extract mode.</param>
    /// <param name="ignoreCase">Whether to use case-insensitive matching.</param>
    /// <param name="nonMatchBehavior">Behavior when pattern does not match.</param>
    public RegexTransformer(
        string columnName,
        string pattern,
        string? replacement = null,
        bool ignoreCase = false,
        NonMatchBehavior nonMatchBehavior = NonMatchBehavior.KeepOriginal)
    {
        ArgumentException.ThrowIfNullOrWhiteSpace(columnName);
        ArgumentException.ThrowIfNullOrWhiteSpace(pattern);

        _columnName = columnName;
        _pattern = pattern;
        _replacement = replacement;
        _ignoreCase = ignoreCase;
        _nonMatchBehavior = nonMatchBehavior;
    }

    /// <inheritdoc />
    protected override void OnInitialize(IDataReader source)
    {
        _columnOrdinal = source.GetOrdinal(_columnName);

        var options = RegexOptions.Compiled;
        if (_ignoreCase)
            options |= RegexOptions.IgnoreCase;

        _regex = new Regex(_pattern, options);
    }

    /// <inheritdoc />
    public override object GetValue(int ordinal, IDataReader source)
    {
        var value = source.GetValue(ordinal);

        // Only transform the target column
        if (ordinal != _columnOrdinal)
            return value;

        // Pass through null/DBNull
        if (value == null || value == DBNull.Value)
            return value;

        var stringValue = value.ToString() ?? string.Empty;

        // Find & Replace mode (replacement is not null)
        if (_replacement != null)
        {
            return _regex!.Replace(stringValue, _replacement);
        }

        // Match & Extract mode (replacement is null)
        var match = _regex!.Match(stringValue);
        if (match.Success && match.Groups.Count > 1)
        {
            return match.Groups[1].Value;
        }

        // No match - apply NonMatchBehavior
        return _nonMatchBehavior switch
        {
            NonMatchBehavior.ReturnNull => DBNull.Value,
            NonMatchBehavior.ReturnEmpty => string.Empty,
            _ => value // KeepOriginal
        };
    }

    /// <inheritdoc />
    public override Type GetFieldType(int ordinal, IDataReader source)
    {
        // Target column always returns string
        if (ordinal == _columnOrdinal)
            return typeof(string);

        return source.GetFieldType(ordinal);
    }
}
```

**Step 4: Run test to verify it passes**

Run: `dotnet test NEW/tests/JdeScoping.DataSync.Tests/JdeScoping.DataSync.Tests.csproj --filter "FullyQualifiedName~RegexTransformerTests.FindReplace_RemovesPrefix" -v n`
Expected: PASS

**Step 5: Commit**

```bash
git add NEW/src/JdeScoping.DataSync/Etl/Transformers/RegexTransformer.cs NEW/tests/JdeScoping.DataSync.Tests/Etl/Transformers/RegexTransformerTests.cs
git commit -m "$(cat <<'EOF'
feat(datasync): add RegexTransformer with Find & Replace mode

Initial implementation supporting:
- Find & Replace mode with regex pattern and replacement string
- Case-insensitive option
- NonMatchBehavior enum for handling non-matches
EOF
)"
```

---

## Task 3: Add Match & Extract Mode Tests

**Files:**
- Modify: `NEW/tests/JdeScoping.DataSync.Tests/Etl/Transformers/RegexTransformerTests.cs`

**Step 1: Write the failing test for Match & Extract**

Add to test class:

```csharp
    [Fact]
    public void MatchExtract_ExtractsFirstCaptureGroup()
    {
        // Arrange
        var source = CreateMockReader(
            columns: new[] { "Code" },
            values: new object[] { "ID_12345" });

        var transformer = new RegexTransformer(
            columnName: "Code",
            pattern: @"ID_(\d+)",
            replacement: null); // null = Match & Extract mode

        // Act
        var reader = transformer.Transform(source);
        source.Read().Returns(true);
        reader.Read();

        // Assert
        Assert.Equal("12345", reader.GetValue(0));
    }

    [Fact]
    public void MatchExtract_NoMatch_KeepOriginal()
    {
        // Arrange
        var source = CreateMockReader(
            columns: new[] { "Code" },
            values: new object[] { "UNKNOWN" });

        var transformer = new RegexTransformer(
            columnName: "Code",
            pattern: @"ID_(\d+)",
            replacement: null,
            nonMatchBehavior: NonMatchBehavior.KeepOriginal);

        // Act
        var reader = transformer.Transform(source);
        source.Read().Returns(true);
        reader.Read();

        // Assert
        Assert.Equal("UNKNOWN", reader.GetValue(0));
    }

    [Fact]
    public void MatchExtract_NoMatch_ReturnNull()
    {
        // Arrange
        var source = CreateMockReader(
            columns: new[] { "Code" },
            values: new object[] { "UNKNOWN" });

        var transformer = new RegexTransformer(
            columnName: "Code",
            pattern: @"ID_(\d+)",
            replacement: null,
            nonMatchBehavior: NonMatchBehavior.ReturnNull);

        // Act
        var reader = transformer.Transform(source);
        source.Read().Returns(true);
        reader.Read();

        // Assert
        Assert.Equal(DBNull.Value, reader.GetValue(0));
    }

    [Fact]
    public void MatchExtract_NoMatch_ReturnEmpty()
    {
        // Arrange
        var source = CreateMockReader(
            columns: new[] { "Code" },
            values: new object[] { "UNKNOWN" });

        var transformer = new RegexTransformer(
            columnName: "Code",
            pattern: @"ID_(\d+)",
            replacement: null,
            nonMatchBehavior: NonMatchBehavior.ReturnEmpty);

        // Act
        var reader = transformer.Transform(source);
        source.Read().Returns(true);
        reader.Read();

        // Assert
        Assert.Equal(string.Empty, reader.GetValue(0));
    }
```

**Step 2: Run tests to verify they pass**

Run: `dotnet test NEW/tests/JdeScoping.DataSync.Tests/JdeScoping.DataSync.Tests.csproj --filter "FullyQualifiedName~RegexTransformerTests" -v n`
Expected: All 5 tests PASS

**Step 3: Commit**

```bash
git add NEW/tests/JdeScoping.DataSync.Tests/Etl/Transformers/RegexTransformerTests.cs
git commit -m "$(cat <<'EOF'
test(datasync): add Match & Extract mode tests for RegexTransformer

Tests cover:
- Extracting first capture group
- NonMatchBehavior: KeepOriginal, ReturnNull, ReturnEmpty
EOF
)"
```

---

## Task 4: Add Edge Case Tests

**Files:**
- Modify: `NEW/tests/JdeScoping.DataSync.Tests/Etl/Transformers/RegexTransformerTests.cs`

**Step 1: Add edge case tests**

Add to test class:

```csharp
    [Fact]
    public void FindReplace_UseCaptureGroups()
    {
        // Arrange - swap two numbers
        var source = CreateMockReader(
            columns: new[] { "Value" },
            values: new object[] { "123-456" });

        var transformer = new RegexTransformer(
            columnName: "Value",
            pattern: @"(\d+)-(\d+)",
            replacement: "$2-$1");

        // Act
        var reader = transformer.Transform(source);
        source.Read().Returns(true);
        reader.Read();

        // Assert
        Assert.Equal("456-123", reader.GetValue(0));
    }

    [Fact]
    public void IgnoreCase_MatchesDifferentCase()
    {
        // Arrange
        var source = CreateMockReader(
            columns: new[] { "BatchID" },
            values: new object[] { "IIS_12345" });

        var transformer = new RegexTransformer(
            columnName: "BatchID",
            pattern: "^iis_", // lowercase pattern
            replacement: "",
            ignoreCase: true);

        // Act
        var reader = transformer.Transform(source);
        source.Read().Returns(true);
        reader.Read();

        // Assert
        Assert.Equal("12345", reader.GetValue(0));
    }

    [Fact]
    public void NullValue_PassesThrough()
    {
        // Arrange
        var source = CreateMockReader(
            columns: new[] { "BatchID" },
            values: new object[] { DBNull.Value });

        var transformer = new RegexTransformer(
            columnName: "BatchID",
            pattern: "^IIS_",
            replacement: "");

        // Act
        var reader = transformer.Transform(source);
        source.Read().Returns(true);
        reader.Read();

        // Assert
        Assert.Equal(DBNull.Value, reader.GetValue(0));
    }

    [Fact]
    public void NonTargetColumn_Unchanged()
    {
        // Arrange
        var source = CreateMockReader(
            columns: new[] { "BatchID", "OtherColumn" },
            values: new object[] { "IIS_12345", "IIS_Should_Not_Change" });

        var transformer = new RegexTransformer(
            columnName: "BatchID",
            pattern: "^IIS_",
            replacement: "");

        // Act
        var reader = transformer.Transform(source);
        source.Read().Returns(true);
        reader.Read();

        // Assert
        Assert.Equal("12345", reader.GetValue(0));
        Assert.Equal("IIS_Should_Not_Change", reader.GetValue(1));
    }

    [Fact]
    public void InvalidRegex_ThrowsOnTransform()
    {
        // Arrange
        var source = CreateMockReader(
            columns: new[] { "Value" },
            values: new object[] { "test" });

        var transformer = new RegexTransformer(
            columnName: "Value",
            pattern: "[invalid(regex",
            replacement: "");

        // Act & Assert
        // RegexParseException derives from ArgumentException, so ThrowsAny accepts it.
        var ex = Assert.ThrowsAny<ArgumentException>(() => transformer.Transform(source));
        Assert.Contains("Invalid", ex.Message, StringComparison.OrdinalIgnoreCase);
    }

    [Fact]
    public void ColumnNotFound_ThrowsOnTransform()
    {
        // Arrange
        var source = CreateMockReader(
            columns: new[] { "Value" },
            values: new object[] { "test" });
        source.GetOrdinal("NonExistent").Returns(_ => throw new IndexOutOfRangeException("Column not found"));

        var transformer = new RegexTransformer(
            columnName: "NonExistent",
            pattern: "test",
            replacement: "");

        // Act & Assert
        Assert.Throws<IndexOutOfRangeException>(() => transformer.Transform(source));
    }
```

**Step 2: Run all transformer tests**

Run: `dotnet test NEW/tests/JdeScoping.DataSync.Tests/JdeScoping.DataSync.Tests.csproj --filter "FullyQualifiedName~RegexTransformerTests" -v n`
Expected: All 11 tests PASS

**Step 3: Commit**

```bash
git add NEW/tests/JdeScoping.DataSync.Tests/Etl/Transformers/RegexTransformerTests.cs
git commit -m "$(cat <<'EOF'
test(datasync): add edge case tests for RegexTransformer

Tests cover:
- Capture group substitution in replacement
- Case-insensitive matching
- Null/DBNull passthrough
- Non-target columns unchanged
- Invalid regex pattern handling
- Column not found handling
EOF
)"
```

---

## Task 5: Create RegexTransformerViewModel

**Files:**
- Modify: `NEW/src/Utils/JdeScoping.ConfigManager/ViewModels/PipelineSteps/TransformerStepViewModels.cs`

**Step 1: Add the ViewModel class**

Add after `JdeDateTransformerViewModel` class (before `TransformerFactory`):

```csharp
/// <summary>View model for Regex transformer.</summary>
public class RegexTransformerViewModel : TransformerStepViewModelBase
{
    private string _columnName = string.Empty;
    private string _pattern = string.Empty;
    private string? _replacement = string.Empty;
    private bool _isFindReplaceMode = true;
    private bool _ignoreCase;
    private NonMatchBehavior _nonMatchBehavior = NonMatchBehavior.KeepOriginal;

    // Test feature fields
    private string _testInput = string.Empty;
    private string _testResultValue = string.Empty;
    private string _testResultLabel = string.Empty;
    private string _testResultIcon = string.Empty;
    private string _testResultBackground = string.Empty;
    private bool _hasTestResult;
    private bool _hasTestError;
    private string _testErrorMessage = string.Empty;

    public RegexTransformerViewModel(TransformerModel model, Action onChanged)
        : base(onChanged)
    {
        _columnName = model.ColumnName ?? string.Empty;
        _pattern = model.Pattern ?? string.Empty;
        _replacement = model.Replacement;
        _isFindReplaceMode = model.Replacement != null;
        _ignoreCase = model.IgnoreCase;
        _nonMatchBehavior = model.NonMatchBehavior;

        TestPatternCommand = new RelayCommand(ExecuteTestPattern);
    }

    public RegexTransformerViewModel(Action onChanged)
        : base(onChanged)
    {
        TestPatternCommand = new RelayCommand(ExecuteTestPattern);
    }

    public override string TransformerType => "Regex";
    public override string DisplayName => "Regex Transform";
    public override string Icon => "󰑑"; // mdi-regex

    public override string Summary => !string.IsNullOrEmpty(_columnName)
        ? $"{_columnName}: {(_isFindReplaceMode ? "Replace" : "Extract")}"
        : "Configure...";

    /// <summary>Gets or sets the column name to transform.</summary>
    public string ColumnName
    {
        get => _columnName;
        set
        {
            if (SetProperty(ref _columnName, value ?? string.Empty))
            {
                OnPropertyChanged(nameof(Summary));
                NotifyChanged();
            }
        }
    }

    /// <summary>Gets or sets the regex pattern.</summary>
    public string Pattern
    {
        get => _pattern;
        set
        {
            if (SetProperty(ref _pattern, value ?? string.Empty))
            {
                ClearTestResult();
                NotifyChanged();
            }
        }
    }

    /// <summary>Gets or sets the replacement string (Find &amp; Replace mode).</summary>
    public string? Replacement
    {
        get => _replacement;
        set
        {
            if (SetProperty(ref _replacement, value))
            {
                ClearTestResult();
                NotifyChanged();
            }
        }
    }

    /// <summary>Gets or sets whether Find &amp; Replace mode is active.</summary>
    public bool IsFindReplaceMode
    {
        get => _isFindReplaceMode;
        set
        {
            if (SetProperty(ref _isFindReplaceMode, value))
            {
                OnPropertyChanged(nameof(IsMatchExtractMode));
                OnPropertyChanged(nameof(PatternHelpText));
                OnPropertyChanged(nameof(Summary));
                ClearTestResult();
                NotifyChanged();
            }
        }
    }

    /// <summary>Gets or sets whether Match &amp; Extract mode is active.</summary>
    public bool IsMatchExtractMode
    {
        get => !_isFindReplaceMode;
        set => IsFindReplaceMode = !value;
    }

    /// <summary>Gets the help text for the pattern field based on current mode.</summary>
    public string PatternHelpText => _isFindReplaceMode
        ? "Pattern to search for in the column value"
        : "Pattern with capture group - first group (parentheses) will be extracted";

    /// <summary>Gets or sets whether matching is case-insensitive.</summary>
    public bool IgnoreCase
    {
        get => _ignoreCase;
        set
        {
            if (SetProperty(ref _ignoreCase, value))
            {
                ClearTestResult();
                NotifyChanged();
            }
        }
    }

    /// <summary>Gets or sets the behavior when pattern doesn't match.</summary>
    public NonMatchBehavior NonMatchBehavior
    {
        get => _nonMatchBehavior;
        set
        {
            if (SetProperty(ref _nonMatchBehavior, value))
            {
                ClearTestResult();
                NotifyChanged();
            }
        }
    }

    // Test feature properties
    public string TestInput
    {
        get => _testInput;
        set => SetProperty(ref _testInput, value ?? string.Empty);
    }

    public string TestResultValue
    {
        get => _testResultValue;
        set => SetProperty(ref _testResultValue, value);
    }

    public string TestResultLabel
    {
        get => _testResultLabel;
        set => SetProperty(ref _testResultLabel, value);
    }

    public string TestResultIcon
    {
        get => _testResultIcon;
        set => SetProperty(ref _testResultIcon, value);
    }

    public string TestResultBackground
    {
        get => _testResultBackground;
        set => SetProperty(ref _testResultBackground, value);
    }

    public bool HasTestResult
    {
        get => _hasTestResult;
        set => SetProperty(ref _hasTestResult, value);
    }

    public bool HasTestError
    {
        get => _hasTestError;
        set => SetProperty(ref _hasTestError, value);
    }

    public string TestErrorMessage
    {
        get => _testErrorMessage;
        set => SetProperty(ref _testErrorMessage, value);
    }

    public ICommand TestPatternCommand { get; }

    private void ExecuteTestPattern()
    {
        ClearTestResult();

        if (string.IsNullOrEmpty(_pattern))
        {
            HasTestError = true;
            TestErrorMessage = "Pattern is required";
            return;
        }

        try
        {
            var options = _ignoreCase ? RegexOptions.IgnoreCase : RegexOptions.None;
            var regex = new Regex(_pattern, options);

            string result;
            bool matched;

            if (_isFindReplaceMode)
            {
                result = regex.Replace(_testInput, _replacement ?? string.Empty);
                matched = regex.IsMatch(_testInput);
            }
            else
            {
                var match = regex.Match(_testInput);
                if (match.Success && match.Groups.Count > 1)
                {
                    result = match.Groups[1].Value;
                    matched = true;
                }
                else
                {
                    matched = false;
                    result = _nonMatchBehavior switch
                    {
                        NonMatchBehavior.ReturnNull => "(null)",
                        NonMatchBehavior.ReturnEmpty => "(empty)",
                        _ => _testInput
                    };
                }
            }

            HasTestResult = true;
            TestResultValue = result;
            TestResultLabel = matched ? "Output" : "No Match";
            TestResultIcon = matched ? "✓" : "—";
            TestResultBackground = matched ? "#22C55E" : "#F59E0B";
        }
        catch (RegexParseException ex)
        {
            HasTestError = true;
            TestErrorMessage = ex.Message;
        }
    }

    private void ClearTestResult()
    {
        HasTestResult = false;
        HasTestError = false;
        TestResultValue = string.Empty;
        TestResultLabel = string.Empty;
        TestErrorMessage = string.Empty;
    }

    public override TransformerModel ToModel() => new()
    {
        Type = TransformerType,
        ColumnName = _columnName,
        Pattern = _pattern,
        Replacement = _isFindReplaceMode ? _replacement : null,
        IgnoreCase = _ignoreCase,
        NonMatchBehavior = _nonMatchBehavior
    };
}
```

**Step 2: Add using statements**

Add at top of file:

```csharp
using System.Text.RegularExpressions;
using JdeScoping.ConfigManager.Models;
```

**Step 3: Verify it compiles**

Run: `dotnet build NEW/src/Utils/JdeScoping.ConfigManager/JdeScoping.ConfigManager.csproj`
Expected: Build succeeded

**Step 4: Commit**

```bash
git add NEW/src/Utils/JdeScoping.ConfigManager/ViewModels/PipelineSteps/TransformerStepViewModels.cs
git commit -m "$(cat <<'EOF'
feat(configmanager): add RegexTransformerViewModel

Implements ViewModel for Regex transformer editor with:
- Column, Pattern, Replacement, IgnoreCase, NonMatchBehavior properties
- Mode toggle between Find & Replace and Match & Extract
- Live test/preview functionality with error handling
EOF
)"
```

---

## Task 6: Update TransformerFactory

**Files:**
- Modify: `NEW/src/Utils/JdeScoping.ConfigManager/ViewModels/PipelineSteps/TransformerStepViewModels.cs:284-318`

**Step 1: Update factory switch statements and AvailableTypes**

In `TransformerFactory.Create()` method, add case:

```csharp
"regex" => new RegexTransformerViewModel(model, onChanged),
```

In `TransformerFactory.CreateNew()` method, add case:

```csharp
"regex" => new RegexTransformerViewModel(onChanged),
```

Update `AvailableTypes`:

```csharp
public static IReadOnlyList<string> AvailableTypes => ["ColumnDrop", "ColumnRename", "JdeDate", "Regex"];
```

**Step 2: Verify it compiles**

Run: `dotnet build NEW/src/Utils/JdeScoping.ConfigManager/JdeScoping.ConfigManager.csproj`
Expected: Build succeeded

**Step 3: Commit**

```bash
git add NEW/src/Utils/JdeScoping.ConfigManager/ViewModels/PipelineSteps/TransformerStepViewModels.cs
git commit -m "$(cat <<'EOF'
feat(configmanager): register Regex transformer in TransformerFactory

Add Regex to:
- Create() factory method
- CreateNew() factory method
- AvailableTypes list
EOF
)"
```

---

## Task 7: Create RegexEditorView XAML

**Files:**
- Create: `NEW/src/Utils/JdeScoping.ConfigManager/Views/Editors/RegexEditorView.axaml`
- Create: `NEW/src/Utils/JdeScoping.ConfigManager/Views/Editors/RegexEditorView.axaml.cs`

**Step 1: Create the XAML file**

Create `NEW/src/Utils/JdeScoping.ConfigManager/Views/Editors/RegexEditorView.axaml`:

```xml
KeepOriginal ReturnNull ReturnEmpty