Furquan Khan

Reputation: 1594

Extract a certain substring using regex in C#

I have a string that starts with a section number, say 1.2.2, followed by a known phrase and then some other text. I want to extract the section number together with that phrase. I have written a regex to match them.

Below is my code:

var word = "1.2.3 area consent testing, sklfjsdlkf jdifgjds visjeflk area consent testing lsdajfgo idsjgosa jfikdjfl343 fjdsl45jl sfgjsoiaetj l area consent testing";
var lowerWord = "area consent testing".ToLower();
var textLower = @word.ToLower().ToString();
Dictionary<int, string> matchRegex = new Dictionary<int, string>();
matchRegex.Add(1, @"(^\d.+(?:\.\d+)*[ \t](" + lowerWord + "))"); 


foreach (var check in matchRegex)
{
    string AllowedChars = check.Value;
    Regex regex = new Regex(AllowedChars);
    var match = regex.Match(textLower);
    if (match.Success)
    {
        var sectionVal = match.Value;
    }
}

My problem is that I only want the value 1.2.3 area consent testing in my sectionVal variable, but the match gives me the whole line instead, i.e.:

sectionVal = "1.2.3 area consent testing, sklfjsdlkf jdifgjds visjeflk area consent testing lsdajfgo idsjgosa jfikdjfl343 fjdsl45jl sfgjsoiaetj l area consent testing";

Upvotes: 1

Views: 2010

Answers (1)

Titian Cernicova-Dragomir

Reputation: 249636

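The start of your regex contains an unescaped . followed by a +, so \d.+ matches a single digit and then one or more of any character. Because + is greedy, it runs to the end of the line and only backtracks as far as the last occurrence of the phrase, which is why you get the whole line back. Match the leading digits with \d+ instead. Try this: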

@"^(\d+(\.\d+)*[ \t](" + lowerWord + "))"
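For context, here is a minimal sketch of how the corrected pattern might be used end to end. The class name SectionExtractor and the method Extract are hypothetical names chosen for illustration, and the use of Regex.Escape and RegexOptions.IgnoreCase is my own assumption rather than part of the original code: escaping keeps the phrase literal if it ever contains regex metacharacters, and the option stands in for lowercasing both strings by hand.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, not from the original post.
public static class SectionExtractor
{
    // Returns "<section number> <phrase>" taken from the start of text,
    // or an empty string when text does not start that way.
    public static string Extract(string text, string phrase)
    {
        // \d+(?:\.\d+)* matches a dotted section number such as 1.2.3.
        // Regex.Escape treats the phrase as literal text (assumption: the
        // phrase is plain text, not a pattern).
        string pattern = @"^\d+(?:\.\d+)*[ \t]" + Regex.Escape(phrase);

        // IgnoreCase replaces the manual ToLower() calls on both strings.
        Match match = Regex.Match(text, pattern, RegexOptions.IgnoreCase);
        return match.Success ? match.Value : string.Empty;
    }
}
```

Called as SectionExtractor.Extract(word, "area consent testing") with the string from the question, this should return only the leading 1.2.3 area consent testing, because nothing in the pattern can consume text beyond the first occurrence of the phrase.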

Upvotes: 2
