Reputation: 683
I have the following string:
29 This is a Page1 6754001 1,2,3,4
6755 This is a Page2 56-0 7654564
This is a Page3 67543-986xx 8 12
This is (Page5)& Container 876-0 6 8xp
From the above, I need to extract the following text:
This is a Page1
This is a Page2
This is a Page3
This is (Page5)& Container
There is always a space between the first number and the text, so there is a space between 29 and This is a Page1. Sometimes the first number is omitted, so the 29 is missing. There is always a space between the text and the next number, so there is a space between This is a Page1 and 6754001, and sometimes there are two spaces. I just need to extract these lines of text. When the leading number is present, the text always starts after a space, so a line can begin like
29 This is a Page1
and the text is always followed by a space, sometimes one and sometimes two.
Any help would be appreciated.
Upvotes: 0
Views: 80
Reputation: 67968
^\d*.*?\s+|(?<=\s)\d{2,}.*(?=\s|$)
Try this. It should work with your latest requirement. See the demo:
http://regex101.com/r/gG5fF6/4
Upvotes: 0
Reputation: 174696
You could try the regex below. It captures the text that is preceded by an optional number at the start of the line and followed by one or more spaces and a digit.
Regex:
^(?:\d+)?\s*(.*?)\s+\d.*
Replacement string:
$1
Through string replacement, the code is:
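using System;
using System.Text.RegularExpressions;

string str = @"29 This is a Page1 6754001 1,2,3,4
6755 This is a Page2 56-0 7654564
This is a Page3 67543-986xx 8 12
This is (Page5)& Container 876-0 6 8xp";
// (?m) makes ^ match at the start of every line; $1 keeps only the captured text.
string result = Regex.Replace(str, @"(?m)^(?:\d+)?\s*(.*?)\s+\d.*", "$1");
Console.WriteLine(result);
Console.ReadLine(); // keeps the console window open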
Output:
This is a Page1
This is a Page2
This is a Page3
This is (Page5)& Container
Or, through the Matches method:
string str = @"29 This is a Page1 6754001 1,2,3,4
6755 This is a Page2 56-0 7654564
This is a Page3 67543-986xx 8 12
This is (Page5)& Container 876-0 6 8xp";
Regex rgx = new Regex(@"(?m)^(?:\d+)?\s*(.*?)\s+\d.*");
// Group 1 holds the text between the optional leading number and the next number.
foreach (Match m in rgx.Matches(str))
    Console.WriteLine(m.Groups[1].Value);
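If it helps to see what each part of the pattern does, here is a rough sketch of the same approach with the regex written out in commented form and wrapped in a small helper. The names TitleExtractor and ExtractTitles are made up for illustration. The use of [ \t] in place of \s is my own variation, on the assumption that a match should never run across a line break.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration; not part of the original answer.
static class TitleExtractor
{
    // Same idea as the pattern above, written in verbose form.
    // Assumption: [ \t] replaces \s so a match cannot span two lines.
    private static readonly Regex Pattern = new Regex(@"
        ^ (?:\d+)?      # optional number at the start of the line
        [ \t]*          # any spaces after that number
        (.*?)           # group 1: the text to keep, as short as possible
        [ \t]+ \d       # one or more spaces, then the first digit of the next field
        .*              # the rest of the line
        ", RegexOptions.Multiline | RegexOptions.IgnorePatternWhitespace);

    // Returns the captured text of every matching line, in order.
    public static List<string> ExtractTitles(string input)
    {
        var titles = new List<string>();
        foreach (Match m in Pattern.Matches(input))
            titles.Add(m.Groups[1].Value);
        return titles;
    }
}
```

Calling TitleExtractor.ExtractTitles(str) on the sample string should give the same four lines as the output above. A line with no number after the text produces no match and is skipped.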
Upvotes: 3