Anjali5

Reputation: 683

extract a string from

I have the following string

29  This is a Page1  6754001  1,2,3,4
6755  This is a Page2 56-0 7654564 
 This is a Page3  67543-986xx 8 12
 This is (Page5)& Container 876-0 6 8xp

From the above, I need to extract the below text

This is a Page1 
 This is a Page2
 This is a Page3
 This is (Page5)& Container

There is always a space between the first number and the text, so there is a space between 29 and This is a Page1. Sometimes the first number is omitted, so the 29 is missing. There is always a space between the text and the next number, so there is a space between This is a Page1 and 6754001, and sometimes there can be two spaces. I just need to extract these lines. The text always starts after a space, so it can be

29 This is page1

and the text is always followed by a space, sometimes one space and sometimes two.

Any help will be appreciated.

Upvotes: 0

Views: 80

Answers (2)

vks

Reputation: 67968

^\d*.*?\s+|(?<=\s)\d{2,}.*(?=\s|$)

Try this. It will work with your latest requirement. See the demo:

http://regex101.com/r/gG5fF6/4

Upvotes: 0

Avinash Raj

Reputation: 174696

You could try the below regex to get the text which is preceded by an optional number at the start and followed by one or more spaces and a digit.

Regex:

^(?:\d+)?\s*(.*?)\s+\d.*

Replacement string:

$1


Through string replacement,

Code:

string str = @"29  This is a Page1  6754001  1,2,3,4
6755  This is a Page2 56-0 7654564 
 This is a Page3  67543-986xx 8 12
 This is (Page5)& Container 876-0 6 8xp";
// (?m) makes ^ match at the start of every line; "$1" keeps only the captured text.
string result = Regex.Replace(str, @"(?m)^(?:\d+)?\s*(.*?)\s+\d.*", "$1");
Console.WriteLine(result);
Console.ReadLine();

Output:

This is a Page1
This is a Page2
This is a Page3
This is (Page5)& Container


OR

Through Matches method.

string str = @"29  This is a Page1  6754001  1,2,3,4
6755  This is a Page2 56-0 7654564 
 This is a Page3  67543-986xx 8 12
 This is (Page5)& Container 876-0 6 8xp";
Regex rgx = new Regex(@"(?m)^(?:\d+)?\s*(.*?)\s+\d.*");
// Each match covers one line; group 1 holds the extracted text.
foreach (Match m in rgx.Matches(str))
{
    Console.WriteLine(m.Groups[1].Value);
}
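
For readers who want to see what each part of the pattern does, here is a minimal sketch of the same regex written with RegexOptions.IgnorePatternWhitespace, so that it can be split over several lines and commented. The class name ExtractText and the field name Pattern are made up for this illustration, and the sample input is the string from the question joined with "\n".

```csharp
using System;
using System.Text.RegularExpressions;

class ExtractText
{
    // Same pattern as in the answer, spread out with comments.
    // IgnorePatternWhitespace ignores unescaped whitespace in the pattern
    // and treats everything after # as a comment.
    static readonly Regex Pattern = new Regex(@"
        ^(?:\d+)?   # optional number at the start of the line
        \s*         # any spaces before the text
        (.*?)       # group 1: the text, matched lazily
        \s+\d       # one or more spaces followed by a digit
        .*          # the rest of the line
        ", RegexOptions.Multiline | RegexOptions.IgnorePatternWhitespace);

    static void Main()
    {
        string str = "29  This is a Page1  6754001  1,2,3,4\n" +
                     "6755  This is a Page2 56-0 7654564 \n" +
                     " This is a Page3  67543-986xx 8 12\n" +
                     " This is (Page5)& Container 876-0 6 8xp";

        // Print the captured text of every line that matches.
        foreach (Match m in Pattern.Matches(str))
        {
            Console.WriteLine(m.Groups[1].Value);
        }
    }
}
```

Because the group is lazy, it stops at the first run of spaces that is followed by a digit. A line whose text is not followed by a space and a digit will not match at all, so it is skipped rather than printed.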


Upvotes: 3
