Reputation: 3157

remove first occurring paragraph tag contents in string

How can I remove the contents of the first paragraph tag that occurs in a string?

Actual String
<p>Hello</p> <p>World</p>

Result
<p>World</p>

One option is to find the positions of the first <p> and the first </p>, and then replace everything between those positions with "".

How can this be achieved with regex?

Upvotes: 1

Answers (3)

hwnd

Reputation: 70750

Use the Regex.Replace method, setting count (the maximum number of replacements) to 1.

// \s* also consumes any whitespace that follows the removed paragraph
Regex rgx     = new Regex(@"<p>.*?</p>\s*");
String input  = @"<p>Hello</p> <p>World</p>";
String result = rgx.Replace(input, "", 1);
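
For reference, here is a minimal, self-contained sketch of the count-limited replacement. The class and method names (ParagraphRemover, RemoveFirst) are hypothetical, and the trailing \s* is an assumption that whitespace following the removed paragraph should go as well.

```csharp
using System;
using System.Text.RegularExpressions;

static class ParagraphRemover
{
    // Lazy .*? stops at the first </p>; \s* (an assumption) also eats trailing whitespace.
    private static readonly Regex FirstParagraph = new Regex(@"<p>.*?</p>\s*");

    // Hypothetical helper: replaces at most one match (count = 1) with an empty string.
    public static string RemoveFirst(string input)
    {
        return FirstParagraph.Replace(input, "", 1);
    }
}

class Program
{
    static void Main()
    {
        // Prints: <p>World</p>
        Console.WriteLine(ParagraphRemover.RemoveFirst("<p>Hello</p> <p>World</p>"));
    }
}
```

Because count is 1, later paragraphs are left alone even though the pattern would match them too.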

Upvotes: 1

zx81

Reputation: 41848

Apart from the warnings about using regex to parse html...

A. If First Paragraph Always Starts at the Beginning of the String

Search: ^<p>.*?</p>
Replace: empty string
The ^ anchor asserts that we are at the beginning of the string.
The lazy .*? ensures that we only match up to the first closing </p>.

In C#:

string resultString = Regex.Replace(yourstring, "^<p>.*?</p>", "");

B. If First Paragraph Can Start Anywhere

Search: (?s)(\A.*?)<p>.*?</p>
Replace: in the delegate function, return Group 1.
(?s) allows the dot to match newlines, in case your first paragraph occurs after the first line.
In (\A.*?), the \A asserts that we are at the beginning of the string, then the lazy .*? matches everything up to the first paragraph. This is all captured to Group 1.
<p>.*?</p> matches the paragraph.
The replacement is Group 1, so the paragraph is deleted.

Here is a full C# program to show how this works (see output at the bottom of the online demo).

using System;
using System.Text.RegularExpressions;

class Program
{
    static void Main()
    {
        var myRegex = new Regex(@"(?s)(\A.*?)<p>.*?</p>");
        string s1 = @"Hey! <p>Hello</p> <p>World</p>";

        // Keep only Group 1 (the text before the first paragraph)
        string replaced = myRegex.Replace(s1, delegate(Match m) {
            return m.Groups[1].Value;
        });
        Console.WriteLine(replaced);
    } // END Main
} // END Program

Upvotes: 0

fantastik78

Reputation: 735

You could match each paragraph, capturing its contents in a group, like this:

string input = @"<p>Hello</p> <p>World</p>";
string pattern = @"<p>(\w*)</p>";
MatchCollection matches = Regex.Matches(input, pattern);
// matches[0] contains <p>Hello</p>
// matches[1] contains <p>World</p>
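
The snippet above only finds the paragraphs. One way to go from there to the requested result is to cut the first match out of the string by its position. This is a sketch, not part of the original answer: the names ParagraphTools and RemoveFirstParagraph are hypothetical, and note that \w* only matches paragraphs made of word characters (no spaces or nested tags).

```csharp
using System;
using System.Text.RegularExpressions;

static class ParagraphTools
{
    // Hypothetical helper: removes the first <p>...</p> match, if there is one.
    public static string RemoveFirstParagraph(string input)
    {
        // Regex.Match returns only the first match, which is all we need here.
        Match first = Regex.Match(input, @"<p>(\w*)</p>");
        if (!first.Success)
            return input; // no paragraph found, nothing to remove

        // Cut out the matched span. TrimStart then drops leading whitespace of the
        // whole result, which gives the expected output when the paragraph opens the string.
        return input.Remove(first.Index, first.Length).TrimStart();
    }
}

class Program
{
    static void Main()
    {
        // Prints: <p>World</p>
        Console.WriteLine(ParagraphTools.RemoveFirstParagraph(@"<p>Hello</p> <p>World</p>"));
    }
}
```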

Upvotes: 0
