Sandip Patel

Reputation: 267

Removing a portion of text using regex?

Here is what I have so far. The lazy pattern .*? matches everything up to the first occurrence of whatever follows it. For example, $html = preg_replace('/alt=".*?"/', '', $html); removes everything from alt=" through the next quotation mark. My problem is that I now have to match up to a sequence of several characters. Here is the portion of text I want to remove:

<a href="http://feeds.feedburner.com/~ff/TheWindowsClub?a=tjWEu-9hLFk:Jv9oVdSsx2A:yIl2AUoC8zA"><img src="http://feeds.feedburner.com/~ff/TheWindowsClub?d=yIl2AUoC8zA" border="0"></img></a> <a href="http://feeds.feedburner.com/~ff/TheWindowsClub?a=tjWEu-9hLFk:Jv9oVdSsx2A:qj6IDK7rITs"><img src="http://feeds.feedburner.com/~ff/TheWindowsClub?d=qj6IDK7rITs" border="0"></img></a> <a href="http://feeds.feedburner.com/~ff/TheWindowsClub?a=tjWEu-9hLFk:Jv9oVdSsx2A:gIN9vFwOqvQ"><img src="http://feeds.feedburner.com/~ff/TheWindowsClub?i=tjWEu-9hLFk:Jv9oVdSsx2A:gIN9vFwOqvQ" border="0"></img></a> <a href="http://feeds.feedburner.com/~ff/TheWindowsClub?a=tjWEu-9hLFk:Jv9oVdSsx2A:I9og5sOYxJI"><img src="http://feeds.feedburner.com/~ff/TheWindowsClub?d=I9og5sOYxJI" border="0"></img></a> <a href="http://feeds.feedburner.com/~ff/TheWindowsClub?a=tjWEu-9hLFk:Jv9oVdSsx2A:cGdyc7Q-1BI"><img src="http://feeds.feedburner.com/~ff/TheWindowsClub?d=cGdyc7Q-1BI" border="0"></img></a></div><img src="http://feeds.feedburner.com/~r/TheWindowsClub/~4/tjWEu-9hLFk" height="1" width="1" alt=""/>

Unlike last time, I can't stop at a quotation mark or any other single character. I have to delete the whole line. One thing I tried was this:

$html = preg_replace('/<a href=".*?(alt=""/>)/', '', $html);

I expected this to find the last part of the segment and remove everything up to it, but it replaces nothing. What should I do?

After running the line above, the output should be empty: the whole block should be removed.

Upvotes: 0

Views: 50

Answers (1)

Ivan Gabriele

Reputation: 6900

<a\s+href.*(alt="[^"]*")?>

or, without the question mark:

<a\s+href.*(alt="[^"]*"){0,1}>

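The pattern matches anything that starts with <a, followed by at least one whitespace character and href, then any characters up to a >. Because .* is greedy, that is the last > on the line. Just before that >, there may be zero or one occurrence of alt="..." whose value contains anything but a ". Note that . does not match a newline unless the s modifier is set, so this only works while the block sits on a single line.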
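As a rough sketch of how the first pattern could be plugged into preg_replace: the $html string below is a shortened, made-up stand-in for the feed markup in the question (the "abc" identifiers are placeholders), and # is used as the delimiter so that the / in /> needs no escaping. An unescaped / inside a /-delimited pattern is probably why the attempt in the question did nothing, since PHP then reads the rest of the pattern as modifiers.

```php
<?php
// Shortened, hypothetical stand-in for the markup in the question.
$html = 'Intro text. '
    . '<a href="http://feeds.feedburner.com/~ff/TheWindowsClub?a=abc">'
    . '<img src="http://feeds.feedburner.com/~ff/TheWindowsClub?d=abc" border="0"></img></a> '
    . '</div><img src="http://feeds.feedburner.com/~r/TheWindowsClub/~4/abc" height="1" width="1" alt=""/>';

// '#' is the delimiter, so '/' can appear in the subject or pattern unescaped.
$pattern = '#<a\s+href.*(alt="[^"]*")?>#';

// The greedy .* runs from the first <a href to the last > on the line.
$cleaned = preg_replace($pattern, '', $html);

// Only the text before the first <a href is left.
var_dump($cleaned);
```

If the block can span several lines, adding the s modifier (#...#s) lets . cross newlines, at the cost of removing everything up to the last > in the whole string.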

Upvotes: 1
