AhmedC

Reputation: 27

Removing first non-empty <p> tag from a row

I have a few thousand rows that contain HTML data:

<p>non useful data</p>
<p>useful data</p>

I want to remove the first non-empty <p> tag

<p >non empty</p >

and update the row with only

<p>useful data</p>

I tried

$request = mysql_query("select * from content limit 50") or die('Error: '.mysql_error());
while ($r = mysql_fetch_array($request)) {
    $id = $r['id'];
    $text = $r['text'];
    preg_match('@<p>.*?</p>(.*)@', $text, $matches);
    $srcfinal = $matches[1];   // capture group: whatever follows the first </p>
    $srcrestant = $matches[0]; // the entire match
    echo "$srcfinal<br />";
}

I can extract the unwanted data ($srcfinal), but I can't find a way to print the data I need (and then update the row with it).

Is there any lightweight code I could use for this?

Upvotes: 2

Views: 246

Answers (2)

dognose

Reputation: 20899

Why don't you use a regular XML/HTML parser? With one, removing the first <p> tag is no big deal.

Assuming your result might not contain valid HTML, you have various options to achieve your goal - be creative :)

And these are just the options without Regex.
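
As one possible illustration of the parser route, here is a minimal sketch using PHP's bundled DOM extension. The helper name remove_first_paragraph_dom() and the wrapping <div> are assumptions made for this example, not part of the answer, and the sketch assumes ASCII content (non-ASCII text may need an explicit encoding hint when loading).

```php
<?php
// Hypothetical helper: remove the first non-empty <p> from an HTML fragment
// with DOMDocument instead of a regular expression.
function remove_first_paragraph_dom(string $html): string
{
    $doc = new DOMDocument();

    // Stored fragments may not be valid HTML, so collect parser warnings
    // silently instead of printing them.
    $previous = libxml_use_internal_errors(true);
    // Wrap the fragment in a <div> so there is one known container element.
    $doc->loadHTML('<div>' . $html . '</div>');
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // The wrapper is the first <div> in document order.
    $wrap = $doc->getElementsByTagName('div')->item(0);

    // Remove the first <p> whose text content is not empty, then stop.
    foreach ($wrap->getElementsByTagName('p') as $p) {
        if (trim($p->textContent) !== '') {
            $p->parentNode->removeChild($p);
            break;
        }
    }

    // Serialize only the wrapper's children, not the <html>/<body> scaffolding.
    $out = '';
    foreach ($wrap->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return trim($out);
}
```

Calling it on the sample row from the question should leave only the second paragraph. The parser may normalize markup it considers malformed, so it is worth checking the output on a few real rows before updating anything.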

Upvotes: 1

zx81

Reputation: 41838

It sounds like you'd like to retrieve the rows in PHP, then replace the first <p>...</p> tag with an empty string. This will do it (the \R* also removes any newline characters after the paragraph):

$replaced = preg_replace('%^<p>.*?</p>\R*%', '', $yourstring);
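
To show how that replacement might sit inside the loop from the question, here is a small sketch. The function name remove_first_paragraph() is hypothetical, and the added s modifier (so that . also matches newlines inside the first paragraph) and the limit of 1 are assumptions on top of the pattern above.

```php
<?php
// Hypothetical helper wrapping the preg_replace() call shown above.
function remove_first_paragraph(string $html): string
{
    // ^       : only match a paragraph at the very start of the string
    // .*?     : lazily match up to the first closing </p>
    // \R*     : also swallow any line breaks that follow the paragraph
    // s flag  : assumption, lets . span newlines inside the paragraph
    // limit 1 : replace at most one occurrence
    $result = preg_replace('%^<p>.*?</p>\R*%s', '', $html, 1);

    // preg_replace() returns null on error; fall back to the input unchanged.
    return $result ?? $html;
}

// Example rows standing in for the database result set.
$rows = [
    ['id' => 1, 'text' => "<p>non useful data</p>\n<p>useful data</p>"],
];

foreach ($rows as $r) {
    $cleaned = remove_first_paragraph($r['text']);
    // Escaped for display only; $cleaned is what would be written back.
    echo htmlspecialchars($cleaned), "<br />\n";
}
```

Writing $cleaned back would be an UPDATE keyed on the row's id, using whichever database API the project already relies on.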

I would also consider doing it directly in the database with UPDATE... SET...:

  • LOCATE() will find your first </p>
  • RIGHT() will give you the characters to the right of that.
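
The two functions above could be combined roughly as follows. This is a sketch that assumes MySQL and the table and column names from the question (content, text). The WHERE clause is an addition that limits the update to rows starting with a <p>, and CHAR_LENGTH() is used to work out how many characters remain to the right. The small PHP function mirrors the same arithmetic so it can be checked without a database; its name is hypothetical.

```php
<?php
// Assumed query, not tested against the asker's schema.
// LOCATE() is 1-based and '</p>' is 4 characters long, so the tag ends at
// position LOCATE(...) + 3; everything to the right of that is kept.
$sql = <<<'SQL'
UPDATE content
SET `text` = RIGHT(`text`, CHAR_LENGTH(`text`) - (LOCATE('</p>', `text`) + 3))
WHERE `text` LIKE '<p>%'
  AND LOCATE('</p>', `text`) > 0
SQL;

// Hypothetical PHP mirror of the SQL expression, for checking the arithmetic.
function strip_through_first_close(string $text): string
{
    $pos = strpos($text, '</p>');   // 0-based, unlike LOCATE()
    if ($pos === false) {
        return $text;               // no closing tag: leave the row alone
    }
    return substr($text, $pos + 4); // keep everything after '</p>'
}
```

Unlike the \R* in the regex version, this leaves any line break that followed the first paragraph in place. Running the SET expression in a SELECT first, or on a copy of the table, is a cheap way to confirm the result before updating thousands of rows.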

Upvotes: 1
