kat

Reputation: 217

Regex pattern for complex html div

Any ideas what might be wrong with this regex - it doesn't seem to find anything:

function ad_content($content) {
    if (is_single()) {
    $find = '#<div id=\"attachment_(\d+)\" class=\"wp-caption aligncenter\" style=\"width: (\d+)px\">(.*?)</div>#s';
    $replace1 = '11111';
    $content = preg_replace($find,$replace,$content,1);
    }
    return $content;
}
add_filter ('the_content','ad_content');

I've tried with something basic like

$find = '#attachment#';

and that does work.

When I use the above regex it doesn't replace anything, and gives no errors either. Thus, I suppose it just doesn't find anything. Here's what it should find:

<div id="attachment_167" class="wp-caption aligncenter" style="width: 600px"><a href="http://www.url.com"><img class="size-full wp-image-167" alt="text" src="http://www.url.com" width="600" height="776" /></a><p class="wp-caption-text">text &#8211; text</p></div>

I've tried it at this regex validator and it does match.

ANSWER:

I think I've finally figured it out: the the_content hook doesn't seem to apply to my div. Simple as that.

Upvotes: 0

Views: 289

Answers (2)

dognose

Reputation: 20909

You should use a DOM parser to get the content of the "correct" div.

Imagine there is another div inside it, or the div itself is nested:

  <div> 
    Something else
      <div id="thisIwantToMatch"> Foo <div>Bar</div> Baz </div>
    Again something else
  </div>

Since the end tag does not carry any attributes, it is hard, bordering on impossible, to find the right one using a regex. A "lazy" regex will match <div id="thisIwantToMatch"> Foo <div>Bar</div>, while a greedy regex will match <div id="thisIwantToMatch"> Foo <div>Bar</div> Baz </div>Again something else</div>.
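To make the difference concrete, here is a minimal sketch that runs both variants against the nested markup from the example. The $html string is that example flattened onto one line, and the variable names are only illustrative.

```php
<?php
// The nested markup from the example, flattened onto one line.
$html = '<div> Something else <div id="thisIwantToMatch"> Foo <div>Bar</div> Baz </div> Again something else </div>';

// Lazy quantifier: stops at the first </div>, which closes the inner div.
preg_match('#<div id="thisIwantToMatch">.*?</div>#s', $html, $lazy);

// Greedy quantifier: runs on to the last </div>, which closes the outer div.
preg_match('#<div id="thisIwantToMatch">.*</div>#s', $html, $greedy);

echo $lazy[0], "\n";   // cut off before " Baz "
echo $greedy[0], "\n"; // swallows " Again something else </div>" as well
```

Neither result ends at the closing tag that actually belongs to the div with the id.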

Obviously, neither of these is what you want.
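As a rough sketch of the DOM parser route, PHP's bundled DOMDocument and DOMXPath classes can pick out the div by its id and return it with its correct closing tag. This assumes the dom extension is enabled (it usually is); the $html string is the example above flattened onto one line.

```php
<?php
$html = '<div> Something else <div id="thisIwantToMatch"> Foo <div>Bar</div> Baz </div> Again something else </div>';

// Collect parser warnings internally instead of printing them.
libxml_use_internal_errors(true);

$doc = new DOMDocument();
$doc->loadHTML($html); // wraps the fragment in <html><body>, which is fine for querying
libxml_clear_errors();

// Select the div by its id attribute; the parser has already paired up the tags.
$xpath = new DOMXPath($doc);
$node  = $xpath->query('//div[@id="thisIwantToMatch"]')->item(0);

// Serialize just that node, including its nested child div.
$matched = $node !== null ? $doc->saveHTML($node) : '';
echo $matched, "\n";
```

Because the parser tracks nesting, the inner <div>Bar</div> stays inside the result and the outer div's text stays out of it.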

Upvotes: 0

femtoRgon

Reputation: 33351

Your regex looks correct to me, really.

When I change $replace1 to $replace, to agree with usage later in the function, and remove the if statement, it seems to work. That is:

function ad_content($content) {
    $find = '#<div id=\"attachment_(\d+)\" class=\"wp-caption aligncenter\" style=\"width: (\d+)px\">(.*?)</div>#s';
    $replace = '11111';
    $content = preg_replace($find,$replace,$content,1);
    return $content;
}

Seems to work as intended. I'm guessing that the $replace1 vs. $replace problem probably isn't in the code you're actually executing (since you are seeing no errors), so are you sure that is_single() is returning true in the context in which you are testing this?
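One way to tell "the regex never matched" apart from "the filter never ran on this markup" is to read the optional fifth argument of preg_replace, which receives the number of replacements made. The sketch below runs outside WordPress; the $content string is an assumed, shortened version of the div from the question.

```php
<?php
// Assumed sample input: the caption div from the question with shortened inner markup.
$content = '<p>Intro</p><div id="attachment_167" class="wp-caption aligncenter" style="width: 600px"><p class="wp-caption-text">text</p></div>';

$find    = '#<div id=\"attachment_(\d+)\" class=\"wp-caption aligncenter\" style=\"width: (\d+)px\">(.*?)</div>#s';
$replace = '11111';

// The fifth argument is filled with the number of replacements performed.
$result = preg_replace($find, $replace, $content, 1, $count);

if ($result === null) {
    // preg_replace returns null when the regex engine itself fails.
    echo 'Regex error code: ', preg_last_error(), "\n";
} elseif ($count === 0) {
    echo "Pattern ran but matched nothing in this input.\n";
} else {
    echo "Replaced {$count} match(es): {$result}\n";
}
```

Inside the actual filter, the same $count check could be written to the log with error_log() to see whether the caption div is present in $content at the point the callback runs.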

Upvotes: 1
