user1703991

Reputation: 11

How to parse images interspersed with text in an HTML DOMDocument?

I'm trying to parse a field in which images stand in for some letters and numbers. The image might be a fancy drop cap for the first letter of a paragraph, or it might replace a letter or number in the middle of the text. For example, the phrase

"Four scores and 7 years ago"

appears in the markup as

<img src=/img/F.png>our scores and <img src=/img/7.png"> years ago

with images replacing some of the letters and numbers.

I can already extract the letter or number that should replace each image, but I don't understand how to insert it into the text at the exact position of the <img> tag. My code is based on an example in the PHP docs:

if ($label === 'Text Field') {
   // getElementsByTagName() returns a live list that shrinks as nodes are replaced,
   // so copy it into a plain array before modifying the document
   $img_tags = iterator_to_array($divs->item($i + 1)->getElementsByTagName('img'));

   foreach ($img_tags as $img_tag) {
       // getAttribute() returns an empty string when src is missing, so the match simply fails
       $src = $img_tag->getAttribute('src');
       if (preg_match('/name=([a-zA-Z0-9])/', $src, $matches)) {
           // $matches[1] contains the letter/number I want inserted into the parent node in the exact place of the <img> tag
           $replacement = $page->createTextNode($matches[1]);
           $img_tag->parentNode->replaceChild($replacement, $img_tag);
       }
   }
}

Extended example:

Let's say I hit a line like this:

<div class="label">Title</div> 

I know the next field will be a text field:

<div class="value">
  <img src=/img/F.png>our scores and <img src=/img/7.png"> years ago
</div> 

I'm trying to grab the paragraph and turn the images into the letters that I parse from the image names.

Upvotes: 1

Views: 280

Answers (1)

user1655801

Using str_replace on the raw markup may be a better way.

$source = "<img src=/img/F.pNg>our scores and <img src=/img/7.png\"> years ago";

// $images[0] holds each full <img> tag; $images[1] holds its file name without the extension
preg_match_all('/<.*?[=\/]([^\/]*?)\.(?:png|jpeg).*?>/i', $source, $images);

// Swap every matched tag for the character taken from its file name
$result = str_replace($images[0], $images[1], $source);

// Prints 'Four scores and 7 years ago'
print($result . PHP_EOL);
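
If you would rather stay with DOMDocument, here is a minimal sketch of the same replacement done on the DOM. It rests on a few assumptions that are not in the question: the markup in $html is a hypothetical cleaned-up version of the example (quoted attributes), and the character is taken from the file name (/img/F.png gives F) instead of the name= pattern used in the question. The one point that matters is copying the live node list to an array before calling replaceChild.

```php
<?php
// Hypothetical markup modelled on the question's example, with quoted attributes.
$html = '<div class="label">Title</div>'
      . '<div class="value"><img src="/img/F.png">our scores and <img src="/img/7.png"> years ago</div>';

$page = new DOMDocument();
$page->loadHTML($html);

// getElementsByTagName() returns a live list; copy it so that replacing
// nodes does not shift the remaining items while we loop.
$imgs = iterator_to_array($page->getElementsByTagName('img'));

foreach ($imgs as $img) {
    // Assumed naming scheme: the file name without its extension is the character.
    $char = pathinfo($img->getAttribute('src'), PATHINFO_FILENAME);
    if (preg_match('/^[a-zA-Z0-9]$/', $char)) {
        // Put a text node exactly where the <img> element was.
        $img->parentNode->replaceChild($page->createTextNode($char), $img);
    }
}

// Read the text of the value div now that the images are gone.
$xpath = new DOMXPath($page);
$text = $xpath->query('//div[@class="value"]')->item(0)->textContent;

print($text . PHP_EOL);
```

With the sample markup above, this should print the sentence with the letters restored. If your src values follow the name= scheme from the question, swap the pathinfo() line for your own preg_match.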

Upvotes: 1
