Ahmad Fouad

Reputation: 4107

Using PHP to extract the alt and/or title attributes from images

I use this to extract the src (the full path) of each image.

preg_match_all('/\< *img[^\>]*src *= *[\"\']{0,1}([^\"\'\ >]*)/',$content,$matches);

It works for me so far: I get an array of all image sources. Now I am trying to be greedy and also capture the alt and title values from the image tags.

I know it is not recommended to use regex to do it, but I really need a quick solution. I do not want it to return an error if alt or title is missing from the image tag.

Any input is appreciated, and apologies. I know it is easier and more appropriate with a parser, but since I could get the src with that preg_match_all I thought I could get the alt and title too! :)

Thanks a lot, happy new year :D

Upvotes: 0

Views: 2077

Answers (3)

Famver Tags

Reputation: 1998

Try this, this is the best I could come up with in 3 minutes...

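// Start with an empty array so print_r() still works when nothing matches.
$img_array = array();

if (preg_match_all('@<img(\s?(src|alt|title)="([^"]+)"\s?)?(\s?(src|alt|title)="([^"]+)"\s?)?(\s?(src|alt|title)="([^"]+)"\s?)?\/?>@si', $content, $m)) {
    // Index 0 is the first matched image; each pair is attribute name => value.
    $img_array = array(
        $m[2][0] => $m[3][0],
        $m[5][0] => $m[6][0],
        $m[8][0] => $m[9][0]
    );
}

print_r($img_array);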
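
If you want every image rather than only the first, and attributes in any order, one possible variation is to do it in two passes: grab each img tag first, then pick the attributes out of it. This is only a rough sketch and not part of the pattern above. The function name extract_img_attributes is made up for illustration, and it assumes quoted attribute values and no ">" inside them.

```php
<?php
// Hypothetical helper (name is an assumption): collect src, alt and title for every <img> tag.
function extract_img_attributes($content)
{
    $images = array();

    // Pass 1: capture the attribute portion of every <img ...> tag.
    if (!preg_match_all('/<img\b([^>]*)>/i', $content, $tags)) {
        return $images;
    }

    foreach ($tags[1] as $attributes) {
        // Missing attributes stay as empty strings, so no notice is raised.
        $image = array('src' => '', 'alt' => '', 'title' => '');

        // Pass 2: match quoted src/alt/title values in any order.
        // The lookbehind keeps names such as data-src from matching as src.
        preg_match_all(
            '/(?<![\w-])(src|alt|title)\s*=\s*(["\'])(.*?)\2/is',
            $attributes,
            $pairs,
            PREG_SET_ORDER
        );

        foreach ($pairs as $pair) {
            $image[strtolower($pair[1])] = $pair[3];
        }

        $images[] = $image;
    }

    return $images;
}

$content = '<p><img src="a.png" alt="First"><img title="Second" src=\'b.png\' /></p>';
print_r(extract_img_attributes($content));
```

Each entry always has the three keys, so the calling code does not need to check whether alt or title was present.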

Upvotes: 2

Maerlyn

Reputation: 34107

Here's a solution using PHP's DOM parser:

$domd = new DOMDocument();
// Collect libxml warnings internally so malformed HTML does not print them.
libxml_use_internal_errors(true);
$domd->loadHTML(file_get_contents("http://stackoverflow.com"));
libxml_use_internal_errors(false);

$items = $domd->getElementsByTagName("img");
$data = array();

foreach($items as $item) {
  // getAttribute() returns an empty string when the attribute is missing.
  $data[] = array(
    "src" => $item->getAttribute("src"),
    "alt" => $item->getAttribute("alt"),
    "title" => $item->getAttribute("title"),
  );
}
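
Since the question already has the markup in $content, here is a minimal sketch of the same DOM approach wrapped in a function that takes an HTML string instead of fetching a URL. The function name extract_img_attributes_dom and the sample markup are assumptions for illustration only.

```php
<?php
// Hypothetical helper (name is an assumption): DOM-based extraction from an HTML string.
function extract_img_attributes_dom($html)
{
    $data = array();

    // loadHTML() rejects an empty string, so bail out early.
    if (trim($html) === '') {
        return $data;
    }

    $doc = new DOMDocument();

    // Collect parser warnings internally, then restore the previous setting.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    foreach ($doc->getElementsByTagName('img') as $img) {
        // getAttribute() returns '' for a missing attribute, so absent alt/title is harmless.
        $data[] = array(
            'src'   => $img->getAttribute('src'),
            'alt'   => $img->getAttribute('alt'),
            'title' => $img->getAttribute('title'),
        );
    }

    return $data;
}

$content = '<p><img src="a.png"><img src="b.png" alt="B" title="T"></p>';
print_r(extract_img_attributes_dom($content));
```

Images without alt or title simply come back with empty strings for those keys.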

Upvotes: 2

hummingBird

Reputation: 2555

Use phpQuery; it does this easily.

http://code.google.com/p/phpquery/ (the good link)

Upvotes: 0
