Ângelo Rigo

Reputation: 2159

Detect and extract image URL from text and HTML tags

How can I detect whether a piece of text contains an HTML image tag, and extract just the URL of the image?

E.g.

Extract this URL:

http://www.someurl.com/somefileprocessor.php/12345/somedir/somesubdir/someniceimage.jpg

from this tag (the tag may be embedded in other text and/or HTML):

<img title="Some nice title" border="0" hspace="0" alt="some useful hint"
src="http://www.someurl.com/somefileprocessor.php/12345/somedir/somesubdir/someniceimage.jpg" width="629" height="464" />

Thanks in advance, Ângelo

Upvotes: 0

Views: 3147

Answers (3)

Emissary

Reputation: 10148

A quick attempt at an <img/> tag specific regex:

preg_match_all('/<img[^>]*?\s+src\s*=\s*"([^"]+)"[^>]*?>/i', $str, $matches);
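As a rough illustration of how the result comes back, here is a small sketch that runs the same pattern against a string modeled on the tag in the question. The surrounding text in `$str` and the `$urls` variable are made up for the example. Note that the pattern only handles double-quoted `src` values.

```php
<?php
// Sample input modeled on the question: an <img> tag embedded in other text.
$str = '<p>Intro text</p><img title="Some nice title" border="0" hspace="0" '
     . 'alt="some useful hint" '
     . 'src="http://www.someurl.com/somefileprocessor.php/12345/somedir/somesubdir/someniceimage.jpg" '
     . 'width="629" height="464" /> trailing text';

// Same pattern as above; capture group 1 holds the value of src.
preg_match_all('/<img[^>]*?\s+src\s*=\s*"([^"]+)"[^>]*?>/i', $str, $matches);

// $matches[0] holds the full matched tags, $matches[1] the captured URLs.
$urls = $matches[1];
print_r($urls);
```

If no image tag is present, `$matches[1]` is simply an empty array, so the same call also answers the "is there an image at all" part of the question.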

Example

Upvotes: 2

Ângelo Rigo

Reputation: 2159

Thanks a lot for the answers. As I am still learning PHP, I tried this quick and dirty way; it also extracts the image URL:

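// Keep everything from the first "src" onward.
$imageurl    = strstr($title, 'src', FALSE);
// Keep everything from the opening quote onward.
$imageurl    = strstr($imageurl, '"', FALSE);
// Drop the opening quote (it sits at position 0).
$imageurlpos = strpos($imageurl, '"');
$imageurl    = substr($imageurl, $imageurlpos + 1);
// Cut at the closing quote, leaving just the URL.
$imageurlpos = strpos($imageurl, '"');
$imageurl    = substr($imageurl, 0, $imageurlpos);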
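String slicing like this only finds the first `src` and assumes double quotes. If the input can contain several images or single-quoted attributes, an HTML parser may be more forgiving. The following is a minimal sketch using PHP's built-in DOM extension. It is a different technique from the one above, and the function name `extractImageUrls` is hypothetical.

```php
<?php
// Hypothetical helper: returns the src of every <img> in an HTML fragment.
function extractImageUrls(string $html): array
{
    // loadHTML() rejects empty input, so bail out early.
    if (trim($html) === '') {
        return [];
    }

    $doc = new DOMDocument();
    // Collect libxml warnings internally instead of emitting them,
    // since fragments and sloppy markup are expected here.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $urls = [];
    foreach ($doc->getElementsByTagName('img') as $img) {
        if ($img->hasAttribute('src')) {
            $urls[] = $img->getAttribute('src');
        }
    }
    return $urls;
}

print_r(extractImageUrls("some text <img alt='x' src='a.png'> more text"));
```

This requires the DOM extension, which is enabled in default PHP builds but may be packaged separately on some systems.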

Upvotes: 0

Moein Hosseini

Reputation: 4373

You can use cURL to fetch the content and then extract all img tags from it. To get the data with cURL:

function get_data($url) {
    $ch = curl_init();
    $timeout = 5;
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
    $data = curl_exec($ch);
    curl_close($ch);
    return $data;
}

Then use a regular expression to extract the data:

^https?://(?:[a-z\-]+\.)+[a-z]{2,6}(?:/[^/#?]+)+\.(?:jpg|gif|png)$

This helps you extract all image URLs, whether or not they are inside an img tag.
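One caveat: as written, the `^` and `$` anchors make the pattern match only when the whole string is a single URL. To pull URLs out of a larger page, the anchors have to go. The sketch below is an adaptation, not the exact pattern above: it drops the anchors, adds the `i` flag, and also excludes whitespace, quotes, and angle brackets from path segments so a match stops at the end of the URL. The sample `$html` and the variable names are made up for illustration.

```php
<?php
// Made-up sample: one bare image URL, one inside an <img> tag, one non-image URL.
$html = 'See http://www.someurl.com/a/b/pic.jpg and '
      . '<img src="https://img.example.org/x/y.png" alt=""> '
      . 'plus http://example.com/page.html';

// Unanchored adaptation of the pattern above (an assumption, see lead-in).
$pattern = '~https?://(?:[a-z\-]+\.)+[a-z]{2,6}(?:/[^/#?\s"\'<>]+)+\.(?:jpg|gif|png)~i';

preg_match_all($pattern, $html, $found);

// $found[0] holds every full match, i.e. the image URLs.
$imageUrls = $found[0];
print_r($imageUrls);
```

Like the original, this only accepts host names made of letters and hyphens and only the three listed extensions, so it will miss URLs with digits in the host or with a query string.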

If you need a crawler, you can use PHPCrawl.

Upvotes: 1
