How to find URLs in images

Question

I am trying to extract URLs from a large number of Google search results. Getting them from the page source is proving quite challenging, as the delimiters are not clear and not all of the URLs appear in the source. Is there a tool that can extract URLs from a certain area of an image? If so, that may be a better solution.

Any help would be much appreciated.

guillaumepotier · Accepted Answer

Use this excellent lib: http://simplehtmldom.sourceforge.net/manual.htm

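// Load the library (adjust the path to wherever simple_html_dom.php lives)
include 'simple_html_dom.php';

// Grab the source code
$html = file_get_html('http://www.google.com/');

// Find all anchors; returns an array of element objects
$ret = $html->find('a');

// find() returns an array, so read the href attribute from each element
// (a valueless attribute such as checked or selected returns true or false)
foreach ($ret as $a) {
    echo $a->href . "\n";
}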

Edit:

All "natural" search URLs seem to be in the #res div. With simplehtmldom, find the first #res, then all the links inside it. I don't remember the exact syntax, but it should be something like this (the second argument, 0, makes find() return the first match instead of an array):

$ret = $html->find('div[id=res]', 0)->find('a');

or maybe

$html->find('div[id=res] a');
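
If pulling in the library is not an option, the same idea (take only the links inside one container) can be sketched with PHP's built-in DOM extension instead of simplehtmldom. This is only a sketch under assumptions: the sample markup is made up, extractLinks() is a hypothetical helper rather than a library function, and the container id is assumed to be a trusted value, since it is pasted straight into the XPath expression.

```php
<?php
// Hypothetical sample markup standing in for a saved results page.
$sample = '<html><body>'
    . '<div id="nav"><a href="/preferences">Settings</a></div>'
    . '<div id="res">'
    . '<a href="http://example.com/one">One</a>'
    . '<a href="http://example.org/two">Two</a>'
    . '<a name="no-href">Anchor without a link</a>'
    . '</div></body></html>';

// extractLinks() is an illustrative helper, not part of any library.
function extractLinks(string $html, string $containerId): array
{
    $doc = new DOMDocument();
    // Real pages are rarely valid HTML, so collect parse warnings quietly.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    // Every anchor that has an href, anywhere below the element with this id.
    $nodes = $xpath->query('//*[@id="' . $containerId . '"]//a[@href]');

    $urls = [];
    foreach ($nodes as $node) {
        $urls[] = $node->getAttribute('href');
    }
    return $urls;
}

print_r(extractLinks($sample, 'res'));
```

For a live page, the string passed in would come from file_get_contents() or a saved copy of the results page rather than from $sample.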
