Reputation: 21
I am trying to build a crawler that gets the movie URLs from an IMDb list. I can get all the links on the page into an array, and I want to select only the ones with "title" in them.
preg_match_all($pattern, "[125] => href=\"/chart/2000s?mode=popular\" [126] => href=\"/title/tt0111161/\" ", $matches);
where $pattern = '/title/'.
I am getting the following error:
Warning: preg_match_all() [function.preg-match-all]: Delimiter must not be alphanumeric or backslash in C:\xampp\htdocs\phpProject1\index.php on line 53
Any idea on how to go about this? Thanks a lot.
Upvotes: 2
Views: 1263
Reputation: 455132
Are you sure $pattern is '/title/' at the time when preg_match_all is called?
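This warning is raised when the pattern passed to preg_match_all (the 1st argument) is not properly delimited, that is, when its first character is alphanumeric or a backslash. A pattern such as 'title', with no surrounding delimiters, would trigger it; '/title/' would not.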
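For illustration, here is a minimal sketch of a properly delimited pattern. It assumes the input looks like the string in the question. The '#' delimiter and the capture group are my own choices, not something from the original code. Note that with '/title/' the slashes act as delimiters, so that pattern only matches the word "title"; using '#' lets the slashes be part of what is matched.

```php
<?php
// Sample input modeled on the array dump in the question (assumed shape).
$haystack = '[125] => href="/chart/2000s?mode=popular" [126] => href="/title/tt0111161/"';

// '#' is used as the delimiter so the slashes in the path need no escaping.
// The parentheses capture everything from /title/ up to the closing quote.
$pattern = '#href="(/title/[^"]+)"#';

$count = preg_match_all($pattern, $haystack, $matches);

// $matches[1] holds the captured paths, one per matching href.
print_r($matches[1]);
```

If $pattern were built dynamically and ended up without its delimiters, the same call would produce the warning from the question, so printing $pattern right before the call is a quick way to check.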
Upvotes: 1
Reputation: 342655
Use a DOM parser such as Simple HTML DOM:
// Requires the Simple HTML DOM library
include 'simple_html_dom.php';
// Create DOM from URL or file
$html = file_get_html('http://www.example.com/');
// Find all links containing "title" as part of their href
$links = $html->find('a[href*=title]');
// Loop through the links and print each href
foreach ($links as $link) {
    echo $link->href . '<br>';
}
http://simplehtmldom.sourceforge.net/manual.htm
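If you would rather not add a third-party library, roughly the same thing can be sketched with PHP's built-in DOM extension. This is an alternative technique, not the Simple HTML DOM API, and the sample markup and the $movieUrls name are assumptions made up for the example; it also assumes the dom extension is enabled.

```php
<?php
// Sample markup standing in for the fetched list page (assumption).
$html = '<html><body>'
      . '<a href="/chart/2000s?mode=popular">Chart</a>'
      . '<a href="/title/tt0111161/">Movie</a>'
      . '</body></html>';

// Collect libxml parse warnings internally instead of printing them,
// since real-world HTML is rarely perfectly valid.
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

// Select every <a> whose href contains "/title/".
$xpath = new DOMXPath($doc);
$movieUrls = [];
foreach ($xpath->query('//a[contains(@href, "/title/")]') as $anchor) {
    $movieUrls[] = $anchor->getAttribute('href');
}

print_r($movieUrls);
```

For a live page you would load the downloaded HTML into $html instead of the hard-coded string.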
Upvotes: 1