DeathRox

Reputation: 3

PHP code to extract data from an HTML page, including tags

I've searched and tested for hours and I'm ready to give up. I have an HTML page that will change every now and then; its structure is this:

100 or so lines of HTML
<div class="the start of the info I want">
500 lines of HTML that I want to extract
<div class="end of the info I want">
more lines of HTML

This is my code, which does not work (one of many versions I've tried).

<?php
$data = file_get_contents('http://www.soemstupidsite.xyz');
$regex = '#<div class="the start of the info I want">(.*?)<div
class="end of the info I want">#';
preg_match($regex,$data,$match);
print_r($match);
echo $match[1];
?>

Returns the following error:
PHP Notice: Undefined offset: 1 in /home/www/mycrapcode.php on line 7

What the hell am I doing wrong?

Upvotes: 0

Views: 54

Answers (2)

Blaatpraat

Reputation: 2849

Please read a bit more about regex modifiers (flags).

The flag you need is the s flag (PCRE_DOTALL). It makes . match newline characters as well, so your pattern can match across multiple lines.

Example with your code:

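<?php
$data = file_get_contents('http://www.soemstupidsite.xyz');
// The s modifier lets . match newlines, so (.*?) can span multiple lines.
$regex = '#<div class="the start of the info I want">(.*?)<div class="end of the info I want">#s';
// preg_match() returns 1 on a match; only then is $match[1] set.
if (preg_match($regex, $data, $match) === 1) {
    print_r($match);
    echo $match[1];
} else {
    echo 'No match found';
}
?>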

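Also: the regex needs to be on one line. A line break inside the quoted pattern is matched literally, so a pattern with a newline after <div will not match <div class="..."> written with a space in the page.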
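To see the effect of the s flag in isolation, here is a minimal sketch that runs against an inline string instead of the remote page. The sample markup in $data is made up for illustration, and the use of preg_quote() to build the pattern from the two marker strings is my own choice, not something from the question.

```php
<?php
// Hypothetical stand-in for the downloaded page; the real markup will differ.
$data = "<p>header</p>\n"
      . "<div class=\"the start of the info I want\">\n"
      . "<p>line one</p>\n"
      . "<p>line two</p>\n"
      . "<div class=\"end of the info I want\">\n"
      . "<p>footer</p>\n";

$start = '<div class="the start of the info I want">';
$end   = '<div class="end of the info I want">';

// preg_quote() escapes regex metacharacters in the literal markers.
// Without the s modifier, . stops at newlines, so this pattern cannot span lines.
$withoutS = '#' . preg_quote($start, '#') . '(.*?)' . preg_quote($end, '#') . '#';
// With the s modifier (PCRE_DOTALL), . also matches newlines.
$withS = $withoutS . 's';

$matchedWithoutS = preg_match($withoutS, $data, $ignored); // no match here
$matchedWithS    = preg_match($withS, $data, $match);      // match here

// Only read $match[1] after a successful match; this avoids the
// "Undefined offset: 1" notice from the question.
$extracted = ($matchedWithS === 1) ? $match[1] : null;

if ($extracted !== null) {
    echo $extracted;
} else {
    echo "Markers not found\n";
}
```

Checking the return value of preg_match() before touching $match[1] also means that if the page changes and the markers disappear, you get a clear message instead of a notice.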

Upvotes: 0

Oleksandr Maliuta

Reputation: 516

    $regex = '/<div class="the start of the info I want">(.*?)<div class="end of the info I want">/s';

Upvotes: 1
