Reputation: 12096
I'm using preg_match to try and capture the 'Data' in this HTML structure, but currently it's not returning anything. I think this may be down to the whitespace?
Just wondering what's wrong in the preg_match call?
html
<td><strong>Title</strong></td>
<td>Data</td>
php
preg_match("~<td><strong>Title</strong></td>
<td>([a-zA-Z0-9 -_]+)</td>~", $html, $match);
Upvotes: 2
Views: 11545
Reputation: 426
Use the s modifier, which makes the dot (.) match newlines as well.
Read more about pattern modifiers in the PHP manual.
preg_match_all('/<td><strong>Title<\/strong><\/td>.*<td>(.*)<\/td>/iUs',$cnt,$preg);
print_r($preg);
Output:
Array
(
    [0] => Array
        (
            [0] => <td><strong>Title</strong></td>
<td>Data</td>
        )

    [1] => Array
        (
            [0] => Data
        )

)
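To make the effect of the s modifier visible, here is a minimal sketch that runs the same pattern with and without it. The sample string in $cnt is an assumption (two cells separated by a single newline), and the lazy quantifiers .*? are swapped in for the U modifier used above, so this is an illustration rather than the exact pattern from this answer.

```php
<?php
// Hypothetical input: the two cells are separated by a single newline.
$cnt = "<td><strong>Title</strong></td>\n<td>Data</td>";

// Without s, the dot cannot match "\n", so the pattern cannot get past the line break.
$withoutS = preg_match('~<td><strong>Title</strong></td>.*?<td>(.*?)</td>~i', $cnt, $m1);

// With s, the dot also matches "\n" and the second cell is captured.
$withS = preg_match('~<td><strong>Title</strong></td>.*?<td>(.*?)</td>~is', $cnt, $m2);

var_dump($withoutS, $withS, $m2[1] ?? null);
```

Using ~ as the delimiter also avoids having to escape every / in the closing tags.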
Upvotes: 0
Reputation: 3591
Sorry, did not test before. \s* matches zero or more whitespace characters, newlines included, so it is your solution here.
preg_match("/<td><strong>Title<\/strong><\/td>\s*<td>([a-zA-Z0-9 -_]+)<\/td>/", $html, $match);
Tested it out. It works now :)
Upvotes: 1
Reputation: 324750
Instead of trying to reproduce the exact sequence of whitespace (which may be hard or even impossible due to line endings), just use \s* to indicate "any number (including zero) of whitespace characters". This includes spaces, tabs, newlines, carriage returns... exactly what you need here.
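A minimal sketch of the \s* approach, assuming a sample string with Windows line endings and some indentation between the cells (the input is made up for illustration). One detail that may be worth checking in the original pattern: inside the class [a-zA-Z0-9 -_], the sequence " -_" is read as a range from space to underscore, which also covers characters such as <, > and /. The sketch below puts the hyphen last so it is treated as a literal.

```php
<?php
// Hypothetical input: cells separated by CRLF plus indentation.
$html = "<td><strong>Title</strong></td>\r\n    <td>Data</td>";

// \s* absorbs any run of spaces, tabs, "\r" and "\n" between the two cells.
// The hyphen is placed last in the class so it is a literal, not a range.
$pattern = '~<td><strong>Title</strong></td>\s*<td>([a-zA-Z0-9 _-]+)</td>~';

$found = preg_match($pattern, $html, $match);
$data = ($found === 1) ? $match[1] : null;

var_dump($data);
```

The same pattern keeps working whether the cells are separated by nothing, a single "\n", or "\r\n" followed by tabs, which is the point of not hard-coding the whitespace.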
Upvotes: 5
Reputation: 7739
If you want to extract data from an HTML file, an XML/DOM parser is often a much better choice.
Anyway, note that the m modifier does not help here: it only changes where ^ and $ match. A regular expression can match across several lines as long as it accounts for the newline characters, for example with \s*, or with the s modifier so that the dot (.) matches newlines too.
See http://php.net/manual/en/reference.pcre.pattern.modifiers.php
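For the parser route, here is a rough sketch using PHP's bundled DOM extension (DOMDocument and DOMXPath), assuming that extension is enabled. The surrounding table markup and the XPath expression are assumptions about what the real page looks like, not something taken from the question.

```php
<?php
// Hypothetical sample document; the real page structure is an assumption.
$html = "<table><tr>\n<td><strong>Title</strong></td>\n<td>Data</td>\n</tr></table>";

// Collect libxml warnings internally instead of printing them for imperfect markup.
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);
// Find the cell whose <strong> reads "Title", then take the next sibling cell.
$nodes = $xpath->query('//td[strong[normalize-space()="Title"]]/following-sibling::td[1]');

$data = ($nodes->length > 0) ? trim($nodes->item(0)->textContent) : null;
var_dump($data);
```

Because the parser works on the element tree, whitespace and line endings between the cells no longer matter.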
Upvotes: 0