Dan

Reputation: 12096

Ignore whitespace when using preg_match

I'm using preg_match to try to capture the 'Data' in this HTML structure, but it currently returns nothing. I think this may be down to the whitespace.

What's wrong with my preg_match call?

html

  <td><strong>Title</strong></td>

                    <td>Data</td>

php

preg_match("~<td><strong>Title</strong></td>

                    <td>([a-zA-Z0-9 -_]+)</td>~", $html, $match);

Upvotes: 2

Views: 11545

Answers (4)

Obaid

Reputation: 426

Use the s modifier, which lets the dot (.) match newlines as well.

Read more in the PHP manual's section on pattern modifiers.

preg_match_all('/<td><strong>Title<\/strong><\/td>.*<td>(.*)<\/td>/iUs',$cnt,$preg);
print_r($preg);

Output:

Array
(
    [0] => Array
        (
            [0] => <td><strong>Title</strong></td>

                    <td>Data</td>
        )

    [1] => Array
        (
            [0] => Data
        )

)

Upvotes: 0

Dovydas Navickas

Reputation: 3591

Sorry, I did not test before. \s* matches zero or more whitespace characters, so it is the solution here.

// Hyphen placed last in the class so it is a literal, not a range from " " to "_"
preg_match("/<td><strong>Title<\/strong><\/td>\s*<td>([a-zA-Z0-9 _-]+)<\/td>/",
           $html, $match);

Tested it out. It works now :)

Upvotes: 1

Niet the Dark Absol

Reputation: 324750

Instead of trying to reproduce the exact sequence of whitespace (which may be hard or even impossible due to line endings), just use \s* to indicate "any number (including zero) of whitespace characters" - this includes spaces, tabs, newlines, carriage returns... exactly what you need here.
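
As a minimal sketch of this idea: the sample input below mirrors the question's markup, with a carriage return and a tab added as an assumption to show that \s* absorbs mixed whitespace. The capture group [^<]+ is also an assumption, swapped in for the asker's character class to keep the example short.

```php
<?php
// Sample input mirroring the question; the \r\n and \t are added here
// (an assumption) to show that \s* handles mixed whitespace.
$html = "<td><strong>Title</strong></td>\r\n\r\n\t        <td>Data</td>";

// \s* matches zero or more spaces, tabs, newlines, or carriage returns.
// Using ~ as the delimiter avoids escaping the slashes in closing tags.
$pattern = '~<td><strong>Title</strong></td>\s*<td>([^<]+)</td>~';

if (preg_match($pattern, $html, $match) === 1) {
    echo $match[1], PHP_EOL; // the captured cell text
}
```

Because \s* also matches zero characters, the same pattern still works if the two cells end up on one line with nothing between them.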

Upvotes: 5

Oussama Jilal

Reputation: 7739

If you want to extract data from an HTML file, an HTML/XML parser is often a much better tool than a regular expression.
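
A rough sketch of the parser approach, using PHP's bundled DOMDocument and DOMXPath classes. The surrounding table and tr elements and the variable names are assumptions made for illustration; the question only shows the two cells.

```php
<?php
// Sample markup; the <table>/<tr> wrapper is assumed so the input is well formed.
$html = <<<HTML
<table>
  <tr>
    <td><strong>Title</strong></td>

                    <td>Data</td>
  </tr>
</table>
HTML;

$doc = new DOMDocument();
$doc->loadHTML($html);

$xpath = new DOMXPath($doc);
// Find the cell whose <strong> child reads "Title", then take the next <td>.
$cells = $xpath->query('//td[strong="Title"]/following-sibling::td[1]');

// Whitespace between the cells is irrelevant here; only the element order matters.
$data = $cells->length > 0 ? trim($cells->item(0)->textContent) : null;
echo $data, PHP_EOL;
```

With this approach the amount of whitespace or the line endings between the cells no longer affect the result.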

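Anyway, keep in mind how the modifiers affect multi-line input: the m modifier only makes ^ and $ match at the start and end of each line, while the s modifier makes the dot (.) match newlines too. Neither is needed if the pattern matches the whitespace between the tags explicitly, for example with \s*.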

See http://php.net/manual/en/reference.pcre.pattern.modifiers.php

Upvotes: 0
