GtDriver

Reputation: 65

Struggling to match a string using preg_match_all()

I'm searching for this:

<h1> sample string 123.456 - find me </h1>

Please note that it's what's between the h1 tags that interests me. Please also note that the string is variable and can contain any combination of numbers, letters and/or other characters. Therefore the following would also need to be found between the h1 tags using the same preg_match_all search:

<h1>there are no numbers this time</h1>

or

<h1>this one may be tricky ?!-.</h1>

I've now tried the following:

preg_match_all("/<h1>[\w\d\D\s]+?<\/h1>$/siU", $input, $matches);
print_r($matches);

The script runs, but the $matches array contains no values when I print_r() it. The output is 'Array ( [0] => Array ( ) )'.

Upvotes: 0

Views: 94

Answers (4)

fsn

Reputation: 539

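The pattern below matches all three of your strings (the hyphen goes last in the character class so that it is treated as a literal character):

<h1>\s?[a-z0-9\s?!.-]*<\/h1>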
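As a quick check, a character-class pattern of this kind can be run against the three headings from the question. This is only a minimal sketch: the capture group, the `i` flag, and the names `$inputs`, `$pattern`, and `$found` are illustrative assumptions, and the class ends with `-` so that the hyphenated headings match.

```php
<?php
// Sample inputs copied from the question.
$inputs = [
    '<h1> sample string 123.456 - find me </h1>',
    '<h1>there are no numbers this time</h1>',
    '<h1>this one may be tricky ?!-.</h1>',
];

// The character class lists every allowed character; "-" is last so it is literal.
// The parentheses capture the heading text without the surrounding tags.
$pattern = '/<h1>\s?([a-z0-9\s?!.-]*)<\/h1>/i';

$found = [];
foreach ($inputs as $input) {
    if (preg_match_all($pattern, $input, $matches)) {
        // $matches[1] holds the captured text of each matched heading.
        $found = array_merge($found, $matches[1]);
    }
}
print_r($found);
```

Any character outside the class (for example `,` or `&`) would make a heading fail to match, so the class has to be extended to whatever the real input may contain.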

Upvotes: 0

chris85

Reputation: 23892

Using a parser is probably your best option. Your question and comments are unclear and contradictory about what you are trying to identify.

$doc = new DOMDocument();
libxml_use_internal_errors(true);
$html = '<h1>Hi</h1><h2>test</h2><strong>Test</strong><h1>More</h1>';
$doc->loadHTML($html);
libxml_use_internal_errors(false);
$h1s = $doc->getElementsByTagName('h1');
foreach ($h1s as $h1) {
    echo $h1->nodeValue . "\n";
}

You could then use a regex on the nodeValue to confirm the value is as expected.

Output:

Hi
More
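To illustrate checking the nodeValue with a regex, here is a rough sketch that combines the two steps. The validation rule (keep only headings that contain at least one digit) and the names `$html` and `$withDigits` are made up for the example; substitute whatever check fits your data.

```php
<?php
// Sample markup; the two <h1> strings are taken from the question.
$html = '<h1> sample string 123.456 - find me </h1><h2>skip</h2>'
      . '<h1>this one may be tricky ?!-.</h1>';

$doc = new DOMDocument();
libxml_use_internal_errors(true);   // collect parser warnings instead of printing them
$doc->loadHTML($html);
libxml_clear_errors();
libxml_use_internal_errors(false);

// Hypothetical rule: keep only headings that contain at least one digit.
$withDigits = [];
foreach ($doc->getElementsByTagName('h1') as $h1) {
    $text = trim($h1->nodeValue);
    if (preg_match('/\d/', $text) === 1) {
        $withDigits[] = $text;
    }
}
print_r($withDigits);
```

The parser finds the elements and the regex only has to judge plain text, so the pattern does not need to know anything about tags.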

A regex for what your initial question asked could be:

<h1>[a-zA-Z\d]+?<\/h1>

Demo: https://regex101.com/r/lD5wQ3/1

Upvotes: 1

Luka Govedič

Reputation: 524

preg_match_all("%^<h1>([a-zA-Z0-9\s]*)</h1>$%siU", $input, $matches);

The capture group puts the text inside the <h1> tags into $matches[1], so if you want the tags included, simply do

"<h1>".$result."</h1>"

Upvotes: 0

odan

Reputation: 4952

The question is, what is the expected result? You can try this:

$input = '<h1> Alphanumeric value here </h1>';
preg_match_all("/^<h1>(.*)<\/h1>/su", $input, $matches);
print_r($matches);

Result:

Array
(
    [0] => Array
        (
            [0] => <h1> Alphanumeric value here </h1>
        )

    [1] => Array
        (
            [0] =>  Alphanumeric value here 
        )

)
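If the input can hold more than one heading, a greedy `(.*)` anchored with `^` would capture everything up to the last closing tag. A possible variation, offered as a sketch with a made-up `$input` and an assumed `trim` step, uses a lazy quantifier and no anchor:

```php
<?php
// Hypothetical input containing two headings and other markup in between.
$input = '<h1> first heading </h1><p>text</p><h1>second ?!-.</h1>';

// Lazy (.*?) stops at the nearest </h1>; the "s" flag lets "." span newlines.
preg_match_all('/<h1>(.*?)<\/h1>/s', $input, $matches);

// $matches[1] holds the captured text of each heading; trim the padding spaces.
$texts = array_map('trim', $matches[1]);
print_r($texts);
```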

Upvotes: 0
