Reputation: 65
I'm searching for this:
<h1> sample string 123.456 - find me </h1>
Please note that it's what's between the h1 tags that interests me. Please also note that the string is a variable that can contain any combination of numbers, letters and/or other characters. Therefore the following would also need to be found between the h1 tags using the same preg_match_all search:
<h1>there are no numbers this time</h1>
or
<h1>this one may be tricky ?!-.</h1>
I've now tried the following:
preg_match_all("/<h1>[\w\d\D\s]+?<\/h1>$/siU", $input, $matches);
print_r($matches);
The script runs... but the $matches array contains no values when I print_r() it. It therefore looks like this: 'Array ( [0] => Array ( ) )'
Upvotes: 0
Views: 94
Reputation: 23892
Using a parser is probably your best option. Your question and comments are unclear and contradictory about what you are trying to identify.
$doc = new DOMDocument();
libxml_use_internal_errors(true);
$html = '<h1>Hi</h1><h2>test</h2><strong>Test</strong><h1>More</h1>';
$doc->loadHTML($html);
libxml_use_internal_errors(false);
$h1s = $doc->getElementsByTagName('h1');
foreach ($h1s as $h1) {
    echo $h1->nodeValue . "\n";
}
You could then use a regex on the nodeValue to confirm the value is as expected.
Output:
Hi
More
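As a rough sketch of that last step, the parser and the regex check could be combined as below. The helper name h1TextsMatching and the '/\d/' pattern are only illustrative assumptions, not something from the question; swap in whatever check you actually need.

```php
<?php
// Hypothetical helper (name is an assumption): returns the trimmed text of
// every <h1> element whose content matches $pattern.
function h1TextsMatching(string $html, string $pattern): array
{
    $doc = new DOMDocument();
    // Collect libxml warnings internally instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $texts = [];
    foreach ($doc->getElementsByTagName('h1') as $h1) {
        $text = trim($h1->nodeValue);
        // Keep the heading only if the regex finds a match in its text.
        if (preg_match($pattern, $text) === 1) {
            $texts[] = $text;
        }
    }
    return $texts;
}

$html = '<h1> sample string 123.456 - find me </h1><h2>skip</h2><h1>this one may be tricky ?!-.</h1>';
// Assumed check: keep only headings that contain at least one digit.
print_r(h1TextsMatching($html, '/\d/'));
```

With the sample above, only the first heading contains a digit, so only its text is returned.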
A regex for what your initial question described (letters and digits only) could be:
<h1>[a-zA-Z\d]+?<\/h1>
Demo: https://regex101.com/r/lD5wQ3/1
Upvotes: 1
Reputation: 524
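preg_match_all("%^<h1>([a-zA-Z0-9\s]*)</h1>$%siU", $input, $matches);
With the capture group, $matches[1] holds the text inside the <h1> tags, and $matches[0] holds the full match with the tags included. Note that the character class only allows letters, digits and whitespace, and that the ^ and $ anchors require the whole input to be a single <h1> element.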
Upvotes: 0
Reputation: 4952
The question is, what is the expected result? You can try this:
$input = '<h1> Alphanumeric value here </h1>';
preg_match_all("/^<h1>(.*)<\/h1>/su", $input, $matches);
print_r($matches);
Result:
Array
(
    [0] => Array
        (
            [0] => <h1> Alphanumeric value here </h1>
        )

    [1] => Array
        (
            [0] =>  Alphanumeric value here 
        )

)
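Note that the pattern above is anchored with ^ and uses a greedy (.*), so it expects the input to start with <h1> and would run through to the last </h1> if there were several. A minimal sketch of a variant for input that holds more than one heading, assuming plain <h1> tags without attributes (the helper name extractH1 is made up for illustration):

```php
<?php
// Hypothetical helper (name is an assumption): returns the trimmed contents
// of every <h1>...</h1> pair found in $input.
function extractH1(string $input): array
{
    // (.*?) is lazy, so each match stops at the nearest </h1>.
    // "s" lets "." span newlines; "i" also accepts <H1>.
    preg_match_all('/<h1>(.*?)<\/h1>/si', $input, $matches);
    // $matches[1] holds the captured group for each match.
    return array_map('trim', $matches[1]);
}

$input = "<h1> sample string 123.456 - find me </h1>\n"
       . "<p>other markup</p>\n"
       . "<h1>there are no numbers this time</h1>\n"
       . "<h1>this one may be tricky ?!-.</h1>\n";

print_r(extractH1($input));
```

This picks up all three sample strings from the question. There is no $ anchor here, so it does not matter what follows the closing tag.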
Upvotes: 0