Reputation: 71
I hope someone can help me, because I'm not familiar with regex.
I need to extract data from a plain HTML page into a PHP array.
The HTML looks like this:
<html>
...some html code...
<div data-companycounter="9879" data-code="A" data-seatcounter="9783" class="">
...some html code...
<div data-companycounter="9879" data-code="B" data-seatcounter="9784" class="">
...some html code...
<div data-companycounter="11397" data-code="A" data-seatcounter="11509" class="">
...some html code...
</html>
I would like to extract the data into an array like this:
$companycounter = [
    9879 => [
        'A' => 9783,
        'B' => 9784,
    ],
    11397 => [
        'A' => 11509,
    ],
];
I hope that's clear enough. Thanks to anyone who can help.
Upvotes: 1
Views: 269
Reputation: 1
As said in the comments, use an HTML parser instead of regex; it makes extracting data from HTML much easier.
First create an object $doc from the DOMDocument class and load the HTML into it.
Then get all divs with the getElementsByTagName method, iterate over them, read each company's attributes, and store them in the $companycounter array, keyed first by company counter and then by code.
$html =
'<div data-companycounter="9879" data-code="A" data-seatcounter="9783"></div>
<div data-companycounter="9879" data-code="B" data-seatcounter="9784"></div>
<div data-companycounter="11397" data-code="A" data-seatcounter="11509"></div>';
$doc = new DOMDocument();
$doc->loadHTML($html);
$divs = $doc->getElementsByTagName('div');
$companycounter = [];
foreach ($divs as $div) {
    // Skip divs that do not carry the company data (a real page has many other divs).
    if (!$div->hasAttribute('data-companycounter')) {
        continue;
    }
    // Read attributes by name so the result does not depend on their order in the tag.
    $counter = $div->getAttribute('data-companycounter');
    $code = $div->getAttribute('data-code');
    $seatcounter = (int) $div->getAttribute('data-seatcounter');
    $companycounter[$counter][$code] = $seatcounter;
}
echo "<pre>";
print_r($companycounter);
The output, as expected:
/*
Array
(
    [9879] => Array
        (
            [A] => 9783
            [B] => 9784
        )

    [11397] => Array
        (
            [A] => 11509
        )

)
*/
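On a full page, where most divs are unrelated, the same idea can be narrowed with an XPath query. The following is only a sketch under some assumptions: the sample markup is adapted from the question (with an extra unrelated div added for illustration), and libxml warnings are silenced because real-world HTML is often not well-formed.

```php
<?php
// Sample markup adapted from the question; the "unrelated" div is an illustrative addition.
$html = '<html><body>
<div data-companycounter="9879" data-code="A" data-seatcounter="9783" class=""></div>
<div class="unrelated">ignored</div>
<div data-companycounter="9879" data-code="B" data-seatcounter="9784" class=""></div>
<div data-companycounter="11397" data-code="A" data-seatcounter="11509" class=""></div>
</body></html>';

// Collect libxml parse warnings internally instead of printing them.
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

// Select only the divs that carry all three data attributes.
$xpath = new DOMXPath($doc);
$nodes = $xpath->query('//div[@data-companycounter and @data-code and @data-seatcounter]');

$companycounter = [];
foreach ($nodes as $div) {
    $counter = $div->getAttribute('data-companycounter');
    $code    = $div->getAttribute('data-code');
    // Cast to int so the values match the array shown in the question.
    $companycounter[$counter][$code] = (int) $div->getAttribute('data-seatcounter');
}

print_r($companycounter);
```

Filtering in the query keeps the loop free of guard conditions, and reading attributes by name means the order in which they appear in the tag does not matter.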
Upvotes: 0
Reputation:
function custom_parse_html($html)
{
    $company_counter = [];
    // [^"]* stops at the closing quote, so several divs on one line are not merged by a greedy match.
    // Note: the pattern assumes the three attributes always appear in this exact order.
    $pattern = '/<div\s+data-companycounter="([^"]*)"\s+data-code="([^"]*)"\s+data-seatcounter="([^"]*)"[^>]*>/i';
    preg_match_all($pattern, $html, $matches, PREG_SET_ORDER);
    foreach ($matches as $match) {
        // $match[1] => data-companycounter
        // $match[2] => data-code
        // $match[3] => data-seatcounter
        // Nested assignment creates the company entry on first use and adds to it afterwards.
        $company_counter[$match[1]][$match[2]] = (int) $match[3];
    }
    return $company_counter;
}
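The pattern above only matches when the three attributes appear in one fixed order. If that cannot be guaranteed, a two-step match is one possible workaround. This is a rough sketch, not part of the answer above: the helper name parse_divs_any_order is hypothetical, and it assumes double-quoted attribute values that contain no ">" character.

```php
<?php
// Hypothetical helper: reads the three data attributes from each <div ...> tag,
// regardless of the order in which they appear.
function parse_divs_any_order(string $html): array
{
    $result = [];
    // Step 1: capture the attribute portion of every opening <div> tag.
    preg_match_all('/<div\b([^>]*)>/i', $html, $tags);
    foreach ($tags[1] as $attrText) {
        // Step 2: collect every data-xxx="value" pair inside that tag.
        preg_match_all('/\bdata-([a-z]+)="([^"]*)"/i', $attrText, $pairs, PREG_SET_ORDER);
        $attrs = [];
        foreach ($pairs as $pair) {
            $attrs[strtolower($pair[1])] = $pair[2];
        }
        // Step 3: skip divs that lack any of the three attributes.
        if (!isset($attrs['companycounter'], $attrs['code'], $attrs['seatcounter'])) {
            continue;
        }
        $result[$attrs['companycounter']][$attrs['code']] = (int) $attrs['seatcounter'];
    }
    return $result;
}

// Attributes deliberately shuffled to show that order no longer matters.
$html = '<div data-code="A" data-companycounter="9879" data-seatcounter="9783" class="">'
      . '<div class="x" data-seatcounter="9784" data-companycounter="9879" data-code="B">';
print_r(parse_divs_any_order($html));
```

This is still regex on HTML, so it stays fragile against single-quoted or unquoted attributes; a DOM parser remains the more robust choice.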
Upvotes: 1