I'm trying to map a list of fields from a website using PHP's DOMXPath and I'm struggling with it. I tried reading the values by absolute position, but that breaks when a field is missing, so I figured it might be possible to use the field names, delimited by the strong tags, to find the correct values. How can I do this?
Website sample:
<div class="container">
<strong>field1: </strong>
<a href="http://link/1">value1</a>
<a href="http://link/2">value2</a>
<br>
<strong>field2:</strong>
<a href="http://link/3">value3</a>
<br>
<strong>field3:</strong>
<a href="http://link/4">value4</a>
</div>
I need something like:
array = {
    field1 =>
        array = {
            'value1',
            'value2'
        },
    field2 => 'value3',
    field3 => 'value4'
}
or
array = {
    field1 => 'value1 value2',
    field2 => 'value3',
    field3 => 'value4'
}
A working example would be most appreciated since I'm just beginning with this subject.
Upvotes: 0
Views: 132
Reputation:
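$dom = new DOMDocument();
$dom->loadHTML($str); // Or however you load your HTML
$xpath = new DOMXPath($dom);
// Each <strong> directly inside the container holds a field name
$items = $xpath->query('//div[@class = "container"]/strong');
$arr = array();
foreach($items as $label)
{
    $name = trim($label->nodeValue, ': ');
    $node_items = array();
    // Walk the following siblings and collect <a> values for this field
    for($node = $label->nextSibling; $node !== null; $node = $node->nextSibling)
    {
        if($node->nodeType != XML_ELEMENT_NODE)
        {
            continue; // Skip whitespace text nodes between the tags
        }
        if($node->nodeName != 'a')
        {
            break; // A <br> or the next <strong> ends this field
        }
        $node_items[] = $node->nodeValue;
    }
    // A single value is stored as a string, several values as an array
    $arr[$name] = count($node_items) == 1 ? $node_items[0] : $node_items;
}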
Gives the result ($arr):
Array ( [field1] => Array ( [0] => value1 [1] => value2 ) [field2] => value3 [field3] => value4 )
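If the second format from the question is preferred (one space-separated string per field), the grouping can also be pushed into XPath itself. The following is only a minimal sketch under a few assumptions: the markup matches the sample above, every value is an a element that is a sibling of its strong label, and the variable names ($html, $fields, $position) are made up for illustration. It relies on the fact that each link has exactly as many preceding strong siblings as the position of the label it belongs to.

```php
<?php
// Sample markup copied from the question; replace with the real page source.
$html = <<<HTML
<div class="container">
<strong>field1: </strong>
<a href="http://link/1">value1</a>
<a href="http://link/2">value2</a>
<br>
<strong>field2:</strong>
<a href="http://link/3">value3</a>
<br>
<strong>field3:</strong>
<a href="http://link/4">value4</a>
</div>
HTML;

$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

$fields = array();
foreach ($xpath->query('//div[@class = "container"]/strong') as $label) {
    // 1-based position of this label among its <strong> siblings
    $position = (int) $xpath->evaluate('count(preceding-sibling::strong)', $label) + 1;

    // Links that follow this label and precede the next label:
    // they have exactly $position <strong> siblings before them
    $links = $xpath->query(
        sprintf('following-sibling::a[count(preceding-sibling::strong) = %d]', $position),
        $label
    );

    $values = array();
    foreach ($links as $link) {
        $values[] = trim($link->nodeValue);
    }

    // Join multiple values into one space-separated string
    $fields[trim($label->nodeValue, ': ')] = implode(' ', $values);
}

print_r($fields);
```

Because each field is looked up relative to its own label, a missing field simply produces no entry instead of shifting the others. A label with no links ends up as an empty string, which may or may not be what you want.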
Upvotes: 1