Reputation: 6461
My function is extremely slowly! Does anyone know what I did wrong or if it is possible to make it faster?
function explodeTree($array, $delimiter = "_", $baseval = false) {
if(!is_array($array)) return false;
$splitRE = "/" . preg_quote($delimiter, "/") . "/";
$returnArr = array();
foreach ($array as $key => $val) {
$parts = preg_split($splitRE, $val['path'], -1, PREG_SPLIT_NO_EMPTY);
$leafPart = array_pop($parts);
$parentArr = &$returnArr;
foreach ($parts as $part) {
if (!isset($parentArr[$part])) {
$parentArr[$part] = array();
} elseif (!is_array($parentArr[$part])) {
if ($baseval) {
$parentArr[$part] = array("__base_val" => $parentArr[$part]);
} else {
$parentArr[$part] = array();
}
}
$parentArr = &$parentArr[$part];
}
if (empty($parentArr[$leafPart])) {
$parentArr[$leafPart] = $val;
} elseif ($baseval && is_array($parentArr[$leafPart])) {
$parentArr[$leafPart]["__base_val"] = $val;
}
}
return $returnArr;
}
Array before:
array(4) {
[0]=>
array(3) {
["path"]=>
string(30) "Volumes/folder1/horse/fred"
["age"]=>
string(2) "12"
["name"]=>
string(4) "fred"
}
[1]=>
array(3) {
["path"]=>
string(28) "Volumes/folder1/cat/john"
["age"]=>
string(2) "10"
["name"]=>
string(4) "john"
}
[2]=>
array(3) {
["path"]=>
string(27) "Volumes/folder2/cat/sam"
["age"]=>
string(2) "11"
["name"]=>
string(3) "sam"
}
[3]=>
array(3) {
["path"]=>
string(32) "Volumes/folder2/cat/cat/john"
["age"]=>
string(2) "16"
["name"]=>
string(4) "john"
}
}
Array after using function:
array(1) {
["Volumes"]=>
array(2) {
["folder1"]=>
array(2) {
["horse"]=>
array(1) {
["fred"]=>
array(3) {
["path"]=>
string(30) "Volumes/folder1/horse/fred"
["age"]=>
string(2) "12"
["name"]=>
string(4) "fred"
}
}
["cat"]=>
array(1) {
["john"]=>
array(3) {
["path"]=>
string(28) "Volumes/folder1/cat/john"
["age"]=>
string(2) "10"
["name"]=>
string(4) "john"
}
}
}
["folder2"]=>
array(1) {
["cat"]=>
array(2) {
["sam"]=>
array(3) {
["path"]=>
string(27) "Volumes/folder2/cat/sam"
["age"]=>
string(2) "11"
["name"]=>
string(3) "sam"
}
["cat"]=>
array(1) {
["john"]=>
array(3) {
["path"]=>
string(32) "Volumes/folder2/cat/cat/john"
["age"]=>
string(2) "16"
["name"]=>
string(4) "john"
}
}
}
}
}
}
Upvotes: 1
Views: 52
Reputation: 72176
It seems your code produces the same output, no matter if $baseval
is TRUE
or FALSE
. The following code produces the same output, runs fast and gracefully ignores the value of $baseval
too:
function explodeTree(array $array, $delimiter = "_", $baseval = false)
{
# Build the output here
$returnArr = array();
foreach ($array as $item) {
# Split the path using the delimiter, drop the empty segments
$pieces = array_filter(explode($delimiter, $item['path']));
# Turn the path into a nested array
# Each component of the path is the only key on its level
# Build it from the leaf up to the root
$a = array_reduce(
array_reverse($pieces), # Start from the leaf
function (array $carry, $piece) { # Create parent node...
return array($piece => $carry); # ... use the path piece as key
},
$item # Put the item itself as leaf
);
# Combine the new path (nested arrays) into the existing tree
# array_merge_recursive() takes care of all the levels
$returnArr = array_merge_recursive($returnArr, $a);
}
# That's all
return $returnArr;
}
I think the purpose of $baseval
is to put the original properties of an item into a new entry under the key __base_val
if a subsequent path adds children to a leaf node. For example, if the last entry has 'Volumes/folder2/cat/sam/john'
as path
then the output of the current code ends with:
["folder2"]=>
array(1) {
["cat"]=>
array(1) {
["sam"]=>
array(4) {
["path"]=>
string(23) "Volumes/folder2/cat/sam"
["age"]=>
string(2) "11"
["name"]=>
string(3) "sam"
["john"]=>
array(3) {
["path"]=>
string(28) "Volumes/folder2/cat/sam/john"
["age"]=>
string(2) "16"
["name"]=>
string(4) "john"
}
}
}
}
but the expected output should be (I think):
["folder2"]=>
array(1) {
["cat"]=>
array(1) {
["sam"]=>
array(2) {
["__base_val"]=>
array(3) {
["path"]=>
string(23) "Volumes/folder2/cat/sam"
["age"]=>
string(2) "11"
["name"]=>
string(3) "sam"
}
["john"]=>
array(3) {
["path"]=>
string(28) "Volumes/folder2/cat/sam/john"
["age"]=>
string(2) "16"
["name"]=>
string(4) "john"
}
}
}
}
The above function cannot be modified to produce this output. array_merge_recursive()
doesn't change the structure of the arrays it is provided to merge; it just combines them.
A complete rethink and rewrite of the function is needed:
function explodeTree(array $array, $delimiter = "_", $baseval = false)
{
# Build the output here
$returnArr = array();
foreach ($array as $item) {
# Split the path using the delimiter, drop the empty segments
$pieces = array_filter(explode($delimiter, $item['path']));
# Keep a reference to the current node; start from the root of the tree we build
$pos = &$returnArr;
foreach ($pieces as $piece) {
if (! array_key_exists($piece, $pos)) {
# The path component doesn't exist in the tree; add it
$pos[$piece] = array();
} elseif ($baseval && array_key_exists('path', $pos[$piece])) {
# The component exists, it is a leaf node (has 'path' property) and $baseval is TRUE
# Save the existing node content
$val = $pos[$piece];
# Replace it with a new level; store the old leaf in '__base_val'
$pos[$piece] = array('__base_val' => $val);
}
# Advance to the next level
$pos = &$pos[$piece];
}
# If $baseval is TRUE, make sure we don't mix leaf nodes with inner nodes
if ($baseval && ! empty($pos)) {
# The node already has children; put the item in '__base_val'
$pos['__base_val'] = $item;
} else {
# The node was just added; store $item in it
$pos = array_merge($pos, $item);
}
unset($pos);
}
return $returnArr;
}
Upvotes: 2
Reputation: 4419
You might be running into huge amounts of memory reallocations. If you add a $begin = microtime(true);
at the top of the outer foreach
, and then printf('%0.6f', microtime(true) - $begin);
at the end of the loop, do you see the times staying at a roughly stable level or does it get slower as the execution advances?
It looks like you're building a trie. In which case https://github.com/ikcede/PHP-Trie may be of use.
Upvotes: 0