dearsina

Reputation: 5210

php multidimensional array merge with same key and value

I've spent hours trying to find the answer to this question, without success. I'm reasonably familiar with PHP and its built-in functions, and I could build a complex foreach() loop to do this, but I thought I'd ask whether anyone has a smarter solution to my problem.

I have the following simplified example array with three "rows" (the real array is usually a lot bigger and more complex, but the issue is the same).

$rows[] = [
    "widget_id" => "widget1",
    "size" => "large",
    "item" => [
        "item_id" => "item1",
        "shape" => "circle",
        "paint" => [
            "paint_id" => "paint1",
            "colour" => "red",
        ]
    ]
];

# Exactly the same as above, except the "paint" child array is different
$rows[] = [
    "widget_id" => "widget1",
    "size" => "large",
    "item" => [
        "item_id" => "item1",
        "shape" => "circle",
        "paint" => [
            "paint_id" => "paint2",
            "colour" => "green",
        ]
    ]
];

# Same children ("item" and "paint") as the first row, but different parents ("widget_id" is different)
$rows[] = [
    "widget_id" => "widget2",
    "size" => "medium",
    "item" => [
        "item_id" => "item1",
        "shape" => "circle",
        "paint" => [
            "paint_id" => "paint1",
            "colour" => "red",
        ]
    ]
];

What I'm trying to get to is the following output:

[[
    "widget_id" => "widget1",
    "size" => "large",
    "item" => [
        "item_id" => "item1",
        "shape" => "circle",
        "paint" => [[
            "paint_id" => "paint1",
            "colour" => "red",
        ],[
            "paint_id" => "paint2",
            "colour" => "green",
        ]]
    ]
],[
    "widget_id" => "widget2",
    "size" => "medium",
    "item" => [
        "item_id" => "item1",
        "shape" => "circle",
        "paint" => [
            "paint_id" => "paint1",
            "colour" => "red",
        ]
    ]
]]

Basically, when two rows share the same key and values, merge them. When the key is the same, but the value is different, keep both values and put them in a numerical array under the key (sort of like how array_merge_recursive does it).
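
For context, here is a small sketch of why array_merge_recursive on its own doesn't do what I need. The two rows are reduced versions of the example above (trimmed for brevity, so the exact shape is an assumption):

```php
<?php
// Two reduced rows: same widget, different paint.
$a = ["widget_id" => "widget1", "paint" => ["paint_id" => "paint1", "colour" => "red"]];
$b = ["widget_id" => "widget1", "paint" => ["paint_id" => "paint2", "colour" => "green"]];

$merged = array_merge_recursive($a, $b);

// Identical scalar values are duplicated rather than collapsed into one,
// and the two "paint" rows are interleaved key by key instead of being
// kept as two separate rows.
var_export($merged);
```

So "widget_id" ends up as a two-element list of the same value, and "colour" becomes a list of both colours, which loses the link between each paint_id and its colour.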

The challenge is that the values can themselves be arrays, nested to an unknown number of levels. Is there a smart and efficient way of doing this, or do I have to resort to a heavy-duty foreach loop?

Thank you for browsing, hope there are some people more clever than me reading this!

Upvotes: 2

Views: 2276

Answers (2)

dearsina

Reputation: 5210

Here's my own attempt. I think I prefer AymDev's version though, as it's a lot more succinct. I wonder which is faster.

class ComplexMerge{
    /**
     * Checks to see whether an array has sequential numerical keys (only),
     * starting from 0 to n, where n is the array count minus one.
     *
     * @link https://codereview.stackexchange.com/questions/201/is-numeric-array-is-missing/204
     *
     * @param $arr
     *
     * @return bool
     */
    private static function isNumericArray($arr)
    {
        if(!is_array($arr)){
            return false;
        }
        return array_keys($arr) === range(0, (count($arr) - 1));
    }

    /**
     * Given an array, separate out
     * array values that themselves are arrays
     * and those that are not.
     *
     * @param array $array
     *
     * @return array[]
     */
    private static function separateOutArrayValues(array $array): array
    {
        $valuesThatAreArrays = [];
        $valuesThatAreNotArrays = [];

        foreach($array as $key => $val){
            if(is_array($val)){
                $valuesThatAreArrays[$key] = $val;
            } else {
                $valuesThatAreNotArrays[$key] = $val;
            }
        }

        return [$valuesThatAreArrays, $valuesThatAreNotArrays];
    }

    /**
     * Groups row keys together that have the same non-array values.
     * If every row is already unique and no row has array values
     * left to descend into, returns NULL.
     *
     * @param $array
     *
     * @return array|null
     */
    private static function groupRowKeysWithSameNonArrayValues($array): ?array
    {
        # Initialise up front so an empty input doesn't hit undefined variables
        $deduplicatedArray = [];
        $hasArrayValues = false;

        foreach($array as $key => $row){
            # Separate out the values that are arrays and those that are not
            [$a, $v] = self::separateOutArrayValues($row);

            # Remember whether any row (not just the last one) has child arrays
            $hasArrayValues = $hasArrayValues || $a;

            # Serialise the values that are not arrays and create a unique ID from them
            $uniqueRowId = md5(serialize($v));

            # Store all the original array keys under the unique ID
            $deduplicatedArray[$uniqueRowId][] = $key;
        }

        # If every row is unique and has no child arrays, there are no more rows to combine, and our work is done
        if(!$hasArrayValues && count($array) == count($deduplicatedArray)){
            return NULL;
        }

        return $deduplicatedArray;
    }

    private static function mergeRows(array $array): array
    {
        # Get the grouped row keys
        if(!$groupedRowKeys = self::groupRowKeysWithSameNonArrayValues($array)){
            //If there are no more rows to merge
            return $array;
        }

        foreach($groupedRowKeys as $uniqueRowId => $keys){

            foreach($keys as $id => $key){
                # Separate out the values that are arrays and those that are not
                [$valuesThatAreArrays, $valuesThatAreNotArrays] = self::separateOutArrayValues($array[$key]);
                //We're using the key from the grouped row keys array, but using it on the original array

                # If this is the first row from the group, throw in the non-array values
                if(!$id){
                    $unique[$uniqueRowId] = $valuesThatAreNotArrays;
                }

                # For each of the values that are arrays include them back in
                foreach($valuesThatAreArrays as $k => $childArray){
                    $unique[$uniqueRowId][$k][] = $childArray;
                    //Wrap them in a numerically keyed array so that only children and siblings have the same parent-child relationship
                }
            }
        }

        # Go deeper
        foreach($unique as $key => $val){
            foreach($val as $k => $childValue){
                # Only the wrapped (numerically keyed) child arrays need merging
                if(self::isNumericArray($childValue)){
                    $unique[$key][$k] = self::mergeRows($unique[$key][$k]);
                }
            }
        }

        # No need to include the unique row IDs
        return array_values($unique);
    }

    public static function normalise($array): ?array
    {
        $array = self::mergeRows($array);
        return $array;
    }
}

Usage:

$array = ComplexMerge::normalise($array);
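
The core idea in the class is the grouping step: rows are treated as "the same" when their non-array values serialise to the same string. Here is a minimal standalone sketch of just that step. The function name groupByScalarFingerprint and the reduced rows are hypothetical and not part of the class above:

```php
<?php
/**
 * Hypothetical helper: group row keys by a fingerprint of their non-array values.
 */
function groupByScalarFingerprint(array $rows): array
{
    $groups = [];
    foreach ($rows as $key => $row) {
        // Keep only the values that are not arrays (keys are preserved)
        $nonArrayValues = array_filter($row, fn($val) => !is_array($val));

        // Rows with identical non-array values get the same fingerprint
        $groups[md5(serialize($nonArrayValues))][] = $key;
    }

    // Drop the hashes; only the grouping matters here
    return array_values($groups);
}

$rows = [
    ["widget_id" => "widget1", "size" => "large", "item" => ["item_id" => "item1"]],
    ["widget_id" => "widget1", "size" => "large", "item" => ["item_id" => "item2"]],
    ["widget_id" => "widget2", "size" => "medium", "item" => ["item_id" => "item1"]],
];

// Rows 0 and 1 share a group; row 2 is on its own
print_r(groupByScalarFingerprint($rows));
```

Because the fingerprint comes from serialize(), it is sensitive to key order and to value types, so rows whose non-array keys appear in a different order, or that hold "1" in one row and 1 in another, would not be grouped together.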


Upvotes: 0

AymDev

Reputation: 7609

I managed to get the expected array structure with the following function. I hope the comments make clear what's happening inside:

function complex_merge(array $arr): array
{
    // Grouped items
    $result = [];
    $iterationKey = 0;

    // Loop through every item
    while (($element = array_shift($arr)) !== null) {
        // Save scalar values as is
        $scalarValues = array_filter($element, 'is_scalar');

        // Save array values in an array
        $arrayValues = array_map(fn(array $arrVal) => [$arrVal], array_filter($element, 'is_array'));
        $arrayValuesKeys = array_keys($arrayValues);

        $result[$iterationKey] = array_merge($scalarValues, $arrayValues);

        // Compare with remaining items
        for ($i = 0; $i < count($arr); $i++) {
            $comparisonScalarValues = array_filter($arr[$i], 'is_scalar');

            // Scalar values are same, add the array values to the containing arrays
            if ($scalarValues === $comparisonScalarValues) {
                $comparisonArrayValues = array_filter($arr[$i], 'is_array');
                foreach ($arrayValuesKeys as $arrayKey) {
                    $result[$iterationKey][$arrayKey][] = $comparisonArrayValues[$arrayKey];
                }

                // Remove matching item
                array_splice($arr, $i, 1);
                $i--;
            }
        }

        // Merge array values
        foreach ($arrayValuesKeys as $arrayKey) {
            $result[$iterationKey][$arrayKey] = complex_merge($result[$iterationKey][$arrayKey]);

            // array key contains a single item, extract it
            if (count($result[$iterationKey][$arrayKey]) === 1) {
                $result[$iterationKey][$arrayKey] = $result[$iterationKey][$arrayKey][0];
            }
        }

        // Increment result key
        $iterationKey++;
    }
    return $result;
}

Just pass $rows to the function. Here is a quick check of the values:

echo '<pre>' . print_r(complex_merge($rows), true) . '</pre>';

/*
Displays:
Array
(
    [0] => Array
        (
            [widget_id] => widget1
            [size] => large
            [item] => Array
                (
                    [item_id] => item1
                    [shape] => circle
                    [paint] => Array
                        (
                            [0] => Array
                                (
                                    [paint_id] => paint1
                                    [colour] => red
                                )

                            [1] => Array
                                (
                                    [paint_id] => paint2
                                    [colour] => green
                                )

                        )

                )

        )

    [1] => Array
        (
            [widget_id] => widget2
            [size] => medium
            [item] => Array
                (
                    [item_id] => item1
                    [shape] => circle
                    [paint] => Array
                        (
                            [paint_id] => paint1
                            [colour] => red
                        )

                )

        )

)
*/
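
One caveat, sketched under the assumption that your real rows may contain NULL values (the example rows don't): the function splits each row with is_scalar and is_array, and NULL passes neither test, so such a key would silently disappear from the result.

```php
<?php
// Hypothetical row with a NULL value, which the example rows don't have.
$row = ["widget_id" => "widget1", "size" => null, "item" => ["item_id" => "item1"]];

$scalarValues = array_filter($row, 'is_scalar'); // keeps "widget_id" only
$arrayValues  = array_filter($row, 'is_array');  // keeps "item" only

// Anything left over is in neither set and would be missing after merging.
$dropped = array_diff_key($row, $scalarValues, $arrayValues);
var_export(array_keys($dropped));
```

If that matters for your data, replacing the 'is_scalar' filter with a callback that keeps everything that is not an array should be enough, though I haven't tested that against the full function.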

Upvotes: 1
