user623990

How to use array_udiff() to filter out whole rows which are identical between two 2d arrays?

I have two multidimensional arrays that both look something like this:

Array
(
    [0] => Array (
         'id' => 3,
         'other' => 'some string',
         'timestamp' => 2000-01-01 00:00:00
    ),

    [1] => Array (
         'id' => 6,
         'other' => 'another string',
         'timestamp' => 1835-01-01 00:00:00
    )
)

I'm trying to figure out which elements appear in one array ($b) but not in the other ($a), and whether any elements have changed values. If $a is:

Array
(
    [0] => Array (
         'id' => 3,
         'other' => 'some string',
         'timestamp' => 2000-01-01 00:00:00
    )
)

and $b is:

Array
(
    [0] => Array (
         'id' => 3,
         'other' => 'some string',
         'timestamp' => 2000-01-01 12:12:12
    ),

    [1] => Array (
         'id' => 4,
         'other' => 'some string',
         'timestamp' => 1900-01-01 01:12:23
    )
)

Then the function should return:

Array
(
    [0] => Array (
         'id' => 3,
         'other' => 'some string',
         'timestamp' => 2000-01-01 12:12:12
    ),

    [1] => Array (
         'id' => 4,
         'other' => 'some string',
         'timestamp' => 1900-01-01 01:12:23
    )
)

because the element with id = 3 has been changed (the timestamp field) and the element with id = 4 is new and doesn't appear in the other array.

I've been trying to do this with array_udiff, but I still don't know how it works (it seems to sort both arrays first, but then how does it do comparison?). Is array_udiff the proper method or should I write a custom function?

Upvotes: 0

Views: 7020

Answers (3)

mickmackusa

Reputation: 47991

array_diff() is not suitable for filtering multidimensional data. Use array_udiff() to compare the rows of the two 2D arrays. Here, you want to remove from $b (which I'll call $input) every row that also appears, in its entirety, in $a (which I'll call $filterBy).

The callback passed to any array_u*() function must perform a full three-way comparison, because PHP sorts the arrays internally as a performance optimization. In other words, a callback that only ever returns "1 or 0", "-1 or 0", or "-1 or 1" may produce corrupted or unreliable results.

$input = [
    ['id' => 3, 'other' => 'some string', 'timestamp' => '2000-01-01 12:12:12'],
    ['id' => 4, 'other' => 'some string', 'timestamp' => '1900-01-01 01:12:23'],
    ['id' => 5, 'other' => 'toker', 'timestamp' => '1900-04-20 00:04:20'],
];

$filterBy = [
    ['id' => 3, 'other' => 'some string', 'timestamp' => '2000-01-01 00:00:00'],
    ['id' => 5, 'other' => 'toker', 'timestamp' => '1900-04-20 00:04:20'],
];

var_dump(
    array_udiff(
        $input,
        $filterBy,
        fn($a, $b) => $a <=> $b
    )
);

Output:

array(2) {
  [0]=>
  array(3) {
    ["id"]=>
    int(3)
    ["other"]=>
    string(11) "some string"
    ["timestamp"]=>
    string(19) "2000-01-01 12:12:12"
  }
  [1]=>
  array(3) {
    ["id"]=>
    int(4)
    ["other"]=>
    string(11) "some string"
    ["timestamp"]=>
    string(19) "1900-01-01 01:12:23"
  }
}
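
For illustration, here is a minimal sketch of what the spaceship callback returns for individual rows, and of the fact that array_udiff() keeps the original keys of its first argument. The variable names and sample rows are made up for this example (modelled on the question's data); array_values() is only needed if a list reindexed from 0 is wanted.

```php
<?php
// Hypothetical sample rows, modelled on the question's data.
$old   = ['id' => 3, 'other' => 'some string', 'timestamp' => '2000-01-01 00:00:00'];
$new   = ['id' => 3, 'other' => 'some string', 'timestamp' => '2000-01-01 12:12:12'];
$added = ['id' => 4, 'other' => 'some string', 'timestamp' => '1900-01-01 01:12:23'];

// Arrays with the same keys are compared element by element, in order.
$same    = $old <=> $old; // 0: identical rows
$greater = $new <=> $old; // 1: the timestamp string sorts later
$less    = $old <=> $new; // -1: the reverse comparison

// The unchanged row at key 0 is filtered out; the surviving row keeps key 1.
$diff = array_udiff([$old, $added], [$old], fn($a, $b) => $a <=> $b);

// Reindex from 0 if a plain list is needed.
$list = array_values($diff);
```

Because the keys are preserved, code that later relies on $diff[0] may not find what it expects without the array_values() step.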

Upvotes: 0

Stratford

Reputation: 323

You can use array_udiff() and define your own comparison callback. I assume that both arrays have exactly the same structure.

You can define your own callback function as follows:

function comparison(array $a, array $b): int {
    // Three-way comparison of the relevant fields: returns -1, 0, or 1
    return [$a['id'], $a['other'], $a['timestamp']]
        <=> [$b['id'], $b['other'], $b['timestamp']];
}

The callback function must return a negative integer if the first argument is less than the second, a positive integer if it is greater, or 0 if they are equal. Returning 0 for equal rows and a fixed non-zero value for everything else is not enough: array_udiff() relies on the callback for ordering as well as for equality, so an inconsistent ordering can give unreliable results.

Finally, call array_udiff() as follows:

array_udiff($a, $b, 'comparison');

This returns the elements of $a that are either missing from $b or different in $b.

Note that if you want to compare two arrays where one of them has more elements than the other, you should pass the array with the new elements as the first argument.
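
As a rough sketch of that last point, assuming the question's $a (old rows) and $b (current rows) and a three-way callback named compareRows (a hypothetical name used only here), the argument order decides which side's rows come back:

```php
<?php
// Data from the question: $a holds the old rows, $b the current rows.
$a = [
    ['id' => 3, 'other' => 'some string', 'timestamp' => '2000-01-01 00:00:00'],
];
$b = [
    ['id' => 3, 'other' => 'some string', 'timestamp' => '2000-01-01 12:12:12'],
    ['id' => 4, 'other' => 'some string', 'timestamp' => '1900-01-01 01:12:23'],
];

// Hypothetical helper: orders rows by id, then other, then timestamp.
function compareRows(array $x, array $y): int
{
    return [$x['id'], $x['other'], $x['timestamp']]
        <=> [$y['id'], $y['other'], $y['timestamp']];
}

// $b first: rows of $b that have no identical row in $a (changed or new).
$changedOrNew = array_udiff($b, $a, 'compareRows');

// $a first: rows of $a that have no identical row in $b (changed or removed).
$changedOrRemoved = array_udiff($a, $b, 'compareRows');
```

With this data, $changedOrNew contains both rows of $b, which matches the result the question asks for, while $changedOrRemoved contains only the old version of the row with id = 3.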

Upvotes: 3

minh nguyen

Reputation: 1

The third argument of array_udiff(), data_compare_func, is a function that you define, but it must return an integer less than, equal to, or greater than zero, so it's probably not the right function for your needs. A custom function like this should give you what you need:

// this function loops through both arrays to find a match in the other array
// it will skip entry comparisons when it goes through $arr2 because you already did it the first time
function find_diff($arr1, $arr2) {
    $ret = array();

    // we need to do two loops to find missing entries from both arrays
    $ret = do_loop($arr1, $arr2, $ret, true);
    $ret = do_loop($arr2, $arr1, $ret, false);
    return $ret;
}

// this function does the looping through $arr1 to compare it to entries in $arr2
// you can skip entry comparison if $compare_entries is false
function do_loop($arr1, $arr2, $ret, $compare_entries = true) {
    //look through all of $arr1 for same element in $arr2 based on $id
    for ($i=0;$i<count($arr1);$i++) {
        $id = $arr1[$i]['id'];
        $found = false;

        for ($j=0;$j<count($arr2);$j++) {
            // id match found
            if ($id == $arr2[$j]['id']) {
                $found = true;
                // only compare entries if you need to
                if ($compare_entries) {
                    //check if other field is different
                    if (strcmp($arr1[$i]['other'],$arr2[$j]['other']) != 0) {
                        $ret = add_to_ret($arr1[$i], $ret);
                        break;
                    }
                    //check if timestamp field is different
                    if (strcmp($arr1[$i]['timestamp'],$arr2[$j]['timestamp']) != 0) {
                        $ret = add_to_ret($arr1[$i], $ret);
                        break;
                    }
                } else {
                    break;
                }
            }
        }

        // entry from $arr1[$i] was not found in $arr2
        if (!$found) {
            $ret = add_to_ret($arr1[$i], $ret);
        }
    }
    return $ret;
}


//this function only adds the new entry to $ret if its ID isn't already in $ret
function add_to_ret($entry, $ret) {

    $id = $entry['id'];

    for ($i=0;$i<count($ret);$i++) {
        if ($id == $ret[$i]['id']) {
            //skip adding, it's already in there
            return $ret;
        }
    }
    //add it in
    $ret[] = $entry;
    return $ret;
}
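
The same idea can be sketched more compactly by indexing each array by id first, which avoids the nested loops. This is only an illustration: find_diff_by_id is a hypothetical name, it assumes id values are unique within each array, and it compares whole rows loosely with != rather than calling strcmp() on individual fields.

```php
<?php
// Hypothetical compact variant of find_diff(); assumes 'id' is unique per array.
function find_diff_by_id(array $arr1, array $arr2): array
{
    // Re-key both arrays by their 'id' column for direct lookups.
    $byId1 = array_column($arr1, null, 'id');
    $byId2 = array_column($arr2, null, 'id');
    $ret = [];

    // Entries of $arr1 that are missing from $arr2 or differ from their match.
    foreach ($byId1 as $id => $entry) {
        if (!isset($byId2[$id]) || $entry != $byId2[$id]) {
            $ret[] = $entry;
        }
    }

    // Entries that exist only in $arr2.
    foreach ($byId2 as $id => $entry) {
        if (!isset($byId1[$id])) {
            $ret[] = $entry;
        }
    }

    return $ret;
}

// Usage with the question's data, passing the newer array first.
$a = [
    ['id' => 3, 'other' => 'some string', 'timestamp' => '2000-01-01 00:00:00'],
];
$b = [
    ['id' => 3, 'other' => 'some string', 'timestamp' => '2000-01-01 12:12:12'],
    ['id' => 4, 'other' => 'some string', 'timestamp' => '1900-01-01 01:12:23'],
];
$result = find_diff_by_id($b, $a);
```

Here $result holds the changed row (id = 3, with the newer timestamp) followed by the new row (id = 4).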

Upvotes: 0
