Reputation: 313
I have an array, $arr1, with 5 columns as such:
key  id   name  style  age  whim
0    14   bob   big    33   no
1    72   jill  big    22   yes
2    39   sue   yes    111  yes
3    994  lucy  small  23   no
4    15   sis   med    24   no
5    16   maj   med    87   yes
6    879  Ike   larg   56   no
7    286  Jed   big    23   yes
This array is in a cache, not a database.
I then have a second array with a list of id values -
$arr2 = array(0=>14, 1=>72, 2=>8790)
How do I filter $arr1 so it returns only the rows with the id values in $arr2?
I have tried a filter function (below), array_search, and several others, but cannot figure out how to make it work.
$resultingArray = []; // create an empty array to hold rows
$filter_function = function ($row) use ($arr2) {
foreach ($arr2 as $arr) {
return ($row['id'] == $arr);
}
}
Upvotes: 3
Views: 629
Reputation: 21080
Migrating OP's solution from the question to an answer:
I got my code to work as follows:
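$arr1 = new CachedStuff(); // get cache
$resultingArray = []; // create an empty array to hold rows
$filter_function = function ($row) use ($arr2) {
    // array_search() returns the matching key, which is 0 (falsy) for the
    // first element, so compare against false explicitly
    return array_search($row['id'], $arr2) !== false;
};
$resultingArrayIDs = $arr1->GetIds($filter_function, $resultingArray);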
This gives me two outputs: $resultingArray and $resultingArrayIDs, both of which represent the intersection of $arr1 and $arr2.
Upvotes: 0
Reputation: 48001
This whole task can be accomplished with just one slick, native function call: array_uintersect().
Because the two parameters compared in the custom callback may come from either input array, try to read the id column first and, if there isn't one, fall back to the parameter's own value.
Under the hood, this function performs sorting while evaluating as a means to improve execution time / processing speed. I expect this approach to outperform iterated calls of in_array() purely from the standpoint of minimized function calls.
Code: (Demo)
var_export(
array_uintersect(
$arr1,
$arr2,
fn($a, $b) =>
($a['id'] ?? $a)
<=>
($b['id'] ?? $b)
)
);
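To make the fallback in the callback concrete, here is a minimal sketch using a few rows from the question (trimmed to two columns for brevity, which is my own simplification). It assumes PHP 7.4+ for the arrow function syntax.

```php
<?php
// Sample rows taken from the question, trimmed to two columns for brevity.
$arr1 = [
    ['id' => 14,  'name' => 'bob'],
    ['id' => 72,  'name' => 'jill'],
    ['id' => 39,  'name' => 'sue'],
    ['id' => 879, 'name' => 'Ike'],
];
$arr2 = [14, 72, 8790];

// Each callback argument may be a row (array) or a bare id (int),
// so read the id column when present and fall back to the value itself.
$matches = array_uintersect(
    $arr1,
    $arr2,
    fn($a, $b) => ($a['id'] ?? $a) <=> ($b['id'] ?? $b)
);

// Rows keep their original keys from $arr1.
print_r(array_column($matches, 'name'));
```

Only the rows with id 14 and 72 survive, since 8790 matches nothing in $arr1. Wrap the call in array_values() if you need a zero-indexed list.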
Upvotes: 4
Reputation: 4889
This answer was migrated from a deleted duplicate. Revised to make sense independent of context.
Assume the following sample data (named $items and $select instead of $arr1 and $arr2 for clarity):
// Source data: A multidimensional array with named keys
$items = [
['id' => 1, 'name' => 'Foo'],
['id' => 3, 'name' => 'Bar'],
['id' => 5, 'name' => 'Maz'],
['id' => 6, 'name' => 'Wut'],
];
// Filter values: A flat array of scalar values
$select = [1, 5, 6];
Then, how do we extract $items with an id that matches one of the values in $select? And further, how do we do that in a manner that scales gracefully for larger datasets? Let's look at the possibilities and compare their weights.
1. Optimizing array_filter():
The answer using array_filter certainly gets the job done. However, there's an in_array function call made at each iteration. With small datasets, this is hardly an issue. With larger datasets, repeated function calls in an iteration can result in a significant performance hit. So, for large loops, it's good to "preprocess" the data where possible and use a lighter operation built on language constructs in place of the more expensive function calls.
How to avoid in_array() in loops?
You can "enable" simple index lookups with array_flip($select), i.e. by swapping keys and values, and then using isset (a language construct, not a function!): isset($select[$id]). This performs significantly better than repeated in_array($id, $select) calls for larger datasets: not only is there no function call, but in_array also scans the $select array for a match at every iteration (over and over). Optimized as follows:
$select = array_flip($select);
$selected_items = array_filter($items, function($item) use ($select) {
return isset($select[$item['id']]);
});
Or using an arrow function (PHP 7.4+), which captures variables from the parent scope automatically, i.e. doesn't need the use clause:
$select = array_flip($select);
$selected_items = array_filter($items, fn($item) => isset($select[$item['id']]));
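If you want to check the performance claim on your own machine, a rough timing sketch like the following may help. The dataset sizes are arbitrary assumptions, hrtime() needs PHP 7.3+, and the numbers will vary by PHP version and hardware, so treat it as an illustration rather than a benchmark.

```php
<?php
// Synthetic data (sizes are arbitrary assumptions): 5,000 rows, 2,000 wanted ids.
$items = [];
for ($i = 0; $i < 5000; $i++) {
    $items[] = ['id' => $i, 'name' => "item$i"];
}
$select = range(0, 9995, 5); // every fifth id; only those below 5000 can match

// Variant 1: in_array() scans $select linearly for every row.
$start = hrtime(true);
$viaInArray = array_filter($items, fn($item) => in_array($item['id'], $select, true));
$inArrayMs = (hrtime(true) - $start) / 1e6;

// Variant 2: flip once, then do key lookups with isset().
$start = hrtime(true);
$lookup = array_flip($select);
$viaIsset = array_filter($items, fn($item) => isset($lookup[$item['id']]));
$issetMs = (hrtime(true) - $start) / 1e6;

printf("in_array: %.2f ms, isset: %.2f ms\n", $inArrayMs, $issetMs);
```

Both variants return the same rows; only the lookup strategy differs.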
2. Using Key Intersection
One elegant alternative to filtering is key intersection. First, we re-index the array by the desired lookup key using array_column(), with null for the column key (returns the full row instead of a specific column), and with id for the new index key:
$items_by_id = array_column($items, null, 'id');
This gives you the same source array, but instead of being zero-indexed, it now uses the id column's value for the index key. Then, we're an array_intersect_key away from extracting the selection from the source array:
$selected_items = array_intersect_key($items_by_id, array_flip($select));
Here we flip the $select to intersect keys. Note that array_intersect_key performs better than approaches using array_intersect. (Keys are simple!) Result as expected. See demo of this approach. Finally, here's a one-liner (formatted for easy reading) without the throw-away variable:
$selected_items = array_intersect_key(
array_column($items, null, 'id'),
array_flip($select)
);
N.B. The resulting array will retain the actual id of the item for its index key, instead of the default zero-indexed keys. Keep that in mind if you cross-reference the selected items with your source array later on in your code; and perhaps index items by the proper ID from the beginning.
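A short sketch of that key behaviour, using the sample data from above. The array_values() call at the end is my own addition for the case where you want a plain zero-indexed list back.

```php
<?php
$items = [
    ['id' => 1, 'name' => 'Foo'],
    ['id' => 3, 'name' => 'Bar'],
    ['id' => 5, 'name' => 'Maz'],
    ['id' => 6, 'name' => 'Wut'],
];
$select = [1, 5, 6];

$selected_items = array_intersect_key(
    array_column($items, null, 'id'),
    array_flip($select)
);

// The result is keyed by id, not by the original 0..n positions.
print_r(array_keys($selected_items));

// Re-index from zero if the id keys get in the way later on.
$reindexed = array_values($selected_items);
print_r(array_keys($reindexed));
```

Keeping the id keys is often handy, since $selected_items[5] then gives direct access to the item with id 5.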
Comparing these approaches:
array_filter() incurs one iteration of $items with one (anonymous) function call per array member, plus as many scans of $select as there are items if in_array is used to compare the current item's ID with each $select member. (Use key lookups instead.)
The answer using array_search in a foreach loop carries the same weight: count($items) function calls and a whole lot of redundant passes over the selection/filter array.
The array_intersect_key method 1. iterates over $items once (simple reindexing); 2. iterates over $select once (key/value flip); and 3. walks the keys of the first array and looks each one up in the second. Because PHP array keys are hash-indexed, each lookup is cheap, which makes this much more efficient than repeated array scans for each value. (This function exists specifically for getting intersections, i.e. finding overlaps, after all.)
3. Good Old Foreach Loop
Of course a good old foreach loop will also work perfectly fine. Again, using array_flip() and isset() index lookups, rather than in_array() or array_search(). As follows:
$select = array_flip($select);
$selected_items = [];
foreach($items as $key => $val) {
if (isset($select[$val['id']])) {
$selected_items[] = $items[$key];
}
}
I'd instinctively use this for large datasets (or long comparison lists) where "bare bones" performance is called for, going by "simpler is better". However, you likely won't see a big difference between this and the key intersection approach without massive data to process. (If someone has compared these methods for PHP 8.x, please share the benchmark results.)
Upvotes: 1
Reputation: 22773
Something like this should do it, provided I've understood your question and data structure correctly:
$dataArray = [
[ 'key' => 0, 'id' => 14 , 'name' => 'bob' , 'style' => 'big' , 'age' => 33 , 'whim' => 'no' ],
[ 'key' => 1, 'id' => 72 , 'name' => 'jill' , 'style' => 'big' , 'age' => 22 , 'whim' => 'yes' ],
[ 'key' => 2, 'id' => 39 , 'name' => 'sue' , 'style' => 'yes' , 'age' => 111 , 'whim' => 'yes' ],
[ 'key' => 3, 'id' => 994 , 'name' => 'lucy' , 'style' => 'small' , 'age' => 23 , 'whim' => 'no' ],
[ 'key' => 4, 'id' => 15 , 'name' => 'sis' , 'style' => 'med' , 'age' => 24 , 'whim' => 'no' ],
[ 'key' => 5, 'id' => 16 , 'name' => 'maj' , 'style' => 'med' , 'age' => 87 , 'whim' => 'yes' ],
[ 'key' => 6, 'id' => 879 , 'name' => 'Ike' , 'style' => 'larg' , 'age' => 56 , 'whim' => 'no' ],
[ 'key' => 7, 'id' => 286 , 'name' => 'Jed' , 'style' => 'big' , 'age' => 23 , 'whim' => 'yes' ]
];
$filterArray = [14, 72, 879];
$resultArray = array_filter( $dataArray, function( $row ) use ( $filterArray ) {
return in_array( $row[ 'id' ], $filterArray );
} );
However, your question appears to suggest this data might be coming from a database; is that correct? If so, perhaps it's more efficient to pre-filter the results at the database-level. Either by adding a field in the SELECT query, that represents a boolean value whether a row matched your filter ids, or by simply not returning the other rows at all.
Upvotes: 2
Reputation: 9782
As @DecentDabbler mentioned - if the data is coming out of a database, using an IN on your WHERE will allow you to retrieve only the relevant data.
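For the database case, a minimal sketch of building such an IN clause with placeholders might look like this. The table name people and the helper buildIdFilterQuery() are hypothetical, and the PDO calls are shown only as comments because no connection is created here.

```php
<?php
// Build a parameterized IN (...) clause with one "?" placeholder per id.
// The table name `people` is a hypothetical stand-in for your own schema.
function buildIdFilterQuery(array $ids): string
{
    if ($ids === []) {
        // "IN ()" is invalid SQL, so return a query that matches nothing.
        return 'SELECT * FROM people WHERE 1 = 0';
    }
    $placeholders = implode(', ', array_fill(0, count($ids), '?'));
    return "SELECT * FROM people WHERE id IN ($placeholders)";
}

$arr2 = [14, 72, 879];
$sql = buildIdFilterQuery($arr2);
echo $sql, "\n"; // SELECT * FROM people WHERE id IN (?, ?, ?)

// With an existing PDO connection in $pdo, this could be run as:
// $stmt = $pdo->prepare($sql);
// $stmt->execute(array_values($arr2));
// $rows = $stmt->fetchAll(PDO::FETCH_ASSOC);
```

Binding the ids as parameters, rather than concatenating them into the SQL string, avoids injection problems if the list ever comes from user input.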
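Another way to filter is to use array functions.
array_column extracts the id values from $arr1 (keeping each row's index), array_intersect keeps only the ids that also appear in $arr2, and array_flip then swaps keys and values so that the resulting values are the indices into $arr1 of the rows present in both $arr1 and $arr2.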
$arr1 = [ [ 'id' => 14, 'name' => 'bob'],
['id' => 72, 'name' => 'jill'],
['id' => 39, 'name' => 'sue'],
['id' => 994, 'name' => 'lucy'],
['id' => 879, 'name'=> 'large']];
$arr2 = [ 14,72,879 ];
$intersection = array_flip(array_intersect(array_column($arr1,'id'),$arr2));
foreach ($intersection as $i) {
var_dump($arr1[$i]);
}
Upvotes: 1
Reputation: 1981
One way is with a foreach loop and array_search():
$result = [];
foreach ($arr1 as $value) { // Loop thru $arr1
if (array_search($value['id'], $arr2) !== false) { // Check if id is in $arr2
$result[] = $value; // Push to result if true
}
}
// print result
print_r($result);
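The strict !== false comparison in the loop matters. This small sketch, using a filter list that starts with 14, shows what goes wrong with a loose truthiness check:

```php
<?php
$arr2 = [14, 72, 879];

// array_search() returns the key of the match, or false when nothing matches.
$key = array_search(14, $arr2); // int 0, because 14 is the first element

var_dump((bool) $key);               // bool(false): a loose check misses the match
var_dump($key !== false);            // bool(true): the strict comparison finds it
var_dump(in_array(14, $arr2, true)); // bool(true): in_array() avoids the issue entirely
```

If you only need a yes/no answer and not the key, in_array() expresses the intent more directly.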
Upvotes: 1