LargeTuna

Reputation: 2824

Remove all non-duplicates from an array in PHP

I have an array and, using PHP, I need to remove every element whose "listingCode" is NOT duplicated elsewhere in the array. For instance:

Array
(
    [0] => Array
    (
        [name] => Supplier A
        [listingCode] => ABC
    )
    [1] => Array
    (
        [name] => Supplier B
        [listingCode] => ABC
    )
    [2] => Array
    (
        [name] => Supplier B
        [listingCode] => DEF
    )
    [3] => Array
    (
        [name] => Supplier C
        [listingCode] => XYZ
    )
    [4] => Array
    (
        [name] => Supplier D
        [listingCode] => BBB
    )
    [5] => Array
    (
        [name] => Supplier E
        [listingCode] => ABCDEF
    )
    [6] => Array
    (
        [name] => Supplier F
        [listingCode] => ABCDEF
    )
)

I have 1.2M records in this array. When all is said and done, I just want elements 0, 1, 5 and 6 left in the array. Is this possible?

All of this data comes from 3 tables. I only want to display suppliers whose listing code also appears on another listing. For instance, listings 1, 2, 6 and 7 have duplicated listing codes, so Suppliers A, B, E and F should be displayed.

Supplier
----------------------
ID| Supplier Name
1 | Supplier A
2 | Supplier B
3 | Supplier B
4 | Supplier C
5 | Supplier D
6 | Supplier E
7 | Supplier F

Product
----------------------
ID| Product Name | Supplier ID
1 | ABC          | 1
2 | DEF          | 2
3 | GHI          | 3
4 | JKL          | 4
5 | MNO          | 5
6 | PQR          | 6 
7 | STU          | 7

Listing
----------------------
ID| Listing Code | Product ID
1 | ABC          | 1
2 | ABC          | 2
3 | DEF          | 3
4 | XYZ          | 4
5 | BBB          | 5
6 | ABCDEF       | 6 
7 | ABCDEF       | 7

Thanks

Upvotes: 1

Views: 368

Answers (2)

Don't Panic

Reputation: 41810

This doesn't exactly answer your question, but I decided to try to offer an alternative approach that will generate a data structure that may be more usable.

foreach ($supplier_products as $item) {
    $products[$item['productName']][] = $item['name'];
}

This will yield an array with product names as keys and, for each product name, an array of its suppliers as the value. Then, if you want only the products with multiple suppliers, you can count the suppliers in an array_filter() callback:

$duplicate_products = array_filter($products, function($product) {
    return count($product) > 1; 
});

This will end up with an array like:

Array ( 
    [ABC] => Array ( 
        [0] => Supplier A 
        [1] => Supplier B 
    )
    [ABCDEF] => Array (
        [0] => Supplier E 
        [1] => Supplier F
    )
)

which, granted, is not exactly what you asked for, but in my opinion will be easier to work with.
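
For reference, here is a minimal runnable sketch of the grouping approach above, applied to the sample rows from the question. It assumes the rows are keyed by 'listingCode' as in the question's dump (the snippets above use 'productName'), and the variable names $by_code and $duplicate_codes are only illustrative.

```php
<?php
// Sample rows copied from the question. The 'listingCode' key is an
// assumption taken from the question's dump; adjust it to your real key.
$supplier_products = [
    ['name' => 'Supplier A', 'listingCode' => 'ABC'],
    ['name' => 'Supplier B', 'listingCode' => 'ABC'],
    ['name' => 'Supplier B', 'listingCode' => 'DEF'],
    ['name' => 'Supplier C', 'listingCode' => 'XYZ'],
    ['name' => 'Supplier D', 'listingCode' => 'BBB'],
    ['name' => 'Supplier E', 'listingCode' => 'ABCDEF'],
    ['name' => 'Supplier F', 'listingCode' => 'ABCDEF'],
];

// Group supplier names under each listing code.
$by_code = [];
foreach ($supplier_products as $item) {
    $by_code[$item['listingCode']][] = $item['name'];
}

// Keep only the codes that have more than one supplier.
$duplicate_codes = array_filter($by_code, function ($suppliers) {
    return count($suppliers) > 1;
});

print_r($duplicate_codes);
```

The grouping makes a single pass over the rows, so it should stay cheap compared with comparing every row against every other row.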


After your edit, I think this query will get you a list of suppliers with duplicate listing codes:

SELECT
    s.supplier_name
FROM
    listing l1 
    INNER JOIN listing l2 ON l1.listing_code = l2.listing_code AND l1.id != l2.id
    INNER JOIN product p ON l1.product_id = p.id
    INNER JOIN supplier s ON p.supplier_id = s.id
GROUP BY
    s.supplier_name

Upvotes: 1

Mark Baker

Reputation: 212412

array_filter() is a standard PHP function that returns a subset of an array's values based on a callback condition:

$data = [
    ['name' => 'Supplier A', 'productName' => 'ABC'],
    ['name' => 'Supplier B', 'productName' => 'ABC'],
    ['name' => 'Supplier B', 'productName' => 'DEF'],
    ['name' => 'Supplier C', 'productName' => 'XYZ'],
    ['name' => 'Supplier D', 'productName' => 'BBB'],
    ['name' => 'Supplier E', 'productName' => 'ABCDEF'],
    ['name' => 'Supplier F', 'productName' => 'ABCDEF']
];

$result = array_filter(
    $data,
    function($value) use ($data) {
        return count(array_filter(
            $data,
            function ($match) use ($value) {
                return $match['productName'] === $value['productName'];
            }
        )) > 1;
    }
);
var_dump($result);

This loops through each array element in turn, executing a callback that counts how many records in the original array share its productName (the element itself included). The callback returns true if there is more than 1 matching record, indicating that the element should be retained after the filtering.

And yes, it does preserve the original keys.


However, an array with 1.2M records takes up an enormous amount of PHP's precious memory, and because the nested array_filter() calls compare every record against every other record, the filtering will be incredibly slow with that volume of data. It would be far better to do this via SQL.
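
If the filtering does have to happen in PHP, one way to avoid the nested callback is to count each productName once up front and then filter against those counts. This is only a sketch of that alternative, not the code from the answer above; it relies on the built-in array_column() and array_count_values() functions, and $counts is an illustrative name.

```php
<?php
$data = [
    ['name' => 'Supplier A', 'productName' => 'ABC'],
    ['name' => 'Supplier B', 'productName' => 'ABC'],
    ['name' => 'Supplier B', 'productName' => 'DEF'],
    ['name' => 'Supplier C', 'productName' => 'XYZ'],
    ['name' => 'Supplier D', 'productName' => 'BBB'],
    ['name' => 'Supplier E', 'productName' => 'ABCDEF'],
    ['name' => 'Supplier F', 'productName' => 'ABCDEF'],
];

// One pass to count how often each productName occurs.
$counts = array_count_values(array_column($data, 'productName'));

// Second pass: keep rows whose productName occurs more than once.
// array_filter() preserves the original keys.
$result = array_filter($data, function ($row) use ($counts) {
    return $counts[$row['productName']] > 1;
});

print_r(array_keys($result));
```

Wrapping the result in array_values() would reindex it from 0 if the original keys are not needed. The memory concern still applies, since the whole array has to be loaded before it can be filtered.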

Upvotes: 3
