Reputation: 2626

PHP De-duplicate a multidimensional array

Firstly, I realise this may appear to be a duplicate, as I have read a number of questions on a similar topic (1, 2), but I'm struggling to see how to re-architect the code to fit my scenario.

I am attempting to take an existing multi-dimensional array and remove any nodes that have a duplicate in a specific field. Here is the dataset I am working with:

array(3) {
  [0]=>
  array(3) {
    ["company"]=>
    string(9) "Company A"
    ["region"]=>
    string(4) "EMEA"
    ["ctype"]=>
    string(8) "Customer"
  }
  [1]=>
  array(3) {
    ["company"]=>
    string(9) "Company A"
    ["region"]=>
    string(4) "EMEA"
    ["ctype"]=>
    string(8) "Customer"
  }
  [2]=>
  array(3) {
    ["company"]=>
    string(9) "Company C"
    ["region"]=>
    string(4) "EMEA"
    ["ctype"]=>
    string(8) "Customer"
  }
}

If this weren't a multi-dimensional array, I would use in_array() to see whether the $dataset['company'] value already existed. If not, I'd add it to my $unique array, something like this:

$unique = array();

foreach ($dataset as $company) {
  $company_name = $company['company'];

  if ( !in_array($company_name, $unique) ) {
    array_push($unique, $company_name);
  }
}
var_dump($unique);

But I'm unsure how to traverse the multi-dimensional array to get to the ['company'] data and check whether it already exists (it is the only field I need to compare).

I am looking to output exactly the same data as the initial dataset, just with the duplicate removed. Please can you point me in the right direction?

Upvotes: 0

Answers (4)

Spoke44

Reputation: 988

To rebuild the array as a set of unique values per field:

$result = array();
foreach($datas as $data){
  foreach($data as $key => $value){
    $result[$key][$value] = $value;
  }
}

print_r($result);

Output:

Array
(
    [company] => Array
        (
            [Company A] => Company A
            [Company C] => Company C
        )

    [region] => Array
        (
            [EMEA] => EMEA
        )

    [ctype] => Array
        (
            [Customer] => Customer
        )

)

To keep the same structure as the input:

$datas = array(
  array(
    "company"=>"Company A",
    "region"=>"EMEA",
    "ctype"=>"Customer"
  ),
  array(
    "company"=>"Company A",
    "region"=>"EMEA",
    "ctype"=>"Customer"
  ),
  array(
    "company"=>"Company C",
    "region"=>"EMEA",
    "ctype"=>"Customer"
  )
);

function removeDuplicateOnField($datas, $field){
  $seen = array();

  foreach($datas as $key => $data){
    // Keep a row only the first time its $field value is seen.
    if(isset($data[$field]) && !isset($seen[$data[$field]])){
      $seen[$data[$field]] = true;
    }
    else {
      // Duplicate value, or the row has no $field entry: drop it.
      unset($datas[$key]);
    }
  }
  return $datas;
}

$result = removeDuplicateOnField($datas, "company");

print_r($result);

Upvotes: 0

GNewton

Reputation: 90

What you seem to be describing is something PHP can already cater for. Have you come across the array_unique function? It doesn't work recursively, but a user note in the PHP docs provides a function that does:

recursive array unique for multiarrays

function super_unique($array)
{
  $result = array_map("unserialize", array_unique(array_map("serialize", $array)));

  foreach ($result as $key => $value)
  {
    if ( is_array($value) )
    {
      $result[$key] = super_unique($value);
    }
  }

  return $result;
}
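
One thing worth checking before relying on the serialize approach: it compares whole rows, not a single field. The sketch below uses a hypothetical variation of the question's dataset (the second row's region is changed to "APAC", which is not in the original data) to illustrate that two rows sharing a company but differing elsewhere would both be kept.

```php
<?php
// Hypothetical sample: same company throughout, but row 1 has a different region.
$rows = [
    ['company' => 'Company A', 'region' => 'EMEA', 'ctype' => 'Customer'],
    ['company' => 'Company A', 'region' => 'APAC', 'ctype' => 'Customer'],
    ['company' => 'Company A', 'region' => 'EMEA', 'ctype' => 'Customer'],
];

// Whole-row de-duplication: serialize each row to a string, drop repeated
// strings with array_unique, then restore the surviving rows.
$wholeRowUnique = array_map('unserialize', array_unique(array_map('serialize', $rows)));

// Rows 0 and 1 survive; only row 2 (identical to row 0) is removed.
print_r($wholeRowUnique);
```

If the goal is one row per company, as in the question, a field-based check is probably the better fit. Note also that the recursive step in super_unique applies array_unique to the values inside each row, so a row whose fields happen to share a value would lose one of them.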

Let me know if this works, as I am out of the office at the moment.

Upvotes: 0

markcial

Reputation: 9323

Use array_filter with a closure that imports a tracking array by reference via the use keyword.

>>> $data
=> [
       [
           "company" => "Company A",
           "region"  => "EMEA",
           "ctype"   => "Customer"
       ],
       [
           "company" => "Company A",
           "region"  => "EMEA",
           "ctype"   => "Customer"
       ],
       [
           "company" => "Company C",
           "region"  => "EMEA",
           "ctype"   => "Customer"
       ]
   ]
$whitelist = [];

array_filter($data, function ($item) use (&$whitelist) { 
  if (!in_array($item['company'], $whitelist)) { 
    $whitelist[] = $item['company']; 
    return true; 
  }
  return false; 
});

=> [
       0 => [
           "company" => "Company A",
           "region"  => "EMEA",
           "ctype"   => "Customer"
       ],
       2 => [
           "company" => "Company C",
           "region"  => "EMEA",
           "ctype"   => "Customer"
       ]
   ]
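
As the output above shows, array_filter preserves the original keys (0 and 2). A possible variation, sketched here under the assumption that sequential keys are wanted, wraps the call in array_values and tracks seen names as array keys so the check is an isset lookup rather than an in_array scan. The names $seen and $filtered are illustrative only.

```php
<?php
$data = [
    ['company' => 'Company A', 'region' => 'EMEA', 'ctype' => 'Customer'],
    ['company' => 'Company A', 'region' => 'EMEA', 'ctype' => 'Customer'],
    ['company' => 'Company C', 'region' => 'EMEA', 'ctype' => 'Customer'],
];

// Company names seen so far, stored as keys for constant-time lookup.
$seen = [];

// Keep the first row for each company, then reindex the result from 0.
$filtered = array_values(array_filter($data, function ($item) use (&$seen) {
    if (isset($seen[$item['company']])) {
        return false; // already kept a row for this company
    }
    $seen[$item['company']] = true;
    return true;
}));

print_r($filtered);
```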

Upvotes: 1

u_mulder

Reputation: 54831

Track the company names you have already seen in a separate array:

$unique = array();
$companies = array();

foreach ($dataset as $company) {
    $company_name = $company['company'];

    if ( !in_array($company_name, $companies) ) {
        array_push($unique, $company);
        array_push($companies, $company_name);
    }
}

var_dump($unique);
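
For comparison, a shorter sketch of the same idea using array_column (available since PHP 5.5), assuming every row has a 'company' key. Passing null as the column key returns whole rows, and indexing by 'company' means a later row overwrites an earlier one with the same name, so this variant keeps the last occurrence rather than the first.

```php
<?php
$dataset = [
    ['company' => 'Company A', 'region' => 'EMEA', 'ctype' => 'Customer'],
    ['company' => 'Company A', 'region' => 'EMEA', 'ctype' => 'Customer'],
    ['company' => 'Company C', 'region' => 'EMEA', 'ctype' => 'Customer'],
];

// Index rows by company name (duplicates collapse onto one key),
// then discard the name keys to get a plain 0-based list again.
$unique = array_values(array_column($dataset, null, 'company'));

var_dump($unique);
```

With the question's data the duplicate rows are identical, so keeping the last occurrence makes no visible difference. If the first occurrence must win, the explicit loop above is the safer choice.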

Upvotes: 1
