schaitanya

Reputation: 481

Get the keys for duplicate values in an array

I have the following array:

$myarray = Array("2011-06-21", "2011-06-22", "2011-06-22", "2011-06-23", "2011-06-23", "2011-06-24", "2011-06-24", "2011-06-25", "2011-06-25", "2011-06-26");
print_r($myarray);

Result:

Array (
    [0] => 2011-06-21
    [1] => 2011-06-22
    [2] => 2011-06-22
    [3] => 2011-06-23
    [4] => 2011-06-23
    [5] => 2011-06-24
    [6] => 2011-06-24
    [7] => 2011-06-25
    [8] => 2011-06-25
    [9] => 2011-06-26
)
  1. How can I display the keys that have duplicate values? The function should NOT return ([0],[9]), since their values have no duplicates.
  2. How can I find the keys for a given value? E.g. for "2011-06-25" it should return [7],[8].

Upvotes: 10

Views: 40893

Answers (11)

mickmackusa

Reputation: 48100

Use a classic foreach() to unconditionally group the data into a new array, with each value as a key and the original keys pushed as indexed elements of that value's subarray.

Then, to filter out the dates that occur only once, pass the native array_key_last() function (PHP 7.3+) as the array_filter() callback; it removes the date entries where the highest index in the subarray is 0.

Code:

$result = [];
foreach ($myarray as $k => $v) {
    $result[$v][] = $k;
}
var_export(array_filter($result, 'array_key_last'));

Output:

array (
  '2011-06-22' => 
  array (
    0 => 1,
    1 => 2,
  ),
  '2011-06-23' => 
  array (
    0 => 3,
    1 => 4,
  ),
  '2011-06-24' => 
  array (
    0 => 5,
    1 => 6,
  ),
  '2011-06-25' => 
  array (
    0 => 7,
    1 => 8,
  ),
)

Like many other answers on this page, this technique will not be suitable for any data types that cannot become keys (such as iterable type data) or that lose data integrity when assigned as a key (such as float type data).
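One possible workaround for that limitation, offered only as a sketch: build each group key from serialize($value), which is always a string. The helper name groupKeysBySerializedValue is hypothetical and not part of the answer above. It compares values strictly by type and content, so the float 1.5 and the string "1.5" land in different groups, and it assumes every value is serializable.

```php
<?php
// Hypothetical helper (not from the answer above): group keys by a
// serialized form of each value so floats and arrays can be compared.
function groupKeysBySerializedValue(array $input)
{
    $groups = array();
    foreach ($input as $key => $value) {
        // serialize() returns a string, which is always a safe array key.
        $groups[serialize($value)][] = $key;
    }
    // Keep only the groups holding more than one key.
    $duplicates = array_filter($groups, function ($keys) {
        return count($keys) > 1;
    });
    // Drop the serialized strings; return only the lists of keys.
    return array_values($duplicates);
}

$duplicateKeys = groupKeysBySerializedValue(
    array(1.5, 1.2, 1.5, array(1, 2), array(1, 2))
);
var_export($duplicateKeys);
```

Here the keys 0 and 2 (both 1.5) form one group and the keys 3 and 4 (both array(1, 2)) form another.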

Upvotes: 0

Francois Deschenes

Reputation: 24989

I'll answer the second question first. You want to use array_keys with the "search_value" specified.

$keys = array_keys($array, "2011-06-29");
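As a small illustration, here is the same call applied to the array from the question. Passing true as the third argument is an optional choice made for this sketch: it makes array_keys() compare with === instead of ==.

```php
<?php
$myarray = array(
    "2011-06-21", "2011-06-22", "2011-06-22", "2011-06-23", "2011-06-23",
    "2011-06-24", "2011-06-24", "2011-06-25", "2011-06-25", "2011-06-26",
);

// The third argument (true) requests a strict (===) comparison.
$keys = array_keys($myarray, "2011-06-25", true);

print_r($keys); // lists the keys 7 and 8
```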

In the example below, $duplicates will contain the duplicated values, while $result will contain the values that occur only once. To get the keys, simply use array_keys.

<?php

$array = array(
  'a',
  'a',
  'b',
  'c',
  'd'
);

// Unique values
$unique = array_unique($array);

// Duplicates
$duplicates = array_diff_assoc($array, $unique);

// Values that occur exactly once
$result = array_diff($unique, $duplicates);

// Get the keys of the values that occur exactly once
$unique_keys = array_keys($result);

// Get duplicate keys
$duplicate_keys = array_keys(array_intersect($array, $duplicates));

Result:

// $duplicates
Array
(
    [1] => a
)

// $result
Array
(
    [2] => b
    [3] => c
    [4] => d
)

// $unique_keys
Array
(
    [0] => 2
    [1] => 3
    [2] => 4
)

// $duplicate_keys
Array
(
    [0] => 0
    [1] => 1
)

Upvotes: 20

Jeffrey

Reputation: 23

Here's another solution for question 1 that I have used to get duplicates in a one-dimensional array:

Taking for example this array:

$arr = ['a','b','c','d','a','a','b','c','c'];

I like to avoid loops if possible, so I used array_map to solve this problem:

/**
 * Find multiple occurrences in a one-dimensional array
 *
 * @param array $list List to find duplicates for.
 *
 * @return array
 */
function fetchDuplicates(array $list)
{
    return array_filter(
        array_map(
            function ($el) use ($list) {
                $keysOccur = array_keys($list, $el);
                if (count($keysOccur) > 1) {
                    return $keysOccur;
                }

                return null;
            },
            array_unique($list)
        )
    );

}//end fetchDuplicates()

array_map() iterates over every unique element of $list and calls array_keys($list, $el) on the original $list to collect the keys at which that element occurs. If the element occurs more than once, the callback returns the array of its keys, which replaces $el in the mapped result; otherwise it returns null. Finally, array_filter() removes the empty (null) entries, resulting in the following array:

[
  [
    0,
    4,
    5,
  ],
  [
    1,
    6,
  ],
  [
    2,
    7,
    8,
  ],
]

Upvotes: 0

Another example:

$array = array(
  'a',
  'a',
  'b',
  'b',
  'b'
);
echo '<br/>Array: ';
print_r($array);
// Unique values
$unique = array_unique($array);
echo '<br/>Unique Values: ';
print_r($unique);
// Duplicates
$duplicates = array_diff_assoc($array, $unique);
echo '<br/>Duplicates: ';
print_r($duplicates);
// Get all values that have duplicates
$duplicate_values = array_values(array_intersect($array, $duplicates));
echo '<br/>duplicate values: ';
print_r($duplicate_values);

Output:

Array: Array ( [0] => a [1] => a [2] => b [3] => b [4] => b ) 
Unique Values: Array ( [0] => a [2] => b ) 
Duplicates: Array ( [1] => a [3] => b [4] => b ) 
duplicate values: Array ( [0] => a [1] => a [2] => b [3] => b [4] => b ) 

Upvotes: 0

Trey

Reputation: 5520

function get_keys_for_duplicate_values($my_arr, $clean = false) {
    if ($clean) {
        return array_unique($my_arr);
    }

    $dups = $new_arr = array();
    foreach ($my_arr as $key => $val) {
      if (!isset($new_arr[$val])) {
         $new_arr[$val] = $key;
      } else {
        if (isset($dups[$val])) {
           $dups[$val][] = $key;
        } else {
           $dups[$val] = array($key);
           // Comment out the previous line, and uncomment the following line to
           // include the initial key in the dups array.
           // $dups[$val] = array($new_arr[$val], $key);
        }
      }
    }
    return $dups;
}

Obviously the function name is a bit long ;)

The function returns $dups, a multidimensional array keyed by each duplicated value and containing every key at which that value was repeated (the key of the first occurrence is left out unless you swap in the commented line). If you pass true as the second argument, it instead returns the original array without the duplicate values.

Alternatively, you could pass the original array by reference, so that the function adjusts it in place while returning the duplicates array.
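A minimal sketch of that by-reference idea follows. The function name extract_duplicate_keys is hypothetical and this is only one way to read the suggestion: repeated entries are removed from the caller's array, and their keys are returned grouped by value. Like the function above, it assumes the values are strings or integers.

```php
<?php
// Hypothetical by-reference variant: strips repeated values from
// $my_arr in place and returns the keys that were removed.
function extract_duplicate_keys(array &$my_arr)
{
    $first_seen = array(); // value => key of its first occurrence
    $dups = array();       // value => keys of its later occurrences

    foreach ($my_arr as $key => $val) {
        if (!isset($first_seen[$val])) {
            $first_seen[$val] = $key;
            continue;
        }
        $dups[$val][] = $key;
        // A by-value foreach iterates over a copy, so unsetting is safe here.
        unset($my_arr[$key]);
    }

    return $dups;
}

$data = array('a', 'b', 'a', 'c', 'a');
$dups = extract_duplicate_keys($data);

var_export($dups); // keys 2 and 4, grouped under 'a'
var_export($data); // keys 0, 1 and 3 remain
```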

Upvotes: 23

Sandeep P.

Reputation: 19

function getDuplicateValueKeys($my_arr, $clean = false) 
{
    if ($clean) {
        return array_unique($my_arr);
    }
    $dups = array();
    $new_arr = array();
    $dup_vals = array();

    foreach ($my_arr as $key => $value) {
        if (!isset($new_arr[$value])) {
            $new_arr[$value] = $key;
        } else {
            array_push($dup_vals,$value);
        }
    }

    foreach ($my_arr as $key => $value) {
        if (in_array($value, $dup_vals)) {
            if (!isset($dups[$value])) {
                $dups[$value]=array($key);
            }else{
                array_push($dups[$value],$key);
            }
        }
    }

    return $dups;
}

Upvotes: 1

Divey

Reputation: 1719

I had a problem similar to question #1 from the OP. All I needed were the keys for duplicate values in my original array. Here's what I came up with:

$array = array('yellow', 'red', 'green', 'brown', 'red', 'brown');

$counts = array_count_values($array);
$filtered = array_filter($counts, function($value) {
    return $value != 1;
});
$result = array_keys(array_intersect($array, array_keys($filtered)));

And for the output:

print_r($result);
Array
(
    [0] => 1
    [1] => 3
    [2] => 4
    [3] => 5
)

Upvotes: 2

darkylmnx

Reputation: 2091

Here is some code:

$your_array = array(
    0 => '2011-06-21', 1 => '2011-06-22', 2 => '2011-06-22', 3 => '2011-06-23',
    4 => '2011-06-23', 5 => '2011-06-24', 6 => '2011-06-24', 7 => '2011-06-25',
    8 => '2011-06-25', 9 => '2011-06-26', 10 => '2011-06-26', 11 => '2011-06-27',
    12 => '2011-06-27', 13 => '2011-06-28', 14 => '2011-06-29', 15 => '2011-06-29',
    16 => '2011-06-30', 17 => '2011-06-30', 18 => '2011-07-01', 19 => '2011-07-01',
    20 => '2011-07-02', 21 => '2011-07-02', 22 => '2011-07-03', 23 => '2011-07-03',
    24 => '2011-07-04', 25 => '2011-07-04', 26 => '2011-07-05', 27 => '2011-07-05',
    28 => '2011-07-06', 29 => '2011-07-06', 30 => '2011-07-07', 31 => '2011-07-07',
);

$keys_of_duplicated = array();
$array_keys = array();

foreach($your_array as $key => $value) {
    //- get all keys that hold the current value
    $array_keys = array_keys($your_array, $value);

    //- if more than one key was collected, we register them
    if(count($array_keys) > 1) {
        //- for each key that has the same value, check whether it's already registered
        foreach($array_keys as $key_registered) {
            //- if not registered, we register it
            if(!in_array($key_registered,  $keys_of_duplicated)) {
                 $keys_of_duplicated[] = $key_registered;
            }
        }
    }
}

var_dump($keys_of_duplicated);

$keys_of_duplicated is now the array that contains the keys of the duplicated values ;)

Upvotes: 1

Dereleased

Reputation: 10087

$array = array(
    '2011-06-21','2011-06-22','2011-06-22','2011-06-23',
    '2011-06-23','2011-06-24','2011-06-24','2011-06-25',
    '2011-06-25','2011-06-26','2011-06-26','2011-06-27',
    '2011-06-27','2011-06-28','2011-06-29','2011-06-29',
    '2011-06-30','2011-06-30','2011-07-01','2011-07-01',
    '2011-07-02','2011-07-02','2011-07-03','2011-07-03',
    '2011-07-04','2011-07-04','2011-07-05','2011-07-05',
    '2011-07-06','2011-07-06','2011-07-07','2011-07-07',
);

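function getDupKeys(array $array, $return_first = true, $return_by_key = true) {
    $seen = array();        // value => key of its first occurrence
    $dups = array();        // collected duplicate keys
    $first_added = array(); // values whose first key is already in $dups

    foreach ($array as $k => $v) {
        $vk = $return_by_key ? $v : 0;
        if (!array_key_exists($v, $seen)) {
            $seen[$v] = $k;
            continue;
        }
        // Add the first occurrence's key only once per value, even in flat mode.
        if ($return_first && !isset($first_added[$v])) {
            $dups[$vk][] = $seen[$v];
            $first_added[$v] = true;
        }
        $dups[$vk][] = $k;
    }
    if ($return_by_key) {
        return $dups;
    }
    // In flat mode, return an empty array when there are no duplicates.
    return isset($dups[0]) ? $dups[0] : array();
}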

If both optional parameters are true, it returns an array of arrays; the key of each child array is a value that was not unique, and its elements are all the keys that had that value.

If the first optional parameter is false, then only keys after the first instance of a non-unique value are returned (i.e., for the given array, each value returns only one key: that of its second occurrence rather than its first).

If the second optional parameter is false, then instead of returning an array of arrays, it returns a flat array containing all duplicate keys (exactly which keys it returns is dictated by the first optional parameter).

Here's the print_r output, because it's prettier:

print_r(getDupKeys($array));

Array
(
    [2011-06-22] => Array
        (
            [0] => 1
            [1] => 2
        )

    [2011-06-23] => Array
        (
            [0] => 3
            [1] => 4
        )

    [2011-06-24] => Array
        (
            [0] => 5
            [1] => 6
        )

    [2011-06-25] => Array
        (
            [0] => 7
            [1] => 8
        )

    [2011-06-26] => Array
        (
            [0] => 9
            [1] => 10
        )

    [2011-06-27] => Array
        (
            [0] => 11
            [1] => 12
        )

    [2011-06-29] => Array
        (
            [0] => 14
            [1] => 15
        )

    [2011-06-30] => Array
        (
            [0] => 16
            [1] => 17
        )

    [2011-07-01] => Array
        (
            [0] => 18
            [1] => 19
        )

    [2011-07-02] => Array
        (
            [0] => 20
            [1] => 21
        )

    [2011-07-03] => Array
        (
            [0] => 22
            [1] => 23
        )

    [2011-07-04] => Array
        (
            [0] => 24
            [1] => 25
        )

    [2011-07-05] => Array
        (
            [0] => 26
            [1] => 27
        )

    [2011-07-06] => Array
        (
            [0] => 28
            [1] => 29
        )

    [2011-07-07] => Array
        (
            [0] => 30
            [1] => 31
        )

)

print_r(getDupKeys($array, false));

Array
(
    [2011-06-22] => Array
        (
            [0] => 2
        )

    [2011-06-23] => Array
        (
            [0] => 4
        )

    [2011-06-24] => Array
        (
            [0] => 6
        )

    [2011-06-25] => Array
        (
            [0] => 8
        )

    [2011-06-26] => Array
        (
            [0] => 10
        )

    [2011-06-27] => Array
        (
            [0] => 12
        )

    [2011-06-29] => Array
        (
            [0] => 15
        )

    [2011-06-30] => Array
        (
            [0] => 17
        )

    [2011-07-01] => Array
        (
            [0] => 19
        )

    [2011-07-02] => Array
        (
            [0] => 21
        )

    [2011-07-03] => Array
        (
            [0] => 23
        )

    [2011-07-04] => Array
        (
            [0] => 25
        )

    [2011-07-05] => Array
        (
            [0] => 27
        )

    [2011-07-06] => Array
        (
            [0] => 29
        )

    [2011-07-07] => Array
        (
            [0] => 31
        )

)

print_r(getDupKeys($array, true, false));

Array
(
    [0] => 1
    [1] => 2
    [2] => 3
    [3] => 4
    [4] => 5
    [5] => 6
    [6] => 7
    [7] => 8
    [8] => 9
    [9] => 10
    [10] => 11
    [11] => 12
    [12] => 14
    [13] => 15
    [14] => 16
    [15] => 17
    [16] => 18
    [17] => 19
    [18] => 20
    [19] => 21
    [20] => 22
    [21] => 23
    [22] => 24
    [23] => 25
    [24] => 26
    [25] => 27
    [26] => 28
    [27] => 29
    [28] => 30
    [29] => 31
)

print_r(getDupKeys($array, false, false));

Array
(
    [0] => 2
    [1] => 4
    [2] => 6
    [3] => 8
    [4] => 10
    [5] => 12
    [6] => 15
    [7] => 17
    [8] => 19
    [9] => 21
    [10] => 23
    [11] => 25
    [12] => 27
    [13] => 29
    [14] => 31
)

Upvotes: 0

hakre

Reputation: 198237

I really like Francois's answer; here is something I came up with that preserves keys. I'll answer the first question first:

$array = array('2011-06-21', '2011-06-22', '2011-06-22');
/**
 * flip an array like array_flip but
 * preserving multiple keys per an array value
 * 
 * @param array $a
 * @return array
 */
function array_flip_multiple(array $a) {
    $result = array();
    foreach ($a as $k => $v) {
        $result[$v][] = $k;
    }
    return $result;
}

$hash = array_flip_multiple($array);

// filter $hash based on your specs (2 or more)
$hash = array_filter($hash, function($items) {return count($items) > 1;});

// get all remaining keys
$keys = array_reduce($hash, 'array_merge', array());

var_dump($array, $hash, $keys);

output is:

# original array
array(3) {
  [0]=>
  string(10) "2011-06-21"
  [1]=>
  string(10) "2011-06-22"
  [2]=>
  string(10) "2011-06-22"
}

# hash (filtered)
array(1) {
  ["2011-06-22"]=>
  array(2) {
    [0]=>
    int(1)
    [1]=>
    int(2)
  }
}

# the keys
array(2) {
  [0]=>
  int(1)
  [1]=>
  int(2)
}

So now the second question:

Just use the $hash to obtain the keys for the value:

var_dump($hash['2011-06-22']); returns the keys.

Benefit is, if you need to check multiple values, data is already stored in the hash and available for use.
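To illustrate that benefit, here is a short sketch that builds the unfiltered hash once and then looks up several values. The lookup list and the empty-array fallback for values that are not present are assumptions added for this example.

```php
<?php
$array = array('2011-06-21', '2011-06-22', '2011-06-22');

// Build the value => keys hash once.
$hash = array();
foreach ($array as $k => $v) {
    $hash[$v][] = $k;
}

// Look up several values; a value that is absent yields an empty list.
$lookups = array();
foreach (array('2011-06-22', '2011-06-21', '2011-06-30') as $date) {
    $lookups[$date] = isset($hash[$date]) ? $hash[$date] : array();
}

var_export($lookups);
```

Each lookup is a plain array access, so the original array is not scanned again for every value.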

Upvotes: 1

Dogbert

Reputation: 222448

$array = array(0 => "1", 1 => "1", 2 => "2", 3 => "3");
$count = array();
foreach($array as $key => $value) {
  if(!isset($count[$value])) {
    $count[$value] = 0;
  }
  $count[$value]++;
}


$result = array_filter($count, function($value) {
  return $value > 1;
});

$result = array_keys($result);

var_dump($result);

Output

array(1) {
  [0]=>
  int(1)
}

Upvotes: 0
