user3353499

Reputation: 137

Find duplicate elements in associative array with non-consecutive indices

I'm just starting out with bash (and coding in general) and had a hard time writing this. Can anyone tell me whether it will work in every case? Maybe someone has a better approach? Many thanks in advance.

array=( [0]=24 [1]=24 [5]=10 [6]=24 [10]=24 [12]=12 )
KEYS=(${!array[@]})

for i in "${!KEYS[@]}"; do

    for j in "${!KEYS[@]}"; do

    if [[ $i -eq $j ]]; then
        continue
    fi

    if [[ ${array[${KEYS[$i]}]} -eq ${array[${KEYS[$j]}]} ]]; then

        duplicate+=( ${KEYS[$j]} )
    fi
    done

done

uniq=($(printf "%s\n" "${duplicate[@]}" | sort -u)); 

echo "${uniq[@]}"

EDIT:

My expected output is an array containing the indices of the duplicated elements.

Upvotes: 4

Views: 1946

Answers (4)

CliffordVienna

Reputation: 8235

This approach has linear complexity (assuming constant time array lookups) instead of the quadratic complexity of the cascaded loops:

array=( [0]=24 [1]=24 [5]=10 [6]=24 [10]=24 [12]=12 )
ref=( )

for i in "${!array[@]}"; do
    ref[array[i]]="${ref[array[i]]}$i "
done

for i in "${!ref[@]}"; do
    [[ "${ref[i]% }" == *" "* ]] && echo "$i @ ${ref[i]% }"
done

The first loop copies the data from array[] to ref[], switching the roles of key and value and concatenating the new values in case of a collision (with blanks between the individual entries). So after the first loop ref[] will have the following content:

ref=( [10]="5 " [12]="12 " [24]="0 1 6 10 " )

The second loop prints the entries from ref[], but skips all entries that do not contain a blank, not counting trailing blanks, thus only printing those that point to two or more entries in array[].
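Since the question asks for an array of indices rather than printed lines, the same inverted-index idea can be sketched as follows. This is only a sketch under some assumptions: it needs bash 4 or later (for `declare -A` and `mapfile`), and the names `dups`, `idx` and `v` are made up for this illustration. It uses an associative array for `ref` so that non-numeric values would also work.

```bash
#!/usr/bin/env bash
# Sparse indexed array from the question.
array=( [0]=24 [1]=24 [5]=10 [6]=24 [10]=24 [12]=12 )

# Inverted index: value -> space-separated list of indices holding that value.
declare -A ref=()
for i in "${!array[@]}"; do
    ref[${array[i]}]+="$i "
done

# Keep only the index lists that have at least two entries.
dups=()
for v in "${!ref[@]}"; do
    read -r -a idx <<< "${ref[$v]}"
    (( ${#idx[@]} > 1 )) && dups+=( "${idx[@]}" )
done

# Associative arrays iterate in unspecified order, so sort numerically.
if (( ${#dups[@]} > 1 )); then
    mapfile -t dups < <(printf '%s\n' "${dups[@]}" | sort -n)
fi

echo "${dups[@]}"
```

For the sample data this should leave the indices 0, 1, 6 and 10 in `dups`.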

Edit: Using a slightly simpler version, as suggested by Adrian in the comments.

Upvotes: 4

anubhava

Reputation: 785246

You can use:

uarr=($(for i in "${!array[@]}";do echo $i ${array[$i]}; done|awk 'a[$2]++{printf "%s ",$1}'))

Which gives:

set | grep uarr
uarr=([0]="1" [1]="6" [2]="10")
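Note that the one-liner reports every occurrence after the first, so index 0 is not in the result. If the first occurrence should be included as well, a variant could look like the sketch below. It assumes bash 4 or later for `mapfile` and assumes the values contain no whitespace; the awk array names `n` and `idx` are made up for this illustration.

```bash
#!/usr/bin/env bash
array=( [0]=24 [1]=24 [5]=10 [6]=24 [10]=24 [12]=12 )

# Emit "index value" pairs, let awk count each value and remember
# the indices it was seen at, then print the indices of every value
# that occurred more than once.
mapfile -t uarr < <(
    for i in "${!array[@]}"; do
        printf '%s %s\n' "$i" "${array[i]}"
    done |
    awk '{ n[$2]++; idx[$2] = idx[$2] $1 "\n" }
         END { for (v in n) if (n[v] > 1) printf "%s", idx[v] }' |
    sort -n
)

declare -p uarr
```

The final `sort -n` is there because awk's `for (v in n)` visits the values in no particular order.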

Upvotes: 1

omoman

Reputation: 859

This is my approach using C-style for loops, which prints every number that is repeated in the array. Note that it relies on consecutive indices starting at 0.

arr=( 1 2 3 4 5 6 1 2 3 4 5 6 0 1 3 )
repeats=()

for (( i=0; i < ${#arr[@]}; ++i )); do
    for (( j=i+1; j < ${#arr[@]}; ++j )); do
        if [ "${arr[i]}" -eq "${arr[j]}" ]; then
            repeats+=( "${arr[i]}" )
            break
        fi
    done
done

# One value per line, so sort -u also handles multi-digit numbers
printf '%s\n' "${repeats[@]}" | sort -u

Upvotes: 1

Josh Jolly

Reputation: 11786

What is your $KEYS array for? You store the indices of $array inside it, but then you only use it to look those indices up again, which is unnecessary. Here is a script which does the same as your original post but without $KEYS:

array=( [0]=24 [1]=24 [5]=10 [6]=24 [10]=24 [12]=12 )

for i in "${!array[@]}"; do
    for j in "${!array[@]}"; do
        [ "$i" -eq "$j" ] && continue
        [ "${array[$i]}" -eq "${array[$j]}" ] && duplicate+=("$j")
    done
done

echo $(printf "%s\n" "${duplicate[@]}" | sort -u)

This prints out the indices of any duplicate values in your original array, all on one line. If you want them on separate lines, just put double quotes around the command substitution:

echo "$(printf "%s\n" "${duplicate[@]}" | sort -u)"
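If the result is needed as an array, as the question's edit says, one possible sketch is to read the sorted output back with `mapfile`. This assumes bash 4 or later. It also swaps in `sort -nu` for `sort -u`, because a plain `sort -u` compares the indices as text and would place 10 before 6.

```bash
#!/usr/bin/env bash
array=( [0]=24 [1]=24 [5]=10 [6]=24 [10]=24 [12]=12 )
duplicate=()

# Same nested loops as above: record j whenever another index holds
# the same value.
for i in "${!array[@]}"; do
    for j in "${!array[@]}"; do
        (( i != j )) && [[ ${array[i]} == "${array[j]}" ]] && duplicate+=( "$j" )
    done
done

# Deduplicate and sort numerically; skip the pipeline when nothing was found
# so that uniq stays empty instead of holding one empty string.
uniq=()
if (( ${#duplicate[@]} )); then
    mapfile -t uniq < <(printf '%s\n' "${duplicate[@]}" | sort -nu)
fi

echo "${uniq[@]}"
```

For the sample data `uniq` should end up holding 0, 1, 6 and 10, in that order.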

Upvotes: 2
