Return an array where the new elements are the sum of sum of distances between indices with same value in a given array

For each index in a given array I want to find the sum of sum of the distances between that index and other indices which have the same value in a given array.

So for [1,2,1,1,2,3], we would have [5,3,3,4,3,0] because

for index 0) |2-0|+|3-0| = 5, for index 1) |4-1| = 3 ,... , for index 3) |3-0|+|3-2| = 4,... etc..

Here is my attempt which gets TLE for arrays of length near 10000.

How can I make my solution more efficient?

def getDistanceMetrics(arr):
    if len(arr) > 10**5 or len(arr) < 1:
        return -1
    for z in range(len(arr)):
        if arr[z] > 10**9 or arr[z] < 1:
            return -1

    # Write your code here
    hm = collections.defaultdict(set)

    for i,value in enumerate(arr):
        hm[value].add(i)

    ans = []

    for j,val in enumerate(arr):
        tmp = 0
        for element in hm[val]:
            tmp += (j-element) if (j-element) >= 0 else element - j       
        ans.append(tmp)

    return ans

I would also appreciate knowing as to why and where my answer is slow, because to me it looks like O(n), which seems as good as you could do here.

Upvotes: 0

Answers (3)

Arman Babaei

Reputation: 184

for every element in array, you iterate over its hm once.

thus, in the worst case if we have lots of numbers with the same value (say n), for every element, you will iterate over the whole array, which is O(n^2)

Upvotes: 1

Arman Babaei

Reputation: 184

instead of iterating over arr, you can iterate over hm. say we want to calculate the distance where hm of number one is {a1, ..., an} and we want to calculate the answer for a_k then for all i < k we need to add a_k - a_i since a_k is greater. let's save a partial sum called p[i] (= a1 + ... a_i), you can add all the distances at once (that is adding a_k * (k) - p[k]) you can update p[k] as iterating through the set. after this is done we will maintain a partial sum starting from the last element that is p[k] = a_n + a_(n-1) + ... + a_k again we iterate, but this time from the last element to the first element and add p[k] - a_k * (n-k) to the answer for a_k

This is will be done in linear time over all.

# Write your code here
hm = collections.defaultdict(set)

for i,value in enumerate(arr):
    hm[value].add(i)

ans = [0] * len(arr)

for key in hm:
    p = 0
    for i in range(len(hm[key])):
        p += hm[key][i]
        ans[hm[key][i]] += hm[key][i] * (i+1) - p
    p = 0
    for i in range(len(hm[key]) - 1, -1, -1):
        p += hm[key][i]
        ans[hm[key][i]] += p - hm[key][i] * (len(hm[key]) - i)
return ans

Upvotes: 1

Osman Mamun

Reputation: 2882

In [1]: a = [1,2,1,1,2,3]

In [2]: b = [sum(abs(n-m) for m, j in enumerate(a) if i==j) for n, i in enumerat
   ...: e(a)]

In [3]: b
Out[3]: [5, 3, 3, 4, 3, 0]

Upvotes: 1

Return an array where the new elements are the sum of sum of distances between indices with same value in a given array

Answers (3)

Related Questions