mb567
mb567

Reputation: 681

Seperate lists based on indices

I have 2 lists:

data = [0, 1, 2, 3, 7, 8, 9, 10]
indices = [1, 1, 0, 0, 0, 2, 1, 0]

I want to append the data to a 2-D array given the indices which correspond to the 2-D array. meaning:

new_list = [[]]*len(set(indices))

Where new_list will results as follows:

new_list = [[2,3,7,10],[0,1,9],[8]]

I am using this code:

for i in range(len(set(indices)):
    for j in range(len(indices)):
        if indices[j] == i:
            new_list[i].append(data[j])
        else:
            pass

However, I get this:

new_list = [[2, 3, 7, 10, 0, 1, 9, 8], [2, 3, 7, 10, 0, 1, 9, 8], [2, 3, 7, 10, 0, 1, 9, 8]]

I am not sure what mistake I am doing, any help is appreciated!

Upvotes: 2

Views: 47

Answers (3)

ShadowRanger
ShadowRanger

Reputation: 155418

You're iterating your indices completely for every value, which is wasteful. You're also multiplying a list of lists, which doesn't do what you expect (it makes a list of many references to the same underlying list). You want to pair up indices and values instead (so you do O(n) work, not O(n**2)), which is what zip was made for, and make your list of empty lists safely (a list of several independent lists):

data = [0, 1, 2, 3, 7, 8, 9, 10]
indices = [1, 1, 0, 0, 0, 2, 1, 0]

# Use max because you really care about the biggest index, not the number of unique indices
# A list comprehension producing [] each time produces a *new* list each time
new_list = [[] for _ in range(max(indices)+1)]

# Iterate each datum and matching index in parallel exactly once
for datum, idx in zip(data, indices):
    new_list[idx].append(datum)

Upvotes: 0

blhsing
blhsing

Reputation: 106598

You can use a dict to map the values to their respective indices, and then use a range to output them in order, so that this will only cost O(n) in time complexity:

d = {}
for i, n in zip(indices, data):
    d.setdefault(i, []).append(n)
newlist = [d[i] for i in range(len(d))]

newlist becomes:

[[2, 3, 7, 10], [0, 1, 9], [8]]

Upvotes: 1

rAntonioH
rAntonioH

Reputation: 149

To get at this, i zipped the data with its index:

>>>data = [0, 1, 2, 3, 7, 8, 9, 10]
>>>indices = [1, 1, 0, 0, 0, 2, 1, 0]

>>>buff = sorted(list(zip(indices,data)))
>>>print(buff)
[(0, 2), (0, 3), (0, 7), (0, 10), (1, 0), (1, 1), (1, 9), (2, 8)]

Then I used the set of unique indices as a way to determine if the data gets included in a new list. This is done with nested list comprehensions.

>>>new_list = list(list((b[1] for b in buff if b[0]==x)) for x in set(indices))    
>>>print(new_list)
[[2, 3, 7, 10], [0, 1, 9], [8]]

I hope this helps.

Upvotes: 0

Related Questions