Reputation: 2135

Use list comprehension to replace duplicates based on condition using other list

It's probably easier to illustrate it with an example.

A = [1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3]
B = [0.1, 0.2, 0.3, 0.4, 0.01, 0.02, 0.03, 0.04, 0.001, 0.001, 0.0003, 0.0003]

I have the two lists above.

Each element in A is duplicated a few times. The multiplicity of each element can be different (and they don't have to be ordered as here).

B contains the same number of elements as A. For each distinct value in A, I want to find the smallest of the corresponding values in B and assign it, in a new list C, to every position where that value of A occurs. So for the first 4 elements it would be 0.1, for the next 4 elements it is 0.01, and for the last 4 elements it is 0.0003 (which happens to appear twice in B).

I would like to obtain the following list.

C = [0.1, 0.1, 0.1, 0.1, 0.01, 0.01, 0.01, 0.01, 0.0003, 0.0003, 0.0003, 0.0003]

As the code I'm using already extensively uses list comprehension, I would like to use the same approach.

Is this possible?

Is this advisable?

I am familiar with simple conditions such as

C = [a for a, b in zip(A, B) if b < 0.0005]

to give

C = [3, 3]

but don't really have a clear idea on how to proceed here.

Upvotes: 2

Answers (4)

Selcuk

Reputation: 59184

You can use the following method:

>>> A = [1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3]
>>> B = [0.1, 0.2, 0.3, 0.4, 0.01, 0.02, 0.03, 0.04, 0.001, 0.001, 0.0003, 0.0003]
>>> AB = zip(A, B)
>>> AB_sorted = sorted(AB, key=lambda i: (i[0], -i[1]))
>>> AB_dict = dict(AB_sorted)
>>> C = [AB_dict[i] for i in A]
>>> C
[0.1, 0.1, 0.1, 0.1, 0.01, 0.01, 0.01, 0.01, 0.0003, 0.0003, 0.0003, 0.0003]

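This works because, when a list of tuples is converted to a dict, a later duplicate key overwrites the earlier one. Sorting by (i[0], -i[1]) puts the smallest B value last within each A value, so that is the value the dict keeps.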
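Since the question mentions that A does not have to be ordered, here is a small sketch of the same sort-then-dict idea on unordered input. The data below is made up for illustration and is not from the question.

```python
# Hypothetical data: the values in A are deliberately not grouped together.
A = [2, 1, 2, 1, 3]
B = [0.5, 0.3, 0.2, 0.9, 0.7]

# Sort by A ascending, then by B descending, so that within each A value
# the smallest B ends up last.
pairs = sorted(zip(A, B), key=lambda p: (p[0], -p[1]))

# dict() keeps the last value seen for each key, i.e. the smallest B.
smallest = dict(pairs)

# Look up the per-key minimum for every element of A, in the original order.
C = [smallest[a] for a in A]
print(C)  # [0.2, 0.3, 0.2, 0.3, 0.7]
```

The final comprehension iterates over the original A, so the order of C follows A even though the pairs were sorted.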

Upvotes: 2

Augusto Sisa

Reputation: 570

Yes, it is possible in one line.

[min(y for x, y in zip(A, B) if z == x) for z in A]

This produces the following list:

[0.1, 0.1, 0.1, 0.1, 0.01, 0.01, 0.01, 0.01, 0.0003, 0.0003, 0.0003, 0.0003]
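One thing to keep in mind is that the one-liner rescans the whole of zip(A, B) for every element of A, so the work grows quadratically with the list length. If that matters, a possible alternative (a sketch, not part of the answer above; group_minima is a hypothetical helper name) is to compute each minimum once and keep the comprehension for the lookup only:

```python
def group_minima(keys, values):
    """Return a dict mapping each key to the smallest value paired with it."""
    minima = {}
    for k, v in zip(keys, values):
        # Keep v if this is the first time we see k, or if v is a new minimum.
        if k not in minima or v < minima[k]:
            minima[k] = v
    return minima


A = [1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3]
B = [0.1, 0.2, 0.3, 0.4, 0.01, 0.02, 0.03, 0.04, 0.001, 0.001, 0.0003, 0.0003]

minima = group_minima(A, B)   # one pass over the data
C = [minima[a] for a in A]    # one dict lookup per element
print(C)
```

This makes a single pass to build the dict and a single pass to build C, at the cost of one extra helper function.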

Upvotes: 0

Jeril

Reputation: 8521

If you don't mind using an additional Python library, pandas, you can do the following:

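import pandas as pd

A = [1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3]
B = [0.1, 0.2, 0.3, 0.4, 0.01, 0.02, 0.03, 0.04, 0.001, 0.001, 0.0003, 0.0003]
df = pd.DataFrame({'A': A, 'B': B})
# Smallest B for each distinct value of A, as a plain dict
# (Series.iteritems() was removed in pandas 2.0, so to_dict() is used instead)
req_dict = df.groupby('A')['B'].min().to_dict()
# Look up every element of A in that dict and print the result as a list
print(df['A'].map(req_dict).tolist())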

Output:

[0.1, 0.1, 0.1, 0.1, 0.01, 0.01, 0.01, 0.01, 0.0003, 0.0003, 0.0003, 0.0003]

Upvotes: 1

Tacratis

Reputation: 1055

If you want a one-liner, this works, assuming my comment is the correct interpretation:

[min([B[j] for j in [ind for ind,x in enumerate(A) if x==y]]) for y in A]

To break it down: the outer comprehension goes over every value in A, storing it in y. For each y, the innermost comprehension goes over the indices and values of A and keeps the indices where the value equals y.
The middle comprehension then uses this list of indices (as j) to get the matching elements of B, and min returns the smallest element of that list.

enumerate returns the indices and the values, into ind and x, respectively.
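To make the nesting easier to follow, here is the same logic unpacked into named steps for a single value of y. The names indices, candidates and smallest are only for illustration and do not appear in the one-liner.

```python
A = [1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3]
B = [0.1, 0.2, 0.3, 0.4, 0.01, 0.02, 0.03, 0.04, 0.001, 0.001, 0.0003, 0.0003]

y = 2  # one value from A; the one-liner repeats the steps below for every y in A

# Innermost comprehension: positions in A whose value equals y.
indices = [ind for ind, x in enumerate(A) if x == y]
print(indices)     # [4, 5, 6, 7]

# Middle comprehension: the elements of B at those positions.
candidates = [B[j] for j in indices]
print(candidates)  # [0.01, 0.02, 0.03, 0.04]

# min picks the value that ends up in C for this y.
smallest = min(candidates)
print(smallest)    # 0.01
```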

Upvotes: 1
