Reputation: 2093
I have a list l of lists [l1, ..., ln]
of equal length
I want to compare the l1[k], l2[k], ..., ln[k]
for all k
in len(l1)
and make another list l0
by picking the element that appears most frequently.
So, if l1 = [1, 2, 3]
, l2 = [1, 4, 4]
and l3 = [0, 2, 4]
, then l = [1, 2, 4]
. If there is a tie, I will look at the lists that make up the tie and choose the one in the list with higher priority. Priority is given a priori, each list is given a priority.
Ex. if you have value 1 in lists l1
and l3
, and value 2 in lists l2
and l4
, and 3 in l5
, and lists are ordered according to priority, say l5>l2>l3>l1>l4
, then I will pick 2, because 2 is in l2
that contains an element with highest occurrence and its priority is higher than l1
and l3
.
How do I do this in python without creating a for loop with lots of if/else conditions?
Upvotes: 1
Views: 343
Reputation: 180391
Just transpose the sublists and get the Counter.most_common
element key from each group:
from collections import Counter
lists = [[1, 2, 3],[1, 4, 4],[0, 2, 4]]
print([Counter(sub).most_common(1)[0][0] for sub in zip(*lists)])
If they are individual lists just zip those:
l1, l2, l3 = [1, 2, 3], [1, 4, 4], [0, 2, 4]
print([Counter(sub).most_common(1)[0][0] for sub in zip(l1,l2,l3)])
Not sure how taking the first element from the grouping if there is a tie makes sense as it may not be the one that tied but that is trivial to implement, just get the two most_common and check if their counts are equal:
def most_cm(lists):
for sub in zip(*lists):
# get two most frequent
comm = Counter(sub).most_common(2)
# if their values are equal just return the ele from l1
yield comm[0][0] if len(comm) == 1 or comm[0][1] != comm[1][1] else sub[0]
We also need if len(comm) == 1
in case all the elements are the same or we will get an IndexError.
If you are talking about taking the element that comes from the earlier list in the event of a tie i.e l2 comes before l5 then that is just the same as taking any of the elements that tie.
For a decent number of sublists:
In [61]: lis = [[randint(1,10000) for _ in range(10)] for _ in range(100000)]
In [62]: list(most_cm(lis))
Out[62]: [5856, 9104, 1245, 4304, 829, 8214, 9496, 9182, 8233, 7482]
In [63]: timeit list(most_cm(lis))
1 loops, best of 3: 249 ms per loop
Upvotes: 3
Reputation: 341
You can use the Counter module from the collections library. Using the map
function will reduce your list looping. You will need an if/else statement for the case that there is no most frequent value but only for that:
import collections
list0 = []
list_length = len(your_lists[0])
for k in list_length:
k_vals = map(lambda x: x[k], your_lists) #collect all values at k pos
counts = collections.Counter(k_vals).most_common() #tuples (val,ct) sorted by count
if counts[0][1] > counts[1][1]: #is there a most common value
list0.append(counts[0][0]) #takes the value with highest count
else:
list0.append(k_vals[0]) #takes element from first list
list0
is the answer you are looking for. I just hate using l
because it's easy to confuse with the number 1
Edit (based on comments):
Incorporating your comments, instead of the if/else statement, use a while loop:
i = list_length
while counts[0][1] == counts[1][1]:
counts = collections.Counter(k_vals[:i]).most_common() #ignore the lowest priority element
i -= 1 #go back farther if there's still a tie
list0.append(counts[0][0]) #takes the value with highest count once there's no tie
So the whole thing is now:
import collections
list0 = []
list_length = len(your_lists[0])
for k in list_length:
k_vals = map(lambda x: x[k], your_lists) #collect all values at k pos
counts = collections.Counter(k_vals).most_common() #tuples (val,ct) sorted by count
i = list_length
while counts[0][1] == counts[1][1]: #in case of a tie
counts = collections.Counter(k_vals[:i]).most_common() #ignore the lowest priority element
i -= 1 #go back farther if there's still a tie
list0.append(counts[0][0]) #takes the value with highest count
You throw in one more tiny loop but on the bright side there's no if/else statements at all!
Upvotes: 4
Reputation: 77404
If you are OK taking any one of a set of elements that are tied as most common, and you can guarantee that you won't hit an empty list within your list of lists, then here is a way using Counter
(so, from collections import Counter
):
l = [ [1, 0, 2, 3, 4, 7, 8],
[2, 0, 2, 1, 0, 7, 1],
[2, 0, 1, 4, 0, 1, 8]]
res = []
for k in range(len(l[0])):
res.append(Counter(lst[k] for lst in l).most_common()[0][0])
Doing this in IPython and printing the result:
In [86]: res
Out[86]: [2, 0, 2, 1, 0, 7, 8]
Upvotes: 2
Reputation: 26901
That's what you're looking for:
from collections import Counter
from operator import itemgetter
l0 = [max(Counter(li).items(), key=itemgetter(1))[0] for li in zip(*l)]
Upvotes: 2
Reputation: 886
Try this:
l1 = [1,2,3]
l2 = [1,4,4]
l3 = [0,2,4]
lists = [l1, l2, l3]
print [max(set(x), key=x.count) for x in zip(*lists)]
Upvotes: 1
Reputation: 304
Solution is:
a = [1, 2, 3]
b = [1, 4, 4]
c = [0, 2, 4]
print [max(set(element), key=element.count) for element in zip(a, b, c)]
Upvotes: 2