Reputation: 97

Find the maximum item from a list of unevenly sized lists

I have a nested list and want to find the max value of items in index [1].

Here is my list:

myList = [['apple',2],
          ['banana',4],
          ['orange'],
          ['strawberry',10],
          ['mango']]

I used this function:

 print(max(myList, key=lambda x: x[1]))

But it raises an error because some of the inner lists don't have an item at index [1].

Since my original dataset is really large, it is important for me to use an efficient function that checks whether index [1] exists in each inner list and then finds the max.

Is there an efficient way to do that, such as a built-in function? I'd prefer not to use a for loop if possible.

Upvotes: 2

Answers (4)

hygull

Reputation: 8740

Above answers are appreciated.

@Mahsa, you can also get the maximum fruit count from the list using a list comprehension, map(), filter() and reduce() as follows:

It's nice to use map(), filter(), reduce() and list comprehension in Pythonic programs.

Note: map(), filter() and reduce() called with lambda functions can be slower than an equivalent loop or comprehension when the list is huge.

» Using map() with a lambda function:

my_list = [['apple', 2], ['banana', 4], ['orange'], ['strawberry', 10], ['mango']]

# Using map() function (1st way)
max_count = max(list(map(lambda item: item[1] if len(item) > 1 else -1, my_list)))
print(max_count) # 10
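As a possible variation on the map() version above, a generator expression can skip the short lists instead of substituting -1, so the placeholder can never collide with a real negative count. This is only a sketch: the `default=None` argument (available since Python 3.4) is an assumption about what should happen when no inner list has an index [1].

```python
my_list = [['apple', 2], ['banana', 4], ['orange'], ['strawberry', 10], ['mango']]

# Keep only the values at index 1; rows without one are skipped entirely.
values = (item[1] for item in my_list if len(item) > 1)

# default=None is returned when no row has an index 1 (assumed behaviour).
max_count = max(values, default=None)
print(max_count)  # 10
```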

» Using filter(), reduce():

# Using filter() and reduce()
from functools import reduce

my_list = [['apple', 2], ['banana', 67], ['orange'], ['strawberry', 10], ['mango']]

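def get_max(item1, item2):
    # On the first call item1 is a [name, count] list; afterwards it is the
    # running maximum (a number) returned by the previous call.
    current = item1[1] if isinstance(item1, list) else item1
    return current if current > item2[1] else item2[1]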

filtered_items = list(filter(lambda item: len(item) > 1, my_list))
max_count2 = reduce(get_max, filtered_items)
print(max_count2) # 67
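One way to avoid the type check inside the reducing function is to extract the counts first and give reduce() a starting value. The sketch below assumes the counts are numeric; `float('-inf')` is an illustrative starting value, not part of the code above, and it is what reduce() returns if no inner list has a count.

```python
from functools import reduce

my_list = [['apple', 2], ['banana', 67], ['orange'], ['strawberry', 10], ['mango']]

# Extract the counts first so the reducing function only ever sees numbers.
counts = [item[1] for item in my_list if len(item) > 1]

# float('-inf') is an assumed initial value; any real count is larger than it.
max_count3 = reduce(lambda a, b: a if a > b else b, counts, float('-inf'))
print(max_count3)  # 67
```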

Upvotes: 1

martineau

Reputation: 123393

If you want the maximum value:

import sys
MIN_INT = -sys.maxsize-1  # Sentinel: most negative Py_ssize_t value (Python ints themselves are unbounded).

myList = [['apple', 2],
          ['banana', 4],
          ['orange'],
          ['strawberry', 10],
          ['mango']]

maximum_value = max(myList, key=lambda item: item[1] if len(item) > 1
                                        else MIN_INT)[1]
print(maximum_value)  # -> 10

Upvotes: 2

cs95

Reputation: 402263

`operator.itemgetter` + `max`

For better performance, try filtering out the short lists before calling max. You can then use operator.itemgetter as the key, which is implemented in C and avoids the overhead of a Python-level lambda.

>>> from operator import itemgetter
>>> max((i for i in lst if len(i) > 1), key=itemgetter(1))
['strawberry', 10]

This should work for numeric data as well as dates stored as strings, provided the format is consistent and sortable (for example zero-padded ISO 8601, YYYY-MM-DD), since such strings compare correctly in lexicographic order.
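To illustrate the date case, here is a small sketch with made-up rows (the `rows` list and its dates are hypothetical, not from the question). It assumes the dates are zero-padded ISO 8601 strings; formats such as DD/MM/YYYY would not compare correctly as plain strings.

```python
from operator import itemgetter

# Hypothetical rows: the second field is an ISO 8601 date string (YYYY-MM-DD).
rows = [['apple', '2019-03-01'],
        ['banana', '2018-12-25'],
        ['orange'],
        ['strawberry', '2019-11-07']]

# Zero-padded ISO dates sort the same way as strings and as dates.
latest = max((r for r in rows if len(r) > 1), key=itemgetter(1))
print(latest)  # ['strawberry', '2019-11-07']
```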

`zip_longest` + `np.argmax`

Another useful option, if you have NumPy installed.

>>> import numpy as np
>>> from itertools import zip_longest
>>> _, y = zip_longest(*lst, fillvalue=-float('inf'))
>>> lst[np.argmax(y)]
['strawberry', 10]

Disclaimer: this works with numeric data only.
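To see what the transposition produces, the sketch below prints the padded column using only the standard library. The `max(range(...))` line is a stand-in for `np.argmax`, used here so the example runs without NumPy; it is an illustration, not the code that was timed. Note that unpacking into two names assumes the longest inner list has exactly two items.

```python
from itertools import zip_longest

lst = [['apple', 2], ['banana', 4], ['orange'], ['strawberry', 10], ['mango']]

# Transpose the rows; missing second items are padded with -inf.
names, values = zip_longest(*lst, fillvalue=-float('inf'))
print(values)  # (2, 4, -inf, 10, -inf)

# Stand-in for np.argmax: index of the largest padded value.
best = max(range(len(values)), key=values.__getitem__)
print(lst[best])  # ['strawberry', 10]
```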

lst = lst * 100000

%timeit max(lst, key=lambda x: x[1] if len(x) > 1 else 0)
175 ms ± 1.19 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

%timeit max((i for i in lst if len(i) > 1), key=itemgetter(-1))
142 ms ± 875 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

%%timeit
_, y = zip_longest(*lst, fillvalue=-float('inf'))
lst[np.argmax(y)]
136 ms ± 735 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

If you can afford the memory, call max on the listified version of option 1:

%timeit max([i for i in lst if len(i) > 1], key=itemgetter(-1))
128 ms ± 976 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

This is the fastest of the options timed above, though the margin over the next-best option is small (128 ms vs. 136 ms).

Upvotes: 3

Stephen Rauch

Reputation: 49774

You can use a conditional expression (Python's ternary operator) to supply a default value when the item is missing:

max(myList, key=lambda x: x[1] if len(x) > 1 else 0)

Result:

['strawberry', 10]
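One caveat worth checking against your own data: a default of 0 only behaves as intended when the real values are not all negative. The sketch below uses made-up rows (`temps` is hypothetical) to show the difference, with `float('-inf')` as one possible alternative default for numeric data.

```python
# Hypothetical data where every value is negative.
temps = [['oslo', -3], ['helsinki'], ['reykjavik', -1]]

# With 0 as the fallback, the row that has no value wins the comparison.
with_zero = max(temps, key=lambda x: x[1] if len(x) > 1 else 0)
print(with_zero)  # ['helsinki']

# With -inf as the fallback, rows lacking a value can never win.
with_inf = max(temps, key=lambda x: x[1] if len(x) > 1 else float('-inf'))
print(with_inf)  # ['reykjavik', -1]
```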