Sun Hao Lun

Reputation: 313

Pandas: Find the max value in one column containing lists

I have a dataframe like this:

fly_frame:
          day    plcae
0  [1,2,3,4,5]       A
1    [1,2,3,4]       B
2        [1,2]       C
3    [1,2,3,4]       D

I want to find the max value of each list in the day column.

For example:

fly_frame:
          day    plcae
0           5       A
1           4       B
2           2       C
3           4       D

What should I do?
Thanks for your help.

Upvotes: 11

Views: 7803

Answers (4)

timgeb

Reputation: 78700

I suggest bringing your dataframe into a better format first.

>>> df
               day plcae
0  [1, 2, 3, 4, 5]     A
1     [1, 2, 3, 4]     B
2           [1, 2]     C
3     [1, 2, 3, 4]     D
>>> 
>>> df = pd.concat([df.pop('day').apply(pd.Series), df], axis=1)
>>> df
     0    1    2    3    4 plcae
0  1.0  2.0  3.0  4.0  5.0     A
1  1.0  2.0  3.0  4.0  NaN     B
2  1.0  2.0  NaN  NaN  NaN     C
3  1.0  2.0  3.0  4.0  NaN     D

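Now everything is easier, for example computing the row-wise maximum across the numeric columns. Passing numeric_only=True skips the string column, which recent pandas versions no longer drop silently.

>>> df.max(axis=1, numeric_only=True)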
0    5.0
1    4.0
2    2.0
3    4.0
dtype: float64

Edit: relabelling the index with the plcae values might also be useful to you.

>>> df.max(axis=1, numeric_only=True).rename(df['plcae'])
A    5.0
B    4.0
C    2.0
D    4.0
dtype: float64
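
For reference, the expansion approach above can be reproduced end to end. This is only a minimal sketch, assuming the day column holds real Python lists; the names wide and row_max are illustrative and not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({
    "day": [[1, 2, 3, 4, 5], [1, 2, 3, 4], [1, 2], [1, 2, 3, 4]],
    "plcae": ["A", "B", "C", "D"],
})

# Expand each list into its own columns; shorter lists are padded with NaN.
wide = pd.concat([df["day"].apply(pd.Series), df["plcae"]], axis=1)

# Row-wise maximum over the numeric columns only (NaN is skipped by default).
row_max = wide.max(axis=1, numeric_only=True)
print(row_max.tolist())
```

Because of the NaN padding the result is float, so cast with astype(int) if integers are needed.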

Upvotes: 0

2Obe

Reputation: 3720

Try a combination of pd.concat() and Series.apply():

import numpy as np
import pandas as pd


fly_frame = pd.DataFrame({'day':[[1,2,3,4,5],[1,2,3,4],[1,2],[1,2,3,4]],'place':['A','B','C','D']})

df = pd.concat([fly_frame['day'].apply(max),fly_frame.drop('day',axis=1)],axis=1)

print(df)



   day place
0    5     A
1    4     B
2    2     C
3    4     D

Edit: You can also use df.join():

fly_frame.drop('day',axis=1).join(fly_frame['day'].apply(np.max,axis=0))


  place  day
0     A    5
1     B    4
2     C    2
3     D    4

Upvotes: 0

jezrael

Reputation: 862801

Use apply with max:

#if the lists are stored as strings, convert them to real lists first

#import ast

#print (type(df.loc[0, 'day']))
#<class 'str'>

#df['day'] = df['day'].apply(ast.literal_eval)

print (type(df.loc[0, 'day']))
<class 'list'>

df['day'] = df['day'].apply(max)

Or list comprehension:

df['day'] = [max(x) for x in df['day']]

print (df)
   day plcae
0    5     A
1    4     B
2    2     C
3    4     D
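
The string case mentioned in the comments above can be sketched as a complete example. It assumes the lists arrived as text (for instance after a round trip through a CSV file), which the question does not confirm.

```python
import ast

import pandas as pd

# Assumption: each entry of 'day' is the text of a list, not a list object.
df = pd.DataFrame({
    "day": ["[1,2,3,4,5]", "[1,2,3,4]", "[1,2]", "[1,2,3,4]"],
    "plcae": ["A", "B", "C", "D"],
})

# Parse each string into a real list, then take the maximum per row.
df["day"] = df["day"].apply(ast.literal_eval).apply(max)
print(df)
```

ast.literal_eval only evaluates Python literals, so it is a safer choice than eval for this kind of parsing.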

Upvotes: 5

DYZ

Reputation: 57033

df.day.apply(max)
#0    5
#1    4
#2    2
#3    4
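
One caveat: max raises a ValueError on an empty list. The question's data has none, but if empty lists can occur, a possible workaround (a sketch with hypothetical data) is to pass a default value:

```python
import pandas as pd

# Hypothetical data with one empty list, which plain max() would reject.
s = pd.Series([[1, 2, 3], [], [4, 1]])

# The default is returned for empty lists instead of raising ValueError.
result = s.apply(lambda xs: max(xs, default=float("nan")))
print(result)
```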

Upvotes: 11
