John Doe

Reputation: 10223

Get max value of 3 columns from pandas DataFrame?

I have a pandas DataFrame with 3 columns:

c={'a': [['US']],'b': [['US']], 'c': [['US','BE']]}
df = pd.DataFrame(c, columns = ['a','b','c'])

Now I need the max value of these 3 columns.

I've tried:

df['max_val'] = df[['a','b','c']].max(axis=1)

The result is NaN instead of the expected output, US.
How can I get the max value for these 3 columns? (And what if one of them contains NaN?)

Upvotes: 2

Views: 739

Answers (4)

Smit Parmar

Reputation: 1

Some of your elements are lists, so I think the code below will work.

  1. First, append all the values to a single list.
  2. Then, find the most frequently occurring element in that list.

from collections import Counter

arr = []

# Walk every column, every row, and every item of each list
for i in df:
    for j in range(len(df[i])):
        for k in range(len(df[i][j])):
            arr.append(df[i][j][k])

b = Counter(arr)
# most_common() lists (element, count) pairs, most frequent first
print(b.most_common())

This should give you the answer you want.
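
If you need the most frequent element for each row rather than for the whole DataFrame, the same counting idea can be applied row by row. This is only a minimal sketch: the helper name most_common_element is made up for illustration, and it assumes every cell holds a list.

```python
from collections import Counter
from itertools import chain

import pandas as pd

df = pd.DataFrame({'a': [['US']], 'b': [['US']], 'c': [['US', 'BE']]})

def most_common_element(row):
    # Hypothetical helper: flatten the lists in this row and count each element.
    counts = Counter(chain.from_iterable(row))
    # most_common(1) returns a one-item list of (element, count) pairs.
    return counts.most_common(1)[0][0]

result = df[['a', 'b', 'c']].apply(most_common_element, axis=1)
print(result)
```

For the question's data this counts US three times and BE once, so the row's result is US.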

Upvotes: 0

FarZad

Reputation: 463

Because your data are lists, you can't use pandas' mode() directly: list objects are unhashable, so mode() won't work.
One solution is to convert the elements of the DataFrame's row to strings and then use mode().
Check this:

>>> import pandas as pd
>>> c = {'a': [['US','BE']],'b': [['US']], 'c': [['US','BE']]}
>>> df = pd.DataFrame(c, columns = ['a','b','c'])
>>> x = df.iloc[0].apply(lambda x: str(x))
>>> x.mode()
# Answer:
0    ['US', 'BE']
dtype: object
>>> d = {'a': [['US']],'b': [['US']], 'c': [['US','BE']]}
>>> df2 = pd.DataFrame(d, columns = ['a','b','c'])
>>> z = df2.iloc[0].apply(lambda z: str(z))
>>> z.mode()
# Answer:
0    ['US']
dtype: object
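
A possible variation, if you want a real list back instead of its string form: convert each list to a tuple, which is hashable, take the mode, and convert the result back. This is a sketch under the assumption that every cell in the row is a list; the variable names are just for illustration.

```python
import pandas as pd

df = pd.DataFrame({'a': [['US', 'BE']], 'b': [['US']], 'c': [['US', 'BE']]})

# Tuples are hashable, so mode() can count them; lists are not.
row_as_tuples = df.iloc[0].apply(tuple)

# mode() returns a Series of the most frequent values; take the first one
# and turn it back into a list.
most_common = list(row_as_tuples.mode().iloc[0])
print(most_common)
```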

Upvotes: 0

wwnde

Reputation: 26686

If, as @Erfan stated, you want the most common value in each row, use .agg() with mode:

df.agg('mode', axis=1)
         0
0  [US, BE]
1      [US]

Upvotes: 0

jezrael

Reputation: 863791

Use:

from collections import Counter

import pandas as pd

c={'a': [['US', 'BE'],['US']],'b': [['US'],['US']], 'c': [['US','BE'],['US','BE']]}
df = pd.DataFrame(c, columns = ['a','b','c'])

# Convert each list to a hashable tuple, count them per row, and keep the most common one
df = df[['a','b','c']].apply(lambda x: list(Counter(map(tuple, x)).most_common()[0][0]), 1)
print (df)
0    [US, BE]
1        [US]
dtype: object
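
The question also asks what happens if a cell contains NaN. One way to handle that, sketched here with made-up sample data and a hypothetical helper named most_common_list, is to skip anything that is not a list before counting. Treating an all-NaN row as NaN is an assumption, not something the question specifies.

```python
from collections import Counter

import numpy as np
import pandas as pd

# Sample data (assumed for illustration): one cell in column 'b' is NaN.
df = pd.DataFrame({
    'a': [['US', 'BE'], ['US']],
    'b': [['US'], np.nan],
    'c': [['US', 'BE'], ['US']],
})

def most_common_list(row):
    # Keep only real lists; NaN is a float, so it is skipped here.
    lists = [tuple(v) for v in row if isinstance(v, list)]
    if not lists:
        # Assumption: a row with no lists at all yields NaN.
        return np.nan
    return list(Counter(lists).most_common(1)[0][0])

result = df[['a', 'b', 'c']].apply(most_common_list, axis=1)
print(result)
```

When two lists are tied, Counter.most_common() keeps the one it encountered first, so the leftmost column wins.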

Upvotes: 1
