Reputation: 715
Given a Pandas DataFrame with lists stored in several of the columns, is there a simple way to find the column name which contains the longest list for each row?
For example, with this data:
   positive                        negative                neutral
1  [marvel, moral, bold, destiny]  []                      [view, should]
2  [beautiful]                     [complicated, need]     []
3  [celebrate]                     [crippling, addiction]  [big]
I want to identify "positive" as the column with the longest list for row 1 and "negative" for rows 2 and 3.
I thought I could use str.len() to calculate the list lengths and idxmax() to get the column names, but I can't figure out how to combine them.
Upvotes: 9
Views: 1404
Reputation: 323326
Or you can try this ...
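import pandas as pd

# Move the row labels into an 'index' column, then reshape to long format
df = df.reset_index()
DF = pd.melt(df, id_vars=['index'])
DF['Length'] = DF['value'].apply(len)
# Keep the longest entry per original row; 'variable' holds the column name
DF = DF.sort_values(['index', 'Length']).drop_duplicates(subset=['index'], keep='last')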
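A self-contained variant of the same long-format idea, as a rough sketch: it rebuilds the question's frame (the 1-based index and the names `long_df`, `column`, `words` and `length` are assumptions for illustration) and swaps the sort/drop_duplicates step for a groupby with `idxmax`, which picks the first column when lengths tie.

```python
import pandas as pd

# Example data from the question; the 1-based index is an assumption
df = pd.DataFrame(
    {
        "positive": [["marvel", "moral", "bold", "destiny"], ["beautiful"], ["celebrate"]],
        "negative": [[], ["complicated", "need"], ["crippling", "addiction"]],
        "neutral": [["view", "should"], [], ["big"]],
    },
    index=[1, 2, 3],
)

# Long format: one row per (original row, column) pair
long_df = df.reset_index().melt(id_vars="index", var_name="column", value_name="words")
long_df["length"] = long_df["words"].str.len()

# For each original row, select the entry with the largest list length
best_rows = long_df.groupby("index")["length"].idxmax()
longest = long_df.loc[best_rows].set_index("index")["column"]
print(longest)
```

The result is a Series indexed by the original row labels, with the winning column name as the value.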
Upvotes: 2
Reputation: 109636
>>> df.apply(lambda row: row.apply(len).idxmax(), axis=1)
0 positive
1 negative
2 negative
dtype: object
Upvotes: 5
Reputation: 210892
IIUC:
In [227]: df.applymap(len).idxmax(axis=1)
Out[227]:
0 positive
1 negative
2 negative
dtype: object
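For anyone who wants to reproduce this end to end, here is a minimal sketch that rebuilds the question's frame (the 1-based index is an assumption about how it was constructed) and combines `str.len()` with `idxmax`, which is what the question was reaching for. The names `lengths` and `longest` are only illustrative.

```python
import pandas as pd

# Example data from the question; the 1-based index is an assumption
df = pd.DataFrame(
    {
        "positive": [["marvel", "moral", "bold", "destiny"], ["beautiful"], ["celebrate"]],
        "negative": [[], ["complicated", "need"], ["crippling", "addiction"]],
        "neutral": [["view", "should"], [], ["big"]],
    },
    index=[1, 2, 3],
)

# Series.str.len() also measures list elements, so apply it column by column
lengths = df.apply(lambda col: col.str.len())

# idxmax(axis=1) gives, for each row, the label of the column with the largest value
longest = lengths.idxmax(axis=1)
print(longest)
```

When two columns tie for the longest list, `idxmax` returns the first of them in column order. Note that on pandas 2.1 and later `applymap` is deprecated in favor of `DataFrame.map`, so `df.map(len).idxmax(axis=1)` should be the equivalent spelling there.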
Upvotes: 14