tajihiro

Reputation: 2443

How to extract max length row with pandas

I would like to extract the row that contains the longest string in a DataFrame.

In the following case, I would like to get the row with id 2, because its B column contains bbbbbb, which has the maximum length of 6.

|id|A   |B     |
|1 |abc |aaa   |
|2 |abb |bbbbbb|
|3 |aadd|cccc  |
|4 |aadc|ddddd |


|id|A   |B     |
|2 |abb |bbbbbb|

Please give me some advice. Thanks.

Upvotes: 1

Views: 3377

Answers (3)

Mykola Zotko

Reputation: 17864

First, find the maximum string length in each row, then take the index label of the row with the largest value:

df.loc[df[['A', 'B']].apply(lambda x: x.str.len().max(), axis=1).idxmax()]
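
As a rough sketch of the one-liner above split into steps, assuming the sample data from the question with columns named exactly A and B (the variable names below are only illustrative):

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame({
    "id": [1, 2, 3, 4],
    "A": ["abc", "abb", "aadd", "aadc"],
    "B": ["aaa", "bbbbbb", "cccc", "ddddd"],
})

# Longest string length in each row, taken over columns A and B
row_max_len = df[["A", "B"]].apply(lambda x: x.str.len().max(), axis=1)

# Index label of the first row that holds the overall maximum
label = row_max_len.idxmax()

row_as_series = df.loc[label]    # the row as a Series
row_as_frame = df.loc[[label]]   # the same row as a one-row DataFrame
```

Passing a single label to loc returns a Series; wrapping the label in a list keeps the DataFrame shape shown in the question.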

Upvotes: 0

jezrael

Reputation: 863166

Select all object-dtype columns (here, the string columns) with DataFrame.select_dtypes, compute the maximum string length per row, and then use boolean indexing to keep every row whose length equals the overall maximum:

s = df.select_dtypes(object).apply(lambda x: x.str.len()).max(axis=1)
# if there are no missing values
# (DataFrame.applymap is deprecated since pandas 2.1 in favor of DataFrame.map)
#s = df.select_dtypes(object).applymap(len).max(axis=1)
df1 = df[s.eq(s.max())]
print(df1)
   id    A       B
1   2  abb  bbbbbb

Another option, which returns only the first match, uses Series.idxmax with DataFrame.loc; the extra [] makes the result a one-row DataFrame:

df1 = df.loc[[df.select_dtypes(object).apply(lambda x: x.str.len()).max(axis=1).idxmax()]]
# if there are no missing values
#df1 = df.loc[[df.select_dtypes(object).applymap(len).max(axis=1).idxmax()]]

print(df1)
   id    A       B
1   2  abb  bbbbbb
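
The difference between the two approaches only shows up when several rows tie for the maximum. The following is a small sketch with hypothetical data (the last value in A is changed so that two rows contain a 6-character string); the columns are named explicitly here instead of using select_dtypes, which is an assumption made to keep the example short:

```python
import pandas as pd

# Hypothetical data: the rows with id 2 and id 4 both contain a 6-character string
df = pd.DataFrame({
    "id": [1, 2, 3, 4],
    "A": ["abc", "abb", "aadd", "aadcxx"],
    "B": ["aaa", "bbbbbb", "cccc", "ddddd"],
})

# Longest string length per row over the explicitly named string columns
s = df[["A", "B"]].apply(lambda x: x.str.len()).max(axis=1)

all_matches = df[s.eq(s.max())]      # every row tied for the maximum
first_match = df.loc[[s.idxmax()]]   # only the first such row
```

Here the boolean mask keeps both tied rows, while idxmax keeps just the first one.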

Upvotes: 1

villoro

Reputation: 1549

Let's first create the DataFrame from your example:

import pandas as pd

data = {
    "id": {0: 1, 1: 2, 2: 3, 3: 4},
    "A": {0: "abc", 1: "abb", 2: "aadd", 3: "aadc"},
    "B": {0: "aaa", 1: "bbbbbb", 2: "cccc", 3: "ddddd"}
}
df = pd.DataFrame(data)

Then you can find the index label of the row where B is longest and retrieve that row with:

# Index label where B is longest
idx = df["B"].apply(len).idxmax()

# Get that row (idxmax returns a label, so use loc rather than iloc)
df.loc[idx]
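
One thing to keep in mind is that idxmax returns an index label, not a position, so label-based lookup with loc is the safer match when the index is not the default 0..n-1. A minimal sketch, assuming hypothetical index labels 10 to 40 and using .str.len(), which also tolerates missing values:

```python
import pandas as pd

# Hypothetical index labels, chosen so that labels and positions differ
df = pd.DataFrame(
    {"id": [1, 2, 3, 4], "B": ["aaa", "bbbbbb", "cccc", "ddddd"]},
    index=[10, 20, 30, 40],
)

# Label of the row whose B value is longest
idx = df["B"].str.len().idxmax()

# Label-based lookup; positional iloc would fail for label 20 in a 4-row frame
row = df.loc[idx]
```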

Upvotes: 1
