Gichin

Reputation: 11

Length of max string in pandas dataframe

My question is similar to the following post/question: Find the length of the longest string in a Pandas DataFrame column

However, I'm wondering how to find the longest string across a DataFrame with multiple columns. The solution in the post above only handles a single column. How would I evaluate all columns in a DataFrame and find the longest length? Note that the longest item may not be a string; it may be a long decimal.

Upvotes: 0

Views: 9092

Answers (1)

Sergey Bushmanov

Reputation: 25209

You can do that by converting every cell to a string, taking the maximum length within each column, and then taking the maximum of those per-column results:

import numpy as np
import pandas as pd

np.random.seed(123)
df = pd.DataFrame({
        'c1': ['abc','a','ghjhkkhgjgj'],
        'c2': np.random.randint(1, 10**9, 3)  # integer upper bound instead of the float 1e9
    })
df
            c1         c2
0          abc  843828735
1            a  914636142
2  ghjhkkhgjgj  155217279

# note: applymap is deprecated in pandas >= 2.1 in favor of DataFrame.map
max(df.astype('str').applymap(len).max())
11
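The same idea can be wrapped in a small helper. This is only a sketch: the name max_cell_length and the extra float column c3 are made up for illustration, and it uses the vectorized .str.len() per column rather than applymap. It assumes a non-empty DataFrame; on an empty one the result would be NaN and the int() conversion would fail.

```python
import pandas as pd


def max_cell_length(df: pd.DataFrame) -> int:
    """Return the length of the longest string form of any cell (hypothetical helper)."""
    # Convert every cell to str, then measure each column with .str.len().
    lengths = df.astype(str).apply(lambda col: col.str.len())
    # The first max() reduces each column, the second reduces across columns.
    return int(lengths.max().max())


df = pd.DataFrame({
    "c1": ["abc", "a", "ghjhkkhgjgj"],
    "c2": [843828735, 914636142, 155217279],
    "c3": [0.125, 2.5, 1.0],  # illustrative float column, not in the original answer
})

print(max_cell_length(df))  # 11, from 'ghjhkkhgjgj'
```

For non-string columns, the length counted is that of the value's string form after astype(str), so a long decimal is measured by however many characters its string representation has.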

In case you want the string itself:

lengths = df.astype('str').applymap(len)  # compute the per-cell lengths once
mask = lengths >= lengths.max().max()     # True only where a cell has the maximum length
df[mask]

            c1  c2
0          NaN NaN
1          NaN NaN
2  ghjhkkhgjgj NaN
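If only the single longest value is wanted, rather than a NaN-filled frame, one possible approach is to locate the column and row of the maximum length with idxmax. This is a sketch under the assumption that one winner is enough: on ties, idxmax returns the first match, so other equally long cells are not reported.

```python
import pandas as pd

df = pd.DataFrame({
    "c1": ["abc", "a", "ghjhkkhgjgj"],
    "c2": [843828735, 914636142, 155217279],
})

# Per-cell lengths of the string form of every value.
lengths = df.astype(str).apply(lambda c: c.str.len())

# Column holding the overall maximum, then the row of that maximum within it.
col = lengths.max().idxmax()
row = lengths[col].idxmax()

# Look up the original (unconverted) value at that position.
longest = df.loc[row, col]
print(row, col, longest)  # 2 c1 ghjhkkhgjgj
```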

Timing comparison vs EdChum's suggestion

%timeit max(df.astype('str').applymap(lambda x: len(x)).max())
100 loops, best of 3: 2.11 ms per loop

%timeit df.astype(str).apply(lambda x: x.str.len()).max().max()
100 loops, best of 3: 2.71 ms per loop

(Keep in mind that this is a very small DataFrame, so these timings may not carry over to larger data.)

Upvotes: 1
