Reputation: 11
My question is similar to this earlier one: Find the length of the longest string in a Pandas DataFrame column
However, I'm wondering how to find the longest string across a DataFrame with multiple columns. The solution in that post only works for a single column. How would I evaluate all columns in a DataFrame and find the longest length? Note that the longest item may not be a string; it could be a long decimal.
Upvotes: 0
Views: 9092
Reputation: 25209
You can do this by taking the maximum length within each column and then the maximum over those per-column results:
import numpy as np
import pandas as pd

np.random.seed(123)
df = pd.DataFrame({
'c1': ['abc','a','ghjhkkhgjgj'],
'c2': np.random.randint(1, 10**9, 3)  # integer upper bound instead of the float 1e9
})
df
            c1         c2
0          abc  843828735
1            a  914636142
2  ghjhkkhgjgj  155217279
max(df.astype('str').applymap(lambda x: len(x)).max())
11
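Note that `DataFrame.applymap` was deprecated in pandas 2.1 in favour of `DataFrame.map`, so the one-liner above may emit a warning or fail on newer versions. Here is a minimal sketch that avoids both by measuring each column with `.str.len()`. The function name `longest_cell_length` is just an illustrative choice, and the integer column is hard-coded to the values shown in the output above rather than regenerated randomly.

```python
import pandas as pd

def longest_cell_length(df: pd.DataFrame) -> int:
    """Length of the longest cell once every value is rendered as a string."""
    # Convert every cell to its string form, then measure column by column.
    lengths = df.astype(str).apply(lambda col: col.str.len())
    # The first max() reduces each column, the second reduces across columns.
    return int(lengths.max().max())

df = pd.DataFrame({
    "c1": ["abc", "a", "ghjhkkhgjgj"],
    "c2": [843828735, 914636142, 155217279],
})
print(longest_cell_length(df))
```

This assumes a non-empty DataFrame; on an empty one the reduction yields NaN and the `int()` conversion would raise.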
In case you want the string itself:
lengths = df.astype('str').applymap(lambda x: len(x))
mask = lengths >= lengths.max().max()
df[mask]
            c1  c2
0          NaN NaN
1          NaN NaN
2  ghjhkkhgjgj NaN
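If you would rather get the position and value of the longest item than a masked frame full of NaN, something along these lines should work. It is only a sketch: `longest_cell` is a hypothetical helper name, and when several cells tie for the maximum length it returns just the first one that `idxmax` finds.

```python
import pandas as pd

def longest_cell(df: pd.DataFrame):
    """Return (row label, column label, value) of the longest cell."""
    lengths = df.astype(str).apply(lambda col: col.str.len())
    # Column holding the overall maximum length.
    col = lengths.max().idxmax()
    # Row within that column where the maximum occurs.
    row = lengths[col].idxmax()
    return row, col, df.loc[row, col]

df = pd.DataFrame({
    "c1": ["abc", "a", "ghjhkkhgjgj"],
    "c2": [843828735, 914636142, 155217279],
})
result = longest_cell(df)
print(result)
```

The value is read back from the original frame, so a numeric cell is returned as a number and not as its string form.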
Timing comparison vs EdChum's suggestion
%timeit max(df.astype('str').applymap(lambda x: len(x)).max())
100 loops, best of 3: 2.11 ms per loop
%timeit df.astype(str).apply(lambda x: x.str.len()).max().max()
100 loops, best of 3: 2.71 ms per loop
(Please bear in mind that this is still a very small df, so the timings may not carry over to larger data.)
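To check how the two approaches compare on a larger frame, you could run something like the following outside IPython with the standard `timeit` module. This is a rough sketch under assumptions: the frame size of 10,000 rows, the column contents, and the helper names `elementwise` and `columnwise` are all made up for illustration, and the timings will depend on your machine and pandas version.

```python
import timeit

import numpy as np
import pandas as pd

rng = np.random.default_rng(123)
big = pd.DataFrame({
    "c1": rng.integers(1, 10**9, size=10_000).astype(str),
    "c2": rng.random(10_000),
})

def elementwise(df):
    as_str = df.astype(str)
    # DataFrame.map replaces applymap from pandas 2.1 on; fall back on older versions.
    cellwise = as_str.map if hasattr(as_str, "map") else as_str.applymap
    return int(cellwise(len).max().max())

def columnwise(df):
    return int(df.astype(str).apply(lambda col: col.str.len()).max().max())

# Both approaches should agree before their speed is compared.
assert elementwise(big) == columnwise(big)

for fn in (elementwise, columnwise):
    seconds = timeit.timeit(lambda: fn(big), number=5)
    print(f"{fn.__name__}: {seconds / 5:.4f} s per call")
```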
Upvotes: 1