Syed Md Ismail

Reputation: 898

Limit the number of words in a column of a DataFrame

My DataFrame looks like this:

      Abc                       XYZ 
0  Hello   How are you doing today
1   Good                 This is a
2    Bye                   See you
3  Books  Read chapter 1 to 5 only

With max_size = 3, I want to truncate the column XYZ to at most max_size words. Rows that already contain max_size words or fewer should be left as they are.

Desired output:

     Abc                       XYZ
0  Hello               How are you
1   Good                 This is a
2    Bye                   See you
3  Books            Read chapter 1

Upvotes: 3

Views: 965

Answers (1)

jezrael

Reputation: 862851

Use str.split with a limit of n=max_size, drop the trailing remainder by slicing with str[:max_size], and then join the remaining words back together with str.join:

max_size = 3

df['XYZ'] = df['XYZ'].str.split(n=max_size).str[:max_size].str.join(' ')
print(df)
     Abc             XYZ
0  Hello     How are you
1   Good       This is a
2    Bye         See you
3  Books  Read chapter 1
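
To see why the slice is needed, the one-liner can be unpacked into its three steps. This is only a minimal sketch: the frame below is rebuilt by hand from the question, and the intermediate names parts, kept and truncated are introduced here for illustration.

```python
import pandas as pd

# Sample data rebuilt from the question (assumed to match the asker's frame).
df = pd.DataFrame({
    "Abc": ["Hello", "Good", "Bye", "Books"],
    "XYZ": ["How are you doing today", "This is a", "See you",
            "Read chapter 1 to 5 only"],
})
max_size = 3

# Step 1: split on whitespace at most max_size times, so each list has up to
# max_size + 1 items and the last item holds the unsplit remainder.
parts = df["XYZ"].str.split(n=max_size)

# Step 2: keep only the first max_size items, which drops that remainder.
kept = parts.str[:max_size]

# Step 3: join the kept words back into a single space-separated string.
truncated = kept.str.join(" ")

print(parts.iloc[0])
print(truncated.tolist())
```

Rows with max_size words or fewer never produce a remainder, so the slice leaves them unchanged. Note that splitting on whitespace and re-joining with a single space also collapses any repeated spaces inside the kept words.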

Another solution, using apply with a lambda function:

df['XYZ'] = df['XYZ'].apply(lambda x: ' '.join(x.split(maxsplit=max_size)[:max_size]))
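
One difference worth keeping in mind: the lambda calls x.split directly, so it would raise an AttributeError on a missing value such as NaN, whereas the .str accessor passes missing values through. If the column may contain missing values, a guarded helper is one option. The sketch below is an assumption-laden illustration, and truncate_words is a hypothetical name that does not appear in the original answer.

```python
import pandas as pd


def truncate_words(text, max_size):
    """Return the first max_size whitespace-separated words of text.

    Hypothetical helper, not part of the original answer. Non-string
    values (for example None or NaN) are returned unchanged.
    """
    if not isinstance(text, str):
        return text
    # Same split-then-slice-then-join logic as the lambda above.
    return " ".join(text.split(maxsplit=max_size)[:max_size])


# Small illustrative Series with one missing value (assumed data).
s = pd.Series(["How are you doing today", "See you", None])
result = s.apply(lambda x: truncate_words(x, max_size=3))
print(result)
```

The first two rows are truncated exactly as before, and the missing entry stays missing instead of raising.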

Upvotes: 7
