user14289862

Removing from pandas dataframe all rows having less than 3 characters

I have this dataframe

Word    Frequency
0   :       79
1   ,       60
2   look    26
3   e       26
4   a       25
... ... ...
95  trump    2
96  election 2
97  step     2
98  day      2
99  university  2

I would like to remove all words having less than 3 characters. I tried as follows:

df['Word']=df['Word'].str.findall('\w{3,}').str.join(' ')

but it does not remove them from my dataset. Can you please tell me how to remove them? My expected output would be:

Word    Frequency
2   look    26
... ... ...
95  trump    2
96  election 2
97  step     2
98  day      2
99  university  2

Upvotes: 2

Views: 2694

Answers (3)

wwnde

Reputation: 26676

Please try:

df[df.Word.str.len() >= 3]

Upvotes: 1

BENY

Reputation: 323226

Try with:

df = df[df['Word'].str.len()>=3]
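
As a rough illustration of why the attempt in the question leaves the rows in place: `str.findall(...).str.join(' ')` only rewrites the values, so short words become empty strings, while the boolean filter actually drops rows. The sample data below is a made-up subset of the question's table, and the name `attempted` is only for illustration.

```python
import pandas as pd

# Hypothetical sample based on a few rows from the question
df = pd.DataFrame({
    "Word": [":", ",", "look", "e", "a", "trump", "day"],
    "Frequency": [79, 60, 26, 26, 25, 2, 2],
})

# The attempt from the question: values change, the row count does not
attempted = df["Word"].str.findall(r"\w{3,}").str.join(" ")
print(len(attempted))  # same number of rows as df

# Boolean filtering: rows whose word has fewer than 3 characters are dropped
df = df[df["Word"].str.len() >= 3]
print(df)
```

Note the assignment back to `df`: the filter returns a new DataFrame rather than modifying the original in place.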

Upvotes: 4

Cameron Riddell

Reputation: 13407

Instead of using a regular expression, you can use .str.len() to get the length of each string in your column, then keep only the rows where that length is >= 3.

It should look like this:

df.loc[df["Word"].str.len() >= 3]
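
A minimal sketch of the same idea with the mask split out into its own step, in case it helps to see the intermediate result. The frame `words` and the names `mask` and `result` are assumptions for illustration, and the `reset_index` call is optional (only useful if you want a fresh 0-based index after filtering).

```python
import pandas as pd

# Hypothetical sample based on a few rows from the question
words = pd.DataFrame({
    "Word": [":", ",", "look", "e", "a", "trump", "day"],
    "Frequency": [79, 60, 26, 26, 25, 2, 2],
})

# Boolean Series: True where the word has at least 3 characters
mask = words["Word"].str.len() >= 3

# Keep only the matching rows, then renumber the index from 0
result = words.loc[mask].reset_index(drop=True)
print(result)
```

Without `reset_index(drop=True)` the surviving rows keep their original labels, which matches the expected output shown in the question.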

Upvotes: 2
