newtoCS

Reputation: 113

Find the words in a list, then remove the word and any other trailing words in the column

How do I find the words in the list and remove any other words after the word found?

For example:

remove_words = ['stack', 'over', 'flow']

Input:

0    abc test test stack yxz
1    cde test12 over ste
2    def123 flow test123
3    yup over 4562

I would like to find the words from the remove_words list in a pandas DataFrame column and remove those words along with any words after them.

Results:

0    abc test test
1    cde test12 
2    def123
3    yup

Upvotes: 2

Views: 229

Answers (4)

Tom Dee

Reputation: 2674

remove_words = ['stack', 'over', 'flow']
inputline = "abc test test stack yxz"
words = inputline.split(" ")
for i, word in enumerate(words):
    if word in remove_words:
        # keep only the words before the first match
        print(" ".join(words[:i]))
        break

This splits the input string into a list of words, finds the first word that is in the remove_words list, and slices off that word and everything after it. You just need a loop to replace the hardcoded string with the rows of your whole dataset.
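To run the same word-by-word check over a whole pandas column instead of one hardcoded string, one option is to wrap it in a helper and pass that to Series.apply. This is only a sketch: the helper name truncate_at_first_match and the column name col are assumptions, not something given in the question.

```python
import pandas as pd

remove_words = ['stack', 'over', 'flow']

def truncate_at_first_match(text, stop_words=remove_words):
    """Hypothetical helper: keep the words before the first stop word."""
    kept = []
    for word in text.split():
        if word in stop_words:
            break  # drop this word and everything after it
        kept.append(word)
    return " ".join(kept)

# Sample data from the question; the column name 'col' is an assumption.
df = pd.DataFrame({'col': ['abc test test stack yxz',
                           'cde test12 over ste',
                           'def123 flow test123',
                           'yup over 4562']})

df['col'] = df['col'].apply(truncate_at_first_match)
print(df['col'].tolist())
```

Because the helper compares whole words, a row containing something like "overflow" would be left untouched, and rows without any match come back unchanged.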

Upvotes: 0

jezrael

Reputation: 862641

Split on a regex built by joining all the values with | (regex OR), then select the first element of each resulting list with str[0]:

remove_words = ['stack', 'over', 'flow']

#for a more general solution, use word boundaries;
#the (?:...) group makes \b apply to every alternative
pat = r'\b(?:{})\b'.format('|'.join(remove_words))
df['col'] = df['col'].str.split(pat, n=1).str[0]
print (df)
              col
0  abc test test 
1     cde test12 
2         def123 
3            yup 
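The output above still carries the space that preceded the matched word. A self-contained variant that trims it might look like the sketch below. It assumes the column is named col, escapes each word with re.escape in case a word contains regex metacharacters, and groups the alternatives in (?:...) so the \b anchors apply to each word.

```python
import re
import pandas as pd

remove_words = ['stack', 'over', 'flow']

# Escape each word and group the alternation so \b wraps every alternative.
pat = r'\b(?:{})\b'.format('|'.join(re.escape(w) for w in remove_words))

# Sample data from the question; the column name 'col' is an assumption.
df = pd.DataFrame({'col': ['abc test test stack yxz',
                           'cde test12 over ste',
                           'def123 flow test123',
                           'yup over 4562']})

# Keep the text before the first match, then trim the leftover trailing space.
df['col'] = df['col'].str.split(pat, n=1).str[0].str.rstrip()
print(df['col'].tolist())
```

Rows that contain none of the words are returned whole, since the split then yields a single-element list.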

Upvotes: 2

Mendy Kahan

Reputation: 533

I have not worked with pandas DataFrames, but the concept should be the same in any language: just loop through all the words and use a replace method with an empty string.
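Replacing only the matched word with an empty string would leave the trailing words in place. One way to adapt the replace idea so it also drops them is to let the pattern consume the rest of the string. A sketch, assuming pandas is available; the Series name s is made up for illustration:

```python
import re
import pandas as pd

remove_words = ['stack', 'over', 'flow']

# Optional leading whitespace, one of the words as a whole word,
# then everything up to the end of the string.
pat = r'\s*\b(?:{})\b.*'.format('|'.join(map(re.escape, remove_words)))

# Sample data from the question.
s = pd.Series(['abc test test stack yxz',
               'cde test12 over ste',
               'def123 flow test123',
               'yup over 4562'])

# Replace the first matched word and all text after it with an empty string.
result = s.str.replace(pat, '', regex=True)
print(result.tolist())
```

Here a single vectorized replace handles all the words at once, so no explicit loop over remove_words is needed.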

Upvotes: 0

Rishabh Mandayam

Reputation: 433

The first step would be to check whether the input contains any of the words; if not, you can just return the entire input:

# inside a function: return early when none of the words is present
if not any(word in input.split() for word in ("stack", "over", "flow")):
    return input

Now for the removing part. I think the best way to do this is to loop through each value in the input array (I am assuming it is an array) and call str_replace.

Upvotes: 0
