Reputation: 75

Iterate over pandas dataframe and apply condition

I have the dataframe below. I want to remove toy from the Topic column and, if a row has toy as its only topic, remove that row entirely. How can I do that in pandas?

+---+-----------------------------------+-------------------------+
|   |           Comment                 |            Topic        |
+---+-----------------------------------+-------------------------+
| 1 |             -----                 | toy, bottle, vegetable  | 
| 2 |             -----                 | fruit, toy, electronics |  
| 3 |             -----                 | toy                     |  
| 4 |             -----                 | electronics, fruit      |  
| 5 |             -----                 | toy, electronic         |           
+---+-----------------------------------+-------------------------+

Upvotes: 1

Answers (3)

P. Junqueira

Reputation: 1

import pandas as pd

# create dummy test data
data = {'comment': [1, 2, 3, 4, 5],
        'Topic': ['toy, bottle, vegetable', 'fruit, toy, electronics', 'toy', 'electronics, fruit', 'toy, electronic']}

# create dataframe
df = pd.DataFrame(data)

# split a comma-separated topic string, drop 'toy', and join the rest back
def remove_toy(row):
    row = [i.strip() for i in row.split(',')]
    row = [i for i in row if i != 'toy']
    return ', '.join(row)

# apply function to the Topic series
df['Topic'] = df['Topic'].apply(remove_toy)

# remove rows whose Topic is now empty
df = df[df['Topic'] != '']

Upvotes: 0

Ujjwal

Reputation: 79

Lambda functions can come in handy in such scenarios. This one blanks out any row whose only topic is toy:

df['Topic'] = df['Topic'].apply(lambda x: "" if len(x.split(',')) == 1 and x.split(',')[0].strip() == 'toy' else x)
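Note that the lambda approach, as written, only targets rows whose single topic is toy; it does not take toy out of multi-topic rows and does not drop anything. A minimal sketch of one way to cover both steps with a lambda, assuming the column is named `Topic` as in the question's table and that topics are separated by commas (the sample frame is rebuilt here only so the snippet runs on its own):

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    'Comment': ['-----'] * 5,
    'Topic': ['toy, bottle, vegetable', 'fruit, toy, electronics', 'toy',
              'electronics, fruit', 'toy, electronic'],
})

# Split each string on commas, keep every topic except 'toy', and join again
df['Topic'] = df['Topic'].apply(
    lambda x: ', '.join(t.strip() for t in x.split(',') if t.strip() != 'toy')
)

# Rows that contained only 'toy' are now empty strings; filter them out
df = df[df['Topic'] != '']
print(df)
```

Row 3 from the question, which had toy as its only topic, is the one that gets filtered out.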

Upvotes: 0

U13-Forward

Reputation: 71570

Try using str.replace with str.rstrip and ne inside [...]:

df['Topic'] = df['Topic'].str.replace('toy', ' ').str.replace(' , ', '').str.rstrip()
print(df[df['Topic'].ne('')])
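One caveat with a plain `str.replace('toy', ...)` is that it matches substrings, so a topic such as toys would also be altered. A hedged sketch of a regex variant that removes toy only as a whole word, together with one adjacent comma; it assumes single-word topics and the `Topic` column name from the question, and the extra 'toys, fruit' row plus the names `pattern`, `cleaned` and `out` are added here purely for illustration:

```python
import pandas as pd

df = pd.DataFrame({
    'Topic': ['toy, bottle, vegetable', 'fruit, toy, electronics', 'toy',
              'electronics, fruit', 'toy, electronic', 'toys, fruit'],
})

# Three cases: 'toy' followed by a comma, 'toy' preceded by a comma,
# or 'toy' as the entire string. \b keeps 'toys' from matching.
pattern = r'\btoy\b\s*,\s*|\s*,\s*\btoy\b|^\s*toy\s*$'

cleaned = df['Topic'].str.replace(pattern, '', regex=True).str.strip()

# Keep only the rows that still have at least one topic
out = df.assign(Topic=cleaned)[cleaned.ne('')]
print(out)
```

Passing `regex=True` explicitly matters because the default for `Series.str.replace` differs between pandas versions.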

Upvotes: 1
