ah bon

Reputation: 10011

Add a specific value to rows selected by a condition with Pandas

Suppose I have the following dataframe:

id        fruits
01     Apple, Apricot
02     Apple, Banana, Clementine, Pear
03     Orange, Pineapple, Pear

I want to append Fruit to the rows where Apple appears, producing a new dataframe like this:

id        fruits
01     Apple, Apricot, Fruit
02     Apple, Banana, Clementine, Pear, Fruit
03     Orange, Pineapple, Pear

How should I do it? Thanks. Sorry, I made up this example to represent my real problem.

Upvotes: 1

Views: 67

Answers (2)

JISHAD A.V

Reputation: 401

df['fruits'] = [row + ', Fruit' if 'Apple' in str(row) else row for row in df['fruits']]

Upvotes: 1

piRSquared

Reputation: 294218

First Hack That Worked

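import numpy as np
fruit = np.array(', Fruit', object)
# The boolean mask times the string gives ', Fruit' for matches and '' otherwise
df.fruits + df.fruits.str.contains('Apple') * fruit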

0                     Apple, Apricot, Fruit
1    Apple, Banana, Clementine, Pear, Fruit
2                   Orange, Pineapple, Pear
Name: fruits, dtype: object

More reasonable

df.loc[df.fruits.str.contains('Apple'), 'fruits'] += ', Fruit'
df

   id                                  fruits
0   1                   Apple, Apricot, Fruit
1   2  Apple, Banana, Clementine, Pear, Fruit
2   3                 Orange, Pineapple, Pear
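
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same boolean-mask update. The frame construction is my assumption about the data (ids stored as strings, as in the question); `regex=False` is added only because 'Apple' is a literal string, not a pattern.

```python
import pandas as pd

# Assumed reconstruction of the example frame from the question.
df = pd.DataFrame({
    "id": ["01", "02", "03"],
    "fruits": [
        "Apple, Apricot",
        "Apple, Banana, Clementine, Pear",
        "Orange, Pineapple, Pear",
    ],
})

# Boolean mask: True where the text contains the literal substring 'Apple'.
# The match is case-sensitive, so 'Pineapple' does not count.
mask = df["fruits"].str.contains("Apple", regex=False)

# Append the suffix only on the selected rows, in place.
df.loc[mask, "fruits"] += ", Fruit"

print(df)
```

Note that the third row is left alone only because the match is case-sensitive. With `case=False`, 'Pineapple' would match as well, so a stricter test may be needed if the real data mixes cases.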

__

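To address the comment: NA comes up where elements in the fruits column are not strings. That implies poor data. No matter, we can fill those NAs with False by passing na=False to str.contains, so such rows simply count as non-matches.

Thanks to jezrael for the improved implementation.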

df.loc[df.fruits.str.contains('Apple', na=False), 'fruits'] += ', Fruit'
df
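
A small sketch of the missing-value case, assuming a hypothetical fourth row whose fruits entry is None (this row is not in the original data). It shows that `na=False` turns the missing entry into a plain False in the mask, so that row is skipped and stays missing.

```python
import pandas as pd

# Assumed example data; the last row is a made-up missing entry.
df = pd.DataFrame({
    "id": ["01", "02", "03", "04"],
    "fruits": [
        "Apple, Apricot",
        "Apple, Banana, Clementine, Pear",
        "Orange, Pineapple, Pear",
        None,
    ],
})

# na=False fills the result for missing entries with False,
# so the mask is purely boolean and safe to use with .loc.
mask = df["fruits"].str.contains("Apple", na=False)

# Only the matching rows get the suffix; the missing row is untouched.
df.loc[mask, "fruits"] += ", Fruit"

print(df)
```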

Upvotes: 2
