Pandas Replace 1st Result in a DataFrame

Question

Let's say I have a dataframe that looks like this:

>>> import pandas as pd
>>> df4 = pd.DataFrame({'Q':['apple', 'apple', 'orange', 'Apple', 'orange'], 'R':['a.txt', 'a.txt', 'a.txt', 'b.txt', 'b.txt']})
>>> df4
        Q      R
0   apple  a.txt
1   apple  a.txt
2  orange  a.txt
3   Apple  b.txt
4  orange  b.txt

What I would like to output is this:

            Q      R
0   breakfast  a.txt
1       apple  a.txt
2      orange  a.txt
3   breakfast  b.txt
4      orange  b.txt

In other words, I want to search the rows of the dataframe case-insensitively, find the first occurrence of a certain word (in this case, apple), and replace it with another word.

Is there a way to do this?

cs95 · Accepted Answer

Here's a vectorised solution with groupby and idxmin:

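# True for rows where Q equals 'apple', ignoring case
v = df4.Q.str.lower().eq('apple')
# Give each consecutive run of matches its own id; non-matching rows become NaN
v2 = (~v).cumsum().where(v)
# Overwrite the first row of each run
df4.loc[v2.groupby(v2).idxmin().values, 'Q'] = 'breakfast'

df4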
           Q      R
0  breakfast  a.txt
1      apple  a.txt
2     orange  a.txt
3  breakfast  b.txt
4     orange  b.txt
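
To see why this works, it may help to look at the intermediate values. The following is a minimal sketch that re-runs the same steps on the question's df4; the names first_rows and result are only illustrative and are not part of the original answer.

```python
import pandas as pd

df4 = pd.DataFrame({
    'Q': ['apple', 'apple', 'orange', 'Apple', 'orange'],
    'R': ['a.txt', 'a.txt', 'a.txt', 'b.txt', 'b.txt'],
})

# Boolean mask: True where Q is 'apple' in any letter case.
v = df4['Q'].str.lower().eq('apple')

# (~v).cumsum() only increases on non-matching rows, so every row in one
# consecutive run of matches gets the same number. where(v) then blanks
# out the non-matching rows with NaN.
v2 = (~v).cumsum().where(v)

# Group the matching rows by run id (NaN keys are dropped by groupby) and
# take the index label of the first row in each run. All values inside a
# group are equal, and idxmin returns the first label on ties.
first_rows = v2.groupby(v2).idxmin().values

# Work on a copy so df4 itself stays unchanged (illustrative choice).
result = df4.copy()
result.loc[first_rows, 'Q'] = 'breakfast'

print(v2)
print(first_rows)
print(result)
```

Note that the run ids come from consecutive matches in Q, not from the file names in R. On this data the two coincide, but a file containing apple, orange, apple would have both apples replaced. If the first match per file is what's wanted, grouping on df4['R'] instead would be one way to go.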
