VRumay

Reputation: 123

Replacing only some rows in the same column

Given a column:

name 
Jules
Jules
Jules
Jules
Vince

I need to replace only the top half of the occurrences of Jules with Quentin.

Such as:

name 
Quentin
Quentin
Jules
Jules
Vince

How do I replace only some values in a given column?

To elaborate further, the Jules rows will not always be in the same positions.

I've thought about iterating like this, but it did not work:

countOfJules = df['name'].value_counts()['Jules']
halfLenght = int(countoftbd/2)
listed = df['name'].to_list()
counter = 1

for eachname in listed:
    if eachname == 'Jules' and counter <= halfLenght:
        listed[:] == 'Quentin'
        counter += 1

Upvotes: 0

Views: 29

Answers (2)

bisen2

Reputation: 516

Accessing a subset of an array (or list, dataframe, etc.) is usually called slicing. The pandas documentation has a nice section on slicing, as well as on other ways of accessing specific members of a dataframe. In your case, it looks like you are selecting rows by their position, in which case you can use df[start:stop], where start and stop are the positions you want to slice between (stop itself is excluded).
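As a rough sketch of how slicing could apply here: since the Jules rows can sit anywhere, one option is to slice the index labels of the matching rows rather than the frame itself. The sample frame below is an assumption modelled on the question (with the Jules rows deliberately scattered), and the approach assumes the index labels are unique.

```python
import pandas as pd

# Sample data assumed from the question; Jules rows are deliberately non-contiguous
df = pd.DataFrame({"name": ["Jules", "Vince", "Jules", "Jules", "Jules"]})

# Index labels of every row whose name is Jules, in order of appearance
jules_idx = df.index[df["name"].eq("Jules")]

# Slice off the first half of those labels and overwrite only those rows
first_half = jules_idx[: len(jules_idx) // 2]
df.loc[first_half, "name"] = "Quentin"

print(df["name"].tolist())
```

With an odd number of matches, the integer division rounds down, so the smaller half is replaced.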

Upvotes: 0

Quang Hoang

Reputation: 150755

It is rather straightforward:

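# boolean mask: True where name is Jules
is_jules = df['name'].eq('Jules')

# total number of `Jules` rows in `name`
num_jules = is_jules.sum()

# running count of `Jules` rows; True while it is at most half the total
first_half = is_jules.cumsum().le(num_jules // 2)

# `is_jules &` excludes other names that appear before the cutoff
df.loc[is_jules & first_half, 'name'] = 'Quentin'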

Output:

      name
0  Quentin
1  Quentin
2    Jules
3    Jules
4    Vince
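To check that this still works when the Jules rows are scattered, the same steps can be wrapped in a small function. This is only a sketch: `replace_first_half` is a hypothetical helper name, the sample data is made up for illustration, and the function returns a copy instead of modifying the frame in place.

```python
import pandas as pd


def replace_first_half(df, column, old, new):
    """Hypothetical helper: replace the first half of `old` values in `column`."""
    is_old = df[column].eq(old)
    half = is_old.sum() // 2  # rounds down for an odd number of matches
    # cumsum numbers the matches 1, 2, 3, ...; keep matches numbered <= half
    first_half = is_old & is_old.cumsum().le(half)
    out = df.copy()
    out.loc[first_half, column] = new
    return out


# Made-up data: five Jules rows interleaved with Vince
df = pd.DataFrame(
    {"name": ["Vince", "Jules", "Jules", "Vince", "Jules", "Jules", "Jules"]}
)
result = replace_first_half(df, "name", "Jules", "Quentin")
print(result["name"].tolist())
```

Here there are five Jules rows, so only the first two become Quentin and the Vince rows are left alone.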

Upvotes: 1
