Reputation: 123
Given a column:
name
Jules
Jules
Jules
Jules
Vince
I need to replace only the first half of the occurrences of Jules with Quentin, so that the column becomes:
name
Quentin
Quentin
Jules
Jules
Vince
How do I replace only some values in a given column?
To elaborate, the location of the Jules rows will never be the same.
I've thought about iterating like this, but it did not work:
countOfJules = df['name'].value_counts()['Jules']
halfLenght = int(countoftbd/2)
listed = df['name'].to_list()
counter = 1
for eachname in listed:
    if eachname == 'Jules' and counter <= halfLenght:
        listed[:] == 'Quentin'
        counter += 1
Upvotes: 0
Views: 29
Reputation: 516
Accessing a subset of an array (or list, dataframe, etc.) is usually called slicing. The pandas documentation has a good section on slicing, as well as on other ways of accessing specific members of a dataframe. In your case, it looks like you are selecting by position in the array, in which case you can use df[start:stop], where start and stop are the positions bounding the rows you want.
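Since the Jules rows can sit anywhere, one way the slicing idea might be applied is to slice the index labels of the matching rows rather than the dataframe itself. This is only a sketch: the sample data and the names jules_idx and first_half_idx are made up for illustration, and it assumes the index labels are unique.

```python
import pandas as pd

# Hypothetical sample data; the Jules rows are deliberately not contiguous.
df = pd.DataFrame({'name': ['Jules', 'Vince', 'Jules', 'Jules', 'Mia', 'Jules']})

# Index labels of every row whose name is Jules.
jules_idx = df.index[df['name'].eq('Jules')]

# Slice the labels, not the dataframe: keep the first half (rounded down).
first_half_idx = jules_idx[:len(jules_idx) // 2]

# Assumes unique index labels, so each label selects exactly one row.
df.loc[first_half_idx, 'name'] = 'Quentin'
print(df['name'].tolist())
```

With four Jules rows, only the first two in row order are renamed; the other names are left alone.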
Upvotes: 0
Reputation: 150755
It is rather straightforward:
# where name is Jules
is_jules = df['name'].eq('Jules')
# total `Jules` in `name`
num_jules = is_jules.sum()
# first half `Jules`
first_half = is_jules.cumsum().le(num_jules//2)
df.loc[is_jules & first_half, 'name'] = 'Quentin'
Output:
name
0 Quentin
1 Quentin
2 Jules
3 Jules
4 Vince
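To check that this does not depend on where the Jules rows are, here is a self-contained sketch that wraps the same cumsum idea in a helper. The function name replace_first_half and the scattered sample data are hypothetical, not part of pandas or of the question; with an odd count, the floor division rounds the half down.

```python
import pandas as pd

def replace_first_half(df, column, old, new):
    """Return a copy of df with the first half (rounded down) of the
    `old` values in `column` replaced by `new`. Hypothetical helper."""
    is_old = df[column].eq(old)
    # Running count of matches so far, compared with half the total count.
    in_first_half = is_old.cumsum().le(is_old.sum() // 2)
    out = df.copy()
    out.loc[is_old & in_first_half, column] = new
    return out

# Made-up data: five Jules rows scattered between other names.
scattered = pd.DataFrame(
    {'name': ['Vince', 'Jules', 'Vince', 'Jules', 'Jules', 'Vince', 'Jules', 'Jules']}
)
result = replace_first_half(scattered, 'name', 'Jules', 'Quentin')
print(result['name'].tolist())
```

Here five Jules rows give 5 // 2 = 2 replacements, so only the first two Jules rows in row order become Quentin, and the input dataframe is left unmodified because the helper works on a copy.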
Upvotes: 1