feature sky

Reputation: 171

Modify a column in first n rows based on other column value in a DataFrame

I want to modify a column in the first n rows of a DataFrame, based on the value of another column. I tried this:

df.loc[(df.A == i), 'B'][0:10] = 100

It did not work.
I have also tried sampling n rows like this:

(df.sample(10)).loc[(df.A == i), 'B'] = 100

But it raised ValueError: cannot reindex from a duplicate axis

Upvotes: 3

Views: 2479

Answers (2)

Scott Boston

Reputation: 153460

You can use head and loc like this:

import pandas as pd
import numpy as np

df = pd.DataFrame({'A':np.arange(100),'B':[1]*100})

df.loc[df[(df.A % 2 == 0)].head(10).index,'B'] = 100

print(df.head(25))

Output:

     A    B
0    0  100
1    1    1
2    2  100
3    3    1
4    4  100
5    5    1
6    6  100
7    7    1
8    8  100
9    9    1
10  10  100
11  11    1
12  12  100
13  13    1
14  14  100
15  15    1
16  16  100
17  17    1
18  18  100
19  19    1
20  20    1
21  21    1
22  22    1
23  23    1
24  24    1
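
The label-based selection above relies on the index labels being unique. If the index holds duplicate labels (the ValueError in the question hints at that, though this is only a guess), a positional variant may be safer. The following is a minimal sketch; the helper name set_first_n_matching and the sample data are hypothetical, not part of the original answer.

```python
import numpy as np
import pandas as pd

def set_first_n_matching(df, mask, column, value, n):
    """Set `column` to `value` in the first `n` rows where `mask` is True (in place).

    Hypothetical helper: works on row positions, not labels, so duplicate
    index labels do not cause extra rows to be modified.
    """
    # Integer positions of the matching rows, truncated to the first n.
    positions = np.flatnonzero(np.asarray(mask))[:n]
    # Assumes column names are unique, so get_loc returns a single integer.
    df.iloc[positions, df.columns.get_loc(column)] = value
    return df

# Made-up frame whose index contains duplicate labels.
df = pd.DataFrame({'A': [1, 2, 1, 1, 2, 1], 'B': [0] * 6},
                  index=[0, 0, 1, 1, 2, 2])
set_first_n_matching(df, df.A == 1, 'B', 100, n=2)
print(df.B.tolist())  # [100, 0, 100, 0, 0, 0]
```

With a unique index, this should give the same result as the head/loc version.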

Upvotes: 3

BENY

Reputation: 323266

I can only come up with this:

df.loc[(df.A==i)&(df.index.isin(df.iloc[:10,:].index)),'B']=100
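
Note that this expression looks only at the first 10 rows of the frame and changes the ones among them where A == i, which is not the same as changing the first 10 matching rows. A small sketch of that behavior, using made-up data and i = 0 as assumptions:

```python
import numpy as np
import pandas as pd

# Made-up data: A alternates 0, 1, 0, 1, ...
df = pd.DataFrame({'A': np.arange(100) % 2, 'B': [1] * 100})
i = 0

# Same expression as above, split into two steps for readability.
in_first_10 = df.index.isin(df.iloc[:10, :].index)
df.loc[(df.A == i) & in_first_10, 'B'] = 100

# Only the matches inside the first 10 rows were changed.
print((df.B == 100).sum())  # 5
```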

For the sample, this will work:

s=(df.sample(10))
s.loc[(df.A == i), 'B'] = 100

And based on a discussion on GitHub:

You should NEVER do this type of chained inplace setting. It is simply bad practice.

PS: (df.sample(10)).loc[(df.A == i), 'B'] = 100 # this is chained inplace setting
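
If the goal is to change the original df rather than the sampled copy, one way to avoid the chained form is to collect the target row labels first and then assign with a single .loc call on df. This is a sketch under assumptions: the data, i = 0, and the fixed random_state are made up for illustration, and the index is assumed to have unique labels.

```python
import numpy as np
import pandas as pd

# Made-up data with a unique default index.
df = pd.DataFrame({'A': np.arange(100) % 2, 'B': [1] * 100})
i = 0

# Sample 10 rows (random_state is fixed here only for reproducibility).
sampled = df.sample(10, random_state=0)

# Labels of the sampled rows where A == i.
target = sampled[sampled.A == i].index

# One .loc call on df itself, so the assignment is not chained.
df.loc[target, 'B'] = 100
```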

Upvotes: 2
