Reputation: 171
I want to modify a column in the first n rows of a DataFrame, based on the value of another column. I tried this:
df.loc[(df.A == i), 'B'][0:10] = 100
It did not work.
I also tried sampling n rows like this:
(df.sample(10)).loc[(df.A == i), 'B'] = 100
But it raised ValueError: cannot reindex from a duplicate axis
Upvotes: 3
Views: 2479
Reputation: 153460
You can use head and loc like this:
import pandas as pd
import numpy as np
df = pd.DataFrame({'A':np.arange(100),'B':[1]*100})
df.loc[df[(df.A % 2 == 0)].head(10).index,'B'] = 100
print(df.head(25))
Output:
A B
0 0 100
1 1 1
2 2 100
3 3 1
4 4 100
5 5 1
6 6 100
7 7 1
8 8 100
9 9 1
10 10 100
11 11 1
12 12 100
13 13 1
14 14 100
15 15 1
16 16 100
17 17 1
18 18 100
19 19 1
20 20 1
21 21 1
22 22 1
23 23 1
24 24 1
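The same pattern can be applied to the condition from the question. This is only a minimal sketch: the data is made up, and i and n are placeholders for the value and row count in the question. It also assumes the index labels are unique, since loc with labels would otherwise touch every row sharing a label.

```python
import pandas as pd

# Hypothetical data: A holds the value to match, B the value to overwrite.
df = pd.DataFrame({"A": [1, 2, 1, 1, 2, 1], "B": [0] * 6})
i, n = 1, 2  # placeholders for the question's i and row count

# Index labels of the first n rows where A == i.
idx = df[df["A"] == i].head(n).index

# A single .loc call assigns on df itself rather than on a temporary copy.
df.loc[idx, "B"] = 100
print(df)
```

Selecting the labels first and assigning in one loc call is what avoids the chained assignment in df.loc[...][0:10] = 100, where the slice is applied to an intermediate object.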
Upvotes: 3
Reputation: 323266
The only way I can come up with is this:
df.loc[(df.A==i)&(df.index.isin(df.iloc[:10,:].index)),'B']=100
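For the sample, this will work (note that it modifies s, which is a copy, not df; the mask is built from s so it does not have to be reindexed against df):
s=(df.sample(10))
s.loc[(s.A == i), 'B'] = 100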
Based on a discussion on GitHub:
You should NEVER do this type of chained inplace setting. It is simply bad practice.
PS: (df.sample(10)).loc[(df.A == i), 'B'] = 100
# this is chained inplace setting
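For the sampling variant, one way to avoid the chained setting is to sample only the index labels and then assign through a single loc on df. This is a sketch under assumptions: the data is made up, i and n are placeholders, the index is assumed unique, and random_state is there only to make the run repeatable. Note that it samples among the matching rows, which may differ from sampling the whole frame first.

```python
import pandas as pd

# Hypothetical data: A holds the value to match, B the value to overwrite.
df = pd.DataFrame({"A": [1, 2, 1, 1, 2, 1], "B": [0] * 6})
i, n = 1, 2  # placeholders for the question's i and sample size

# Draw n random rows among those where A == i and keep only their labels.
idx = df[df["A"] == i].sample(n, random_state=0).index

# Assign on df directly, using the sampled labels.
df.loc[idx, "B"] = 100
print(df)
```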
Upvotes: 2