Reputation: 2155
For every row whose index is greater than 10, I'd like to set all values to the average of the last 3 rows before index 10 (rows 9, 8, 7).
So far I have this:
df.loc[df.index>10,columns_list]=df.loc[df.index<10 & df.index>=7,columns_list].values.mean
Upvotes: 2
Views: 9706
Reputation: 4315
Example:
import pandas as pd
import numpy as np
df = pd.DataFrame({'A':[1,2,3,4,9,7,6,5,12,45,90,2323],'B':[9,8,7,6,5,4,3,2,1,0,12,11]})
print(df)
# mask for rows whose index is less than 10
m = df.index < 10
# tail(3) returns the last 3 of those rows; axis=0 averages each column
mean = np.mean(df.loc[m, ['A','B']].tail(3), axis=0)
print(mean)
Output:
Dataframe:
A B
0 1 9
1 2 8
2 3 7
3 4 6
4 9 5
5 7 4
6 6 3
7 5 2
8 12 1
9 45 0
10 90 12
11 2323 11
Mean value:
A 20.666667
B 1.000000
dtype: float64
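The snippet above stops at computing the mean. A minimal sketch of the remaining step, writing that mean back into the rows with index above 10, might look like the following. It reuses the same example frame; the name `cols` and the float cast are assumptions added for illustration, not part of the original answer.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3, 4, 9, 7, 6, 5, 12, 45, 90, 2323],
                   'B': [9, 8, 7, 6, 5, 4, 3, 2, 1, 0, 12, 11]})
cols = ['A', 'B']  # hypothetical name for the columns to overwrite

# Per-column mean of the last 3 rows with index below 10 (rows 7, 8, 9)
mean = np.mean(df.loc[df.index < 10, cols].tail(3), axis=0)

# Cast to float first so a non-integer mean fits into the integer columns
df[cols] = df[cols].astype(float)

# Broadcast the per-column means into every row with index above 10
df.loc[df.index > 10, cols] = mean.to_numpy()

print(df.tail(3))
```

In this frame only row 11 has an index above 10, so row 10 keeps its original values.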
Upvotes: 0
Reputation: 862781
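You are close. Each condition needs its own parentheses, because & binds more tightly than the comparison operators, and the NumPy mean needs axis=0 so that it returns one mean per column instead of a single scalar: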
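import numpy as np
import pandas as pd

np.random.seed(123)
df = pd.DataFrame(np.random.randint(10, size=(15, 3)), columns=list('abc'))
cols = ['a','b']
# average rows 7, 8, 9 column-wise and assign the result to rows 11 and above
df.loc[df.index>10, cols] = df.loc[(df.index<10) & (df.index>=7), cols].values.mean(axis=0)
print(df)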
a b c
0 2.0 2.000000 6
1 1.0 3.000000 9
2 6.0 1.000000 0
3 1.0 9.000000 0
4 0.0 9.000000 3
5 4.0 0.000000 0
6 4.0 1.000000 7
7 3.0 2.000000 4
8 7.0 2.000000 4
9 8.0 0.000000 7
10 9.0 3.000000 4
11 6.0 1.333333 5
12 6.0 1.333333 1
13 6.0 1.333333 5
14 6.0 1.333333 6
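To illustrate why the parentheses matter, here is a small sketch using a plain NumPy array as a stand-in for the index (the names `idx`, `raised`, `mask` and `selected` are made up for this example). Without parentheses the expression is parsed as a chained comparison around `10 & idx`, which needs a single truth value and fails for an array.

```python
import numpy as np

idx = np.arange(15)  # stand-in for the index of a 15-row frame

# Without parentheses, & binds tighter than < and >=, so this is parsed as
# idx < (10 & idx) >= 7, a chained comparison that calls bool() on an array.
try:
    idx < 10 & idx >= 7
    raised = False
except ValueError:
    raised = True

# With parentheses, each comparison produces a boolean array first,
# and & then combines the two arrays element by element.
mask = (idx < 10) & (idx >= 7)
selected = idx[mask]

print(raised)
print(selected)
```

The parenthesised mask selects positions 7, 8 and 9, which are the three rows the question wants to average.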
Upvotes: 2