Reputation: 113
I have this data with 4 columns and 8 rows:
import pandas as pd
df = pd.DataFrame([[1, 2, 3,7], [2, 8, 6,8],[3, 2, 3,7], [4, 4, 6,8],[5, 2, 3,7], [6, 1, 6,8],[7, 8, 3,7], [8, 9, 6,8]], columns=['time','A', 'B', 'C'])
time A B C
0 1 2 3 7
1 2 8 6 8
2 3 2 3 7
3 4 4 6 8
4 5 2 3 7
5 6 1 6 8
6 7 8 3 7
7 8 9 6 8
I want to take the mean and STD of columns A and C across columns, not down rows, so that I get one value per row. E.g. the mean and STD of 2 and 7 (row 0) are "4.5" (mean) and "3.535533906" (STD) respectively.
I want my result to look like this...
Mean STD
0 4.5 3.535533906
1 8 0
2 . .
3 . .
. . .
. . .
However, when I try to do
df= df.loc[(df.time>=2) & (df.time<=7),['A','C']],(['mean','std'])
I get the following error...
AttributeError: 'DataFrame' object has no attribute 'time'
I also tried the following, but in vain:
df= df.loc[(df.time>=2) & (df.time<=7),['A','C']].agg(['mean','std'])
but it gives me the mean and STD of each column over all the selected rows, not one mean and STD per row:
A C
mean 4.166667 7.500000
std 3.125167 0.547723
How do I fix it?
Upvotes: 1
Views: 130
Reputation: 323246
You can transpose and use describe:
df[['A','C']].T.describe().T[['mean','std']]
Out[865]:
mean std
0 4.5 3.535534
1 8.0 0.000000
2 4.5 3.535534
3 6.0 2.828427
4 4.5 3.535534
5 4.5 4.949747
6 7.5 0.707107
7 8.5 0.707107
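The same per-row numbers can probably also be obtained without transposing, by passing axis=1 to mean and std. A minimal sketch, assuming the df from the question; the names sub and row_stats are only illustrative, and std here is pandas' default sample standard deviation (ddof=1), which is what gives 3.535534 for 2 and 7.

```python
import pandas as pd

df = pd.DataFrame(
    [[1, 2, 3, 7], [2, 8, 6, 8], [3, 2, 3, 7], [4, 4, 6, 8],
     [5, 2, 3, 7], [6, 1, 6, 8], [7, 8, 3, 7], [8, 9, 6, 8]],
    columns=['time', 'A', 'B', 'C'])

# Keep only the two columns of interest (illustrative name).
sub = df[['A', 'C']]

# axis=1 computes each statistic across A and C within a row,
# so the result has one mean and one std per original row.
row_stats = pd.DataFrame({'mean': sub.mean(axis=1),
                          'std': sub.std(axis=1)})
print(row_stats)
```

This avoids the double transpose, at the cost of naming each statistic explicitly.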
Upvotes: 1
Reputation: 210842
Another way:
In [346]: df[['A','C']].T.agg(['mean','std']).T
Out[346]:
mean std
0 4.5 3.535534
1 8.0 0.000000
2 4.5 3.535534
3 6.0 2.828427
4 4.5 3.535534
5 4.5 4.949747
6 7.5 0.707107
7 8.5 0.707107
or as new columns in the original DF:
In [347]: df[['Mean','STD']] = df[['A','C']].T.agg(['mean','std']).T
In [348]: df
Out[348]:
time A B C Mean STD
0 1 2 3 7 4.5 3.535534
1 2 8 6 8 8.0 0.000000
2 3 2 3 7 4.5 3.535534
3 4 4 6 8 6.0 2.828427
4 5 2 3 7 4.5 3.535534
5 6 1 6 8 4.5 4.949747
6 7 8 3 7 7.5 0.707107
7 8 9 6 8 8.5 0.707107
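If only the rows with time between 2 and 7 are wanted, as in the question's attempt, the filter can be applied first and the same transpose-and-agg pattern afterwards. A sketch under that assumption, starting from the original df; subset and result are illustrative names.

```python
import pandas as pd

df = pd.DataFrame(
    [[1, 2, 3, 7], [2, 8, 6, 8], [3, 2, 3, 7], [4, 4, 6, 8],
     [5, 2, 3, 7], [6, 1, 6, 8], [7, 8, 3, 7], [8, 9, 6, 8]],
    columns=['time', 'A', 'B', 'C'])

# Select rows with 2 <= time <= 7 and keep only columns A and C.
subset = df.loc[(df['time'] >= 2) & (df['time'] <= 7), ['A', 'C']]

# Transpose so each original row becomes a column, aggregate,
# then transpose back: one mean and one std per selected row.
result = subset.T.agg(['mean', 'std']).T
print(result)
```

The result keeps the original row labels (1 through 6 here), so it can be joined back to df if needed.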
Upvotes: 4