Reputation: 35
I have a DataFrame of different time series of different experiments. each participant made several Trials of an experiment.
the data contains columns of some features. and I want to perform some analysis on some feature columns and update the data frame accordingly.
For example, I want to add column as mean of feature 1 of each trial and a column as shifted feature 1 of each trial.
this is what I have:
participant ID | Trial | Feature 1 |
---|---|---|
1 | 1 | 2 |
1 | 1 | 4 |
1 | 2 | 1 |
1 | 2 | 3 |
2 | 1 | 4 |
2 | 1 | 6 |
2 | 2 | 2 |
2 | 2 | 6 |
This is what I want to get:
participant ID | Trial | Feature 1 | mean of feature 1 | feature 1 shifted |
---|---|---|---|---|
1 | 1 | 2 | 3 | na |
1 | 1 | 4 | 3 | 2 |
1 | 2 | 1 | 2 | na |
1 | 2 | 3 | 2 | 1 |
2 | 1 | 4 | 5 | na |
2 | 1 | 6 | 5 | 4 |
2 | 2 | 2 | 4 | na |
2 | 2 | 6 | 4 | 2 |
Upvotes: 0
Views: 27
Reputation: 24314
Try via groupby()
,transform()
and shift()
:
df['mean of feature 1']=df.groupby(['participant ID','Trial'])['Feature 1'].transform('mean')
df['feature 1 shifted']=df.groupby(['mean of feature 1'])['Feature 1'].shift()
output of df
:
participant ID Trial Feature 1 mean of feature 1 feature 1 shifted
0 1 1 2 3 NaN
1 1 1 4 3 2.0
2 1 2 1 2 NaN
3 1 2 3 2 1.0
4 2 1 4 5 NaN
5 2 1 6 5 4.0
6 2 2 2 4 NaN
7 2 2 6 4 2.0
Upvotes: 1