Reputation: 2053
I have the following data frame in R which gives me the customers 1,2, and 3's transactional record. Each row shows the period type that the transaction has been made and the amount of money he spent.
id<-c(1,2,3,1,1,2,3,2,2)
period<-c("calib","valid","valid","calib","valid","valid","calib","calib","valid")
spent<-c(10,3,8,12,5,5,4,3,5)
df<-data.frame(id,period,spent)
now I need to create a new data frame which gives me the average 'spent' of each 'id' per transaction in different 'period's. The resulted table which I got in in excel pivot table should be as follow:
id calib valid
1 11 5
2 3 4.33
3 4 8
I know there should be a way to make this work in R but since I am new in R I'm not aware of it. I wonder if anyone can help me with this.
Upvotes: 2
Views: 4231
Reputation: 23004
And using the dplyr and tidyr packages:
df %>%
group_by(id,period) %>%
summarise (avg = mean(spent)) %>%
spread(period,avg) # reshapes dataframe from long to wide
Upvotes: 2
Reputation: 70623
Here's a base solution for completeness.
I edited your code, you forgot to include spent
into your end data.frame.
df <- data.frame(id, period, spent)
with(df, tapply(spent, INDEX = list(id, period), FUN = mean))
calib valid
1 11 5.000000
2 3 4.333333
3 4 8.000000
Upvotes: 3
Reputation: 173517
You can do this using dcast
from the reshape2 package (among numerous other ways, I'm sure):
library(reshape2)
dcast(df,id~period,fun.aggregate = mean)
id calib valid
1 1 11 5.000000
2 2 3 4.333333
3 3 4 8.000000
(Note that I'm assuming you intended to include the spent
vector in your data frame.)
Upvotes: 3