Reputation: 1999
I have created the following data frame:
Group <- c('A','A','A','B','B','B','B','C','C','C','A','A','A','B','B','B','B','C','C','C')
YearWeek <- c('201401','201401','201401','201401','201401','201401','201401','201401','201401','201401','201402','201402','201402','201402','201402','201402','201402','201402','201402','201402')
Score1 <- c(404,440,395,500,450,476,350,500,600,575,460,445,400,508,470,422,368,555,700,634)
employee <- c(1:20)
employ.data <- data.frame(employee, Group, YearWeek, Score1)
I want to calculate the mean of Score1 for group 'A' (my control group) within each level of 'YearWeek', subtract that mean from Score1 for every employee in the same YearWeek (including the control group employees), and add the result to the data frame as a new variable, 'Difference'.
I first tried to obtain the mean for group 'A' (the control group employees), but received the following error:
CTRLScore <- as.data.frame(employ.data[, j=list(mean(Score1),by = list(YearWeek,Group,"A"))])
Error in .subset(x, j) : invalid subscript type 'list'
In addition: Warning message:
In `[.data.frame`(employ.data, , j = list(mean(Score1), by = list(YearWeek, :
named arguments other than 'drop' are discouraged
Upvotes: 1
Views: 662
Reputation: 83255
A dplyr variation on @MrFlick's answer:
# calculating the means
ctrlmeans <- with(subset(employ.data, Group=="A"), tapply(Score1, YearWeek, mean))
# adding the difference to the data.frame
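library(dplyr)
# %>% replaces the old %.% operator, which was removed from dplyr
employ.data <- employ.data %>%
  # look up by name (as.character) so a factor YearWeek is not used as an integer index
  mutate(Difference = Score1 - as.numeric(ctrlmeans[as.character(YearWeek)]))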
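The same result can probably be had in a single dplyr pipeline, without the separate tapply step. The following is only a sketch: it rebuilds the question's data so that it runs on its own, and the name `result` is made up for illustration.

```r
library(dplyr)

# Rebuild the data from the question so the example is self-contained
employ.data <- data.frame(
  employee = 1:20,
  Group    = rep(rep(c("A", "B", "C"), times = c(3, 4, 3)), 2),
  YearWeek = rep(c("201401", "201402"), each = 10),
  Score1   = c(404, 440, 395, 500, 450, 476, 350, 500, 600, 575,
               460, 445, 400, 508, 470, 422, 368, 555, 700, 634)
)

# Within each YearWeek, subtract the mean Score1 of the control group (A)
result <- employ.data %>%
  group_by(YearWeek) %>%
  mutate(Difference = Score1 - mean(Score1[Group == "A"])) %>%
  ungroup()
```

Because the control mean is computed inside each YearWeek group, no lookup vector is needed and the factor-versus-character question does not arise.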
Upvotes: 0
Reputation: 230
This seems to work for me:
library(reshape)
melted <- melt(employ.data)
# cast the melted data (the original referred to an undefined object x)
casted <- cast(melted, formula = Group + YearWeek ~ variable, subset = variable == "Score1", fun.aggregate = mean)
#Print Out
casted
# Holder variable
addColumn <- NULL
for (i in 1:nrow(employ.data))
{
  score <- employ.data[i,]$Score1
  yearWeek <- employ.data[i,]$YearWeek
  # Use the control group (A) means, as the question asks, rather than the row's own group
  sub <- casted[casted$Group %in% "A",]
  meanScore <- sub[sub$YearWeek %in% yearWeek,]$Score1
  addColumn <- c(addColumn, score - meanScore)
}
# Combine
cbind(employ.data,addColumn)
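If the row-by-row loop becomes slow on larger data, a vectorised base R alternative is to aggregate the control means and merge them back in. This is a minimal sketch, not a drop-in replacement for the code above: the names `ctrl_means`, `CtrlMean` and `merged` are invented for illustration, and the data is rebuilt from the question so the block runs on its own.

```r
# Rebuild the data from the question so the example is self-contained
employ.data <- data.frame(
  employee = 1:20,
  Group    = rep(rep(c("A", "B", "C"), times = c(3, 4, 3)), 2),
  YearWeek = rep(c("201401", "201402"), each = 10),
  Score1   = c(404, 440, 395, 500, 450, 476, 350, 500, 600, 575,
               460, 445, 400, 508, 470, 422, 368, 555, 700, 634)
)

# Mean Score1 of the control group (A) for each YearWeek
ctrl_means <- aggregate(Score1 ~ YearWeek,
                        data = employ.data[employ.data$Group == "A", ],
                        FUN = mean)
names(ctrl_means)[names(ctrl_means) == "Score1"] <- "CtrlMean"

# Attach the control mean to every row of the same YearWeek, then subtract
merged <- merge(employ.data, ctrl_means, by = "YearWeek")
merged$Difference <- merged$Score1 - merged$CtrlMean

# merge() may reorder rows, so restore the original employee order
merged <- merged[order(merged$employee), ]
```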
Upvotes: 0
Reputation: 206411
Here's a strategy that I believe will work.
First, calculate the mean for group A for each YearWeek:
ctrlmeans <- with(subset(employ.data, Group=="A"), tapply(Score1, YearWeek, mean))
That returns a named vector. We can then use the YearWeek column of the data.frame as a lookup into that vector to subtract off the mean. Converting the column to character makes the lookup go by name even if YearWeek is stored as a factor:
Difference <- employ.data$Score1 - ctrlmeans[as.character(employ.data$YearWeek)]
and then add that back to the data.frame:
employ.data$Difference <- Difference
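As a side note on the error in the question: the `j = ..., by = ...` form is data.table syntax, so it fails on a plain data.frame. If data.table is an option, something along these lines should give the same Difference column. It is only a sketch: `dt` is a made-up name, and the data is rebuilt from the question so the block runs on its own.

```r
library(data.table)

# Rebuild the data from the question so the example is self-contained
employ.data <- data.frame(
  employee = 1:20,
  Group    = rep(rep(c("A", "B", "C"), times = c(3, 4, 3)), 2),
  YearWeek = rep(c("201401", "201402"), each = 10),
  Score1   = c(404, 440, 395, 500, 450, 476, 350, 500, 600, 575,
               460, 445, 400, 508, 470, 422, 368, 555, 700, 634)
)

# Convert to a data.table so that j and by are understood
dt <- as.data.table(employ.data)

# For each YearWeek, subtract the mean Score1 of the control group (A);
# := adds the column by reference and keeps the original row order
dt[, Difference := Score1 - mean(Score1[Group == "A"]), by = YearWeek]
```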
Upvotes: 2