maximusyoda
maximusyoda

Reputation: 615

How can I find mean values in a column based on a different column?

I have this dataframe, and I am trying to find the mean of the index values for same ID and assign NA's for the rest (Or just keep the value like in the intended output).

ID    hot   index
41400 10      2
41400 12      2
41400 75      4
41401 89      5 
41401 25      3 
41401 100     6 
20445 67      4
20445 89      6
20445 4       1
20443 67      5
20443 120.2   7
20443 140.5   8
20423 170.5   10
20423 78.1    5

The intended output is

ID    hot     index mean
41400 10      2      2.67
41400 12      2      2.67
41400 75      4      2.67
41401 89      5      4.67 
41401 25      3      4.67
41401 100     6      4.67
20445 67      4      3.67
20445 89      6      3.67
20445 4       1      3.67
20443 67      5      6.67
20443 120.2   7      6.67
20443 140.5   8      6.67
20423 170.5   10     12.5
20423 78.1    5      12.5

I am not sure how to go about it. Could you please help?

Thanks

Upvotes: 1

Views: 109

Answers (2)

akrun
akrun

Reputation: 887118

Using data.table

 library(data.table)
 setDT(dd)[,meaned:=mean(index), by=ID]

You mentioned in the post assign NA's for the rest, if it is to assign NA's to all the rows except the first for each IDs

 setDT(dd)[,c("meaned", "N") :=  {list(mean(index) , 1:.N)}, by=ID][N!=1, meaned:=NA][,N:=NULL]

    dd
 #       ID   hot index   meaned
 #1: 41400  10.0     2 2.666667
 #2: 41400  12.0     2       NA
 #3: 41400  75.0     4       NA
 #4: 41401  89.0     5 4.666667
 #5: 41401  25.0     3       NA
 #6: 41401 100.0     6       NA
 #7: 20445  67.0     4 3.666667
 #8: 20445  89.0     6       NA
 #9: 20445   4.0     1       NA
#10: 20443  67.0     5 6.666667
#11: 20443 120.2     7       NA
#12: 20443 140.5     8       NA
#13: 20423 170.5    10 7.500000
#14: 20423  78.1     5       NA

Upvotes: 1

MrFlick
MrFlick

Reputation: 206232

Assuming your data.frame is named dd, you can use the ave function which performs functions at a group level. For example

meaned<-with(dd, ave(index, ID, FUN=mean))
cbind(dd, meaned)

Upvotes: 1

Related Questions