Reputation: 1623
how to get the average of two columns of a data table using dplyr? For example, if my data if like below:
dt <- data.table(A=1:5, B=c(1,4,NA,6,8))
I want to create a new column "Avg" which is the mean of column A and B for each row:
dt %>% mutate(Avg=mean(c(A, B), na.rm=T))
But this code does not give me the correct result. How to do this? Thank you very much.
Upvotes: 12
Views: 26585
Reputation: 887951
As the initial dataset is data.table
, we could use data.table
methods
dt[, Avg:= mean(unlist(.SD), na.rm=TRUE) , .1:nrow(dt)]
dt
# A B Avg
#1: 1 1 1.0
#2: 2 4 3.0
#3: 3 NA 3.0
#4: 4 6 5.0
#5: 5 8 6.5
Upvotes: 1
Reputation: 7469
If you want to use dplyr to achieve this, I would suggest using the function rowwise()
:
R> library(dplyr)
R> dt <- data.table(A=1:5, B=c(1,4,NA,6,8))
R> j <- dt %>% rowwise() %>% mutate(Avg=mean(c(A, B), na.rm=T))
R> j
Source: local data frame [5 x 3]
Groups: <by row>
A B Avg
(int) (dbl) (dbl)
1 1 1 1.0
2 2 4 3.0
3 3 NA 3.0
4 4 6 5.0
5 5 8 6.5
Upvotes: 28
Reputation: 206596
How about
dt %>% mutate(Avg=rowMeans(cbind(A, B), na.rm=T))
mean
is not vectorized. It collapse all inputs to a single value. If you make a matrix with cbind()
, you can use rowMeans
to do the trick.
Upvotes: 21