Reputation: 831
I am getting strange errors with row-wise mutate in dplyr. Here is an example:
set.seed(1)
df <- data.frame(a = rnorm(5), b = rnorm(5))
df[2,'b'] <- NA
There is no trouble with sum, but other summary functions are problematic:
mutate(rowwise(df), sum(a, b, na.rm = T)) # works
mutate(rowwise(df), mean(a, b, na.rm = T))
#! Error: missing value where TRUE/FALSE needed
mutate(rowwise(df), median(a, b, na.rm = T))
#! Error: unused argument (-0.820468384118015)
Now, we can try putting the NA in the first column instead:
df <- data.frame(a = rnorm(5), b = rnorm(5))
df[2,'a'] <- NA
mutate(rowwise(df), sum(a, b, na.rm = T)) # works
mutate(rowwise(df), mean(a, b, na.rm = T))
#! no error, but returns `NaN`
mutate(rowwise(df), median(a, b, na.rm = T))
#! Error: unused argument (-0.820468384118015)
I am not sure if I am doing something wrong here. I think the expected behavior should be the same as:
as.data.frame(apply(df, 1, mean, na.rm = T))
Thanks!
Upvotes: 1
Views: 2182
Reputation: 56915
Your error is that you are calling mean and median incorrectly.
While sum can take any number of arguments and will just add them all, mean and median take only ONE x argument to take the mean/median of.
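To make the difference concrete, here is a small base R sketch (no dplyr needed). The values of a and b are made up for illustration. It shows that the second formal argument of mean.default is trim, so a second positional value is not treated as more data, which is likely where the odd errors in the question come from.

```r
# Made-up scalar values standing in for one row of the data frame
a <- 1.5
b <- NA

# sum() collects any number of values through `...` and adds them all
total <- sum(a, b, na.rm = TRUE)

# mean() has a single data argument `x`; the next formal argument is `trim`,
# so in mean(a, b) the value of b would be matched to `trim`, not averaged
mean_args <- names(formals(mean.default))
print(mean_args)

# Combining the values with c() passes them all as the single `x` argument
avg <- mean(c(a, b), na.rm = TRUE)
med <- median(c(a, b), na.rm = TRUE)
```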
Just like if a and b were vectors and you wanted the mean of the combined vector you'd use mean(c(a, b)) rather than mean(a, b), you do the same here:
mutate(rowwise(df), mean=mean(c(a, b), na.rm = T), med=median(c(a, b), na.rm=T))
(side note: you are only calculating the mean and median of 2 values at a time here, so the mean equals the median anyway...)
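As a sanity check, the sketch below (base R only, reusing the data from the question) compares the per-row mean(c(a, b)) with the apply() call the question expected to match. The mapply() call is only a stand-in for the row-wise mutate, and rowMeans() is mentioned as one possible vectorized alternative, not as something the original posts used.

```r
set.seed(1)
df <- data.frame(a = rnorm(5), b = rnorm(5))
df[2, "b"] <- NA

# Per-row mean, combining each row's two values with c() as suggested above
row_mean <- mapply(function(a, b) mean(c(a, b), na.rm = TRUE), df$a, df$b)

# Reference results: apply() over rows, and the vectorized rowMeans()
ref_apply <- unname(apply(df, 1, mean, na.rm = TRUE))
ref_rowmeans <- unname(rowMeans(df, na.rm = TRUE))

same_as_apply <- isTRUE(all.equal(row_mean, ref_apply))
same_as_rowmeans <- isTRUE(all.equal(row_mean, ref_rowmeans))
```

If both flags come out TRUE, the c() form is doing what the question wanted, and for plain numeric columns rowMeans(df, na.rm = TRUE) may be a simpler option than going row by row.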
Upvotes: 5