Reputation: 2085
Hello everybody,
library(dplyr)
library(tibble)
mtcars %>%
  rownames_to_column("modelle") %>%
  mutate_if(~is.numeric(.x) & mean(.x) > 50, ~(.x / 1000))
Warning message:
In mean.default(.x) : argument is not numeric or logical: returning NA
This warning seems to be caused by the character column. The code works, but it's still very ugly. Did I do something wrong, and what could be done better in this case?
Thank you!
Upvotes: 2
Views: 171
Reputation: 160447
R does not short-circuit the vectorized &, so this runs both is.numeric and mean on every column. Since your first column (modelle) is character, mean warns on it and returns NA.
You don't actually need a vectorized operator here, however. If you change from the vectorized & to the scalar &&, R short-circuits the expression and you get the behavior you want.
mtcars %>%
  rownames_to_column("modelle") %>%
  mutate_if(~is.numeric(.x) && mean(.x) > 50, ~(.x / 1000)) %>%
  head()
#             modelle  mpg cyl  disp    hp drat    wt  qsec vs am gear carb
# 1         Mazda RX4 21.0   6 0.160 0.110 3.90 2.620 16.46  0  1    4    4
# 2     Mazda RX4 Wag 21.0   6 0.160 0.110 3.90 2.875 17.02  0  1    4    4
# 3        Datsun 710 22.8   4 0.108 0.093 3.85 2.320 18.61  1  1    4    1
# 4    Hornet 4 Drive 21.4   6 0.258 0.110 3.08 3.215 19.44  1  0    3    1
# 5 Hornet Sportabout 18.7   8 0.360 0.175 3.15 3.440 17.02  0  0    3    2
# 6           Valiant 18.1   6 0.225 0.105 2.76 3.460 20.22  1  0    3    1
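The difference can also be seen outside dplyr. The following is a minimal base R sketch, not part of the original code: count_warnings is a hypothetical helper introduced only for illustration, and x is a made-up stand-in for the character column. It also shows why the original code still produced the right result: FALSE & NA evaluates to FALSE.

```r
# Hypothetical helper (for illustration only): evaluates an expression,
# muffles any warnings, and reports how many were raised.
count_warnings <- function(expr) {
  n <- 0L
  value <- withCallingHandlers(
    expr,
    warning = function(w) {
      n <<- n + 1L
      invokeRestart("muffleWarning")
    }
  )
  list(value = value, warnings = n)
}

x <- c("Mazda RX4", "Datsun 710")  # stand-in for the character column

# Vectorized &: both sides are evaluated, so mean() runs on the character
# vector, warns, and returns NA. FALSE & NA is still FALSE.
vec <- count_warnings(is.numeric(x) & mean(x) > 50)

# Scalar &&: is.numeric(x) is FALSE, so mean(x) is never evaluated.
scalar <- count_warnings(is.numeric(x) && mean(x) > 50)

vec$value        # FALSE
vec$warnings     # 1
scalar$value     # FALSE
scalar$warnings  # 0
```

Both forms select the same columns; only the scalar form avoids calling mean on non-numeric data.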
Further demonstration that & is not short-circuiting:
mymean <- function(x, ...) {
  if (is.character(x)) {
    message("character?")
    return(Inf) # this is certainly not the right thing to do in general ...
  } else mean(x, ...)
}
mtcars %>%
  rownames_to_column("modelle") %>%
  mutate_if(~is.numeric(.x) & mymean(.x) > 50, ~(.x / 1000)) %>%
  head()
# character?
#             modelle  mpg cyl  disp    hp drat    wt  qsec vs am gear carb
# 1         Mazda RX4 21.0   6 0.160 0.110 3.90 2.620 16.46  0  1    4    4
# 2     Mazda RX4 Wag 21.0   6 0.160 0.110 3.90 2.875 17.02  0  1    4    4
# 3        Datsun 710 22.8   4 0.108 0.093 3.85 2.320 18.61  1  1    4    1
# 4    Hornet 4 Drive 21.4   6 0.258 0.110 3.08 3.215 19.44  1  0    3    1
# 5 Hornet Sportabout 18.7   8 0.360 0.175 3.15 3.440 17.02  0  0    3    2
# 6           Valiant 18.1   6 0.225 0.105 2.76 3.460 20.22  1  0    3    1
If short-circuiting were taking place, mymean would never be called on the character column and the message would never be printed. (I don't think this mymean is a viable replacement here, for a couple of reasons: (1) The use of Inf was solely to ensure that the condition outside the call to mean worked, but if an error or warning occurs and a numeric is expected, one should typically return NA or NaN, not a number, even if you might not consider Inf a really usable number. (2) It addresses a symptom, not the problem. The problem is the absence of short-circuiting in vectorized logical expressions.)
Upvotes: 4
Reputation: 185
You should use "&&" instead of "&". The former is for single values (scalars) and short-circuits; the latter works element-wise on vectors. In your case, both is.numeric(.x) and mean(.x) > 50 are single values.
library(dplyr)
library(tibble)
mtcars %>%
  rownames_to_column("modelle") %>%
  mutate_if(~is.numeric(.x) && mean(.x) > 50, ~(.x / 1000))
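As a side note, mutate_if is superseded in dplyr 1.0.0 and later in favor of across. The same short-circuiting predicate can be passed to where. This is a sketch assuming dplyr >= 1.0.0 is installed; the name result is introduced here only for illustration.

```r
library(dplyr)
library(tibble)

# Same predicate as above, written with across() and where().
# && ensures mean() is only called on numeric columns.
result <- mtcars %>%
  rownames_to_column("modelle") %>%
  mutate(across(where(~ is.numeric(.x) && mean(.x) > 50), ~ .x / 1000))

head(result)
```

Only disp and hp have a mean above 50, so those are the only columns that get rescaled.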
Upvotes: 1