Reputation: 75
Let's say I have the following data:
dput(mydata)
structure(list(a = c("20", "30", "25", ".", ".", ".", ".", ".",
".", "25", "0", "1"), b = c(1, 1, 1, 0, 0, 0, 0, 0, 0, 1, 1,
1), c = c(1, 2, 3, 5, 2, 1, 2, 3, 1, 3, 1, 3), d = c(5, 2, 3,
1, 3, 13, 1, 3, 1, 31, 2, 1)), row.names = c(NA, 12L), class = "data.frame")
I then want to convert all character columns to numeric:
mydata_convert <- mydata %>% mutate_if(is.character, as.numeric)
The problem is that all the "." values have been replaced by NAs:
Warning message: NAs introduced by coercion
Could you please advise on how to retain the original values (so that they are not confused with already existing NAs) and suppress the NA replacement when applying this code:
mydata_convert <- mydata %>% mutate_if(is.character, as.numeric)
Thanks in advance for your help
Upvotes: 1
Views: 181
Reputation: 35639
Use ifelse to replace "." with 0 before converting to numeric:
library(dplyr)
mydata %>%
  # "." becomes 0; every other string is kept and then coerced to numeric
  mutate_if(is.character, ~ ifelse(. == ".", 0, .) %>% as.numeric)
# a b c d
# 1 20 1 1 5
# 2 30 1 2 2
# 3 25 1 3 3
# 4 0 0 5 1
# 5 0 0 2 3
# 6 0 0 1 13
# 7 0 0 2 1
# 8 0 0 3 3
# 9 0 0 1 1
# 10 25 1 3 31
# 11 0 1 1 2
# 12 1 1 3 1
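In dplyr 1.0.0 and later, mutate_if is superseded by across(). The following is a minimal sketch of the same replacement in that style, assuming a data frame shaped like the one in the question (only columns a and b are reproduced here, and the result name mydata_convert is taken from the question):

```r
library(dplyr)

# First two columns of the example data from the question
mydata <- data.frame(
  a = c("20", "30", "25", ".", ".", ".", ".", ".", ".", "25", "0", "1"),
  b = c(1, 1, 1, 0, 0, 0, 0, 0, 0, 1, 1, 1),
  stringsAsFactors = FALSE
)

# For every character column: swap "." for "0", then coerce to numeric.
# Because no "." is left when as.numeric() runs, no coercion warning is raised.
mydata_convert <- mydata %>%
  mutate(across(where(is.character),
                ~ as.numeric(replace(.x, .x == ".", "0"))))
```

As with the ifelse version, the replaced values become indistinguishable from the genuine "0" in row 11.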
Upvotes: 3
Reputation: 76673
The following code first substitutes "0.0" for the single dot "." and then coerces to numeric.
library(dplyr)
mydata %>%
  # replace a lone "." with "0.0"; all other strings are left unchanged
  mutate_if(is.character, list(function(x) sub("^\\.$", "0.0", x))) %>%
  mutate_if(is.character, as.numeric)
# a b c d
#1 20 1 1 5
#2 30 1 2 2
#3 25 1 3 3
#4 0 0 5 1
#5 0 0 2 3
#6 0 0 1 13
#7 0 0 2 1
#8 0 0 3 3
#9 0 0 1 1
#10 25 1 3 31
#11 0 1 1 2
#12 1 1 3 1
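Replacing "." with 0 makes those rows look the same as the real "0" in row 11. If the dot values need to stay distinguishable, one possible approach (a sketch only, not part of the code above) is to record them in a separate logical column before converting. The column name a_was_dot and the result name mydata_flagged are hypothetical, and only column a of the example data is reproduced:

```r
library(dplyr)

# Column a of the example data from the question
mydata <- data.frame(
  a = c("20", "30", "25", ".", ".", ".", ".", ".", ".", "25", "0", "1"),
  stringsAsFactors = FALSE
)

mydata_flagged <- mydata %>%
  mutate(
    a_was_dot = a == ".",           # TRUE where the original value was "."
    a = as.numeric(na_if(a, "."))   # "." becomes NA explicitly, so no coercion warning
  )
```

Here a is NA wherever a_was_dot is TRUE, so these rows can be told apart from any NA that was already present in the data.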
Upvotes: 3