azizi tamimi

Reputation: 75

How to suppress the NA replacement when converting to numeric?

Let's say I have the following data:

dput(mydata)
structure(list(a = c("20", "30", "25", ".", ".", ".", ".", ".", 
".", "25", "0", "1"), b = c(1, 1, 1, 0, 0, 0, 0, 0, 0, 1, 1, 
1), c = c(1, 2, 3, 5, 2, 1, 2, 3, 1, 3, 1, 3), d = c(5, 2, 3, 
1, 3, 13, 1, 3, 1, 31, 2, 1)), row.names = c(NA, 12L), class = "data.frame")

Then I want to convert all character columns to numeric:

mydata_convert <- mydata %>% mutate_if(is.character, as.numeric)

The problem is that all the "." values are replaced by NAs:

Warning message: NAs introduced by coercion

Could you please advise on how to retain the original values (so that they are not confused with the already existing NAs) and suppress the NA replacement when applying this code:

mydata_convert <- mydata %>% mutate_if(is.character, as.numeric)

Thanks in advance for your help

Upvotes: 1

Views: 181

Answers (2)

Darren Tsai

Reputation: 35639

Use ifelse:

mydata %>% 
  mutate_if(is.character, ~ ifelse(. == ".", 0, .) %>% as.numeric)

#      a b c  d
#  1  20 1 1  5
#  2  30 1 2  2
#  3  25 1 3  3
#  4   0 0 5  1
#  5   0 0 2  3
#  6   0 0 1 13
#  7   0 0 2  1
#  8   0 0 3  3
#  9   0 0 1  1
#  10 25 1 3 31
#  11  0 1 1  2
#  12  1 1 3  1
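
For readers on dplyr 1.0.0 or later, where mutate_if() is superseded, the same idea can be sketched with across() and where(). This is only an equivalent spelling of the ifelse approach above, not a different method; the data frame is a trimmed copy of the question's data so the snippet runs on its own.

```r
library(dplyr)

# Trimmed copy of the question's data (columns a and b only)
mydata <- data.frame(
  a = c("20", "30", "25", ".", ".", ".", ".", ".", ".", "25", "0", "1"),
  b = c(1, 1, 1, 0, 0, 0, 0, 0, 0, 1, 1, 1),
  stringsAsFactors = FALSE
)

# Replace "." with "0" in every character column, then coerce to numeric.
# No "." is left when as.numeric() runs, so no coercion warning is raised.
mydata_convert <- mydata %>%
  mutate(across(where(is.character), ~ as.numeric(ifelse(.x == ".", "0", .x))))
```
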

Upvotes: 3

Rui Barradas

Reputation: 76673

The following code first substitutes "0.0" for the single dot ".", then coerces to numeric.

library(dplyr)

mydata %>% 
  mutate_if(is.character, list(function(x) sub("^\\.$", "0.0", x))) %>%
  mutate_if(is.character, as.numeric)
#    a b c  d
#1  20 1 1  5
#2  30 1 2  2
#3  25 1 3  3
#4   0 0 5  1
#5   0 0 2  3
#6   0 0 1 13
#7   0 0 2  1
#8   0 0 3  3
#9   0 0 1  1
#10 25 1 3 31
#11  0 1 1  2
#12  1 1 3  1
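
One caveat with replacing "." by 0: row 11 of the data already holds a genuine "0", so after conversion the former "." entries can no longer be told apart from it. If that distinction matters, a possible sketch in base R is to record where the "." values were before converting. The column name a_was_dot is made up for this illustration, and only column a is handled.

```r
# Trimmed copy of the question's data (columns a and b only)
mydata <- data.frame(
  a = c("20", "30", "25", ".", ".", ".", ".", ".", ".", "25", "0", "1"),
  b = c(1, 1, 1, 0, 0, 0, 0, 0, 0, 1, 1, 1),
  stringsAsFactors = FALSE
)

# Flag the entries that were "." (hypothetical helper column).
# %in% returns FALSE rather than NA for values that are already missing.
mydata$a_was_dot <- mydata$a %in% "."

# Set "." to NA explicitly, then convert. as.numeric() has nothing left
# to coerce, so it emits no "NAs introduced by coercion" warning.
mydata$a <- as.numeric(replace(mydata$a, mydata$a_was_dot, NA))
```

With this layout, an NA in a together with a_was_dot == TRUE marks a former ".", while an NA with a_was_dot == FALSE was missing in the original data.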

Upvotes: 3
