Behrouz63

Reputation: 29

How to convert a column to numeric when it contains both text and numbers stored as strings

I have a data frame with a column that I want to use to join with another data frame. The column contains numbers stored as strings, along with ordinary strings, as follows:

x<-data.frame(referenceNumber=c("80937828","gdy","12267133","72679267","72479267"))

How can I convert the numeric strings to numeric and replace the remaining strings with zeros/null?

I tried x %>% mutate_if(is.character,as.numeric)

But it returns the following error:

"Error in UseMethod("tbl_vars") : 
  no applicable method for 'tbl_vars' applied to an object of class "character""

Upvotes: 1

Views: 2511

Answers (3)

David Z

Reputation: 7041

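This is probably because referenceNumber is a factor. Creating the data frame with stringsAsFactors = FALSE keeps the column as character, and the conversion then works: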

library(dplyr)  # provides %>% and mutate_if
x <- data.frame(referenceNumber = c("80937828", "gdy", "12267133", "72679267", "72479267"), stringsAsFactors = FALSE)
str(x)
#'data.frame':   5 obs. of  1 variable:
# $ referenceNumber: chr  "80937828" "gdy" "12267133" "72679267" ...
xx <- x %>% mutate_if(is.character, as.numeric)
#Warning message:
#In evalq(as.numeric(referenceNumber), <environment>) :
#  NAs introduced by coercion
xx
#  referenceNumber
#1        80937828
#2              NA
#3        12267133
#4        72679267
#5        72479267
str(xx)
#'data.frame':   5 obs. of  1 variable:
# $ referenceNumber: num  80937828 NA 12267133 72679267 72479267

Upvotes: 0

Sven

Reputation: 1253

I'd check for NAs in an ifelse construction:

x <- data.frame(referenceNumber = c("80937828", "gdy", "12267133", "72679267", "72479267"), stringsAsFactors = FALSE)

# Convert once; non-numeric entries such as "gdy" become NA (with a coercion warning)
num <- as.numeric(x$referenceNumber)
x$referenceNumber <- ifelse(is.na(num), 0, num)

This only works if your strings are not factors. Otherwise you need to add as.character first.
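
To illustrate the factor caveat, here is a minimal sketch using the values from the question stored as a factor. The variable names f, codes and values are only for illustration.

```r
# Same values as in the question, but stored as a factor
f <- factor(c("80937828", "gdy", "12267133", "72679267", "72479267"))

# as.numeric() on a factor returns the internal level codes, not the labels
codes <- as.numeric(f)

# Converting to character first recovers the labels;
# "gdy" becomes NA (the coercion warning is suppressed here)
values <- suppressWarnings(as.numeric(as.character(f)))

# Replace the NA produced by "gdy" with zero
values[is.na(values)] <- 0
```

Here codes holds small integers (one per factor level) rather than the reference numbers, which is why the as.character step is needed.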

Upvotes: 0

Tim Biegeleisen

Reputation: 520908

We could try just using as.numeric, which assigns NA to any non-numeric entry in the vector (with a coercion warning). Then, we can selectively replace the NA values with zero:

x <- c("80937828","gdy","12267133","72679267","72479267")
output <- as.numeric(x)
output[is.na(output)] <- 0
output

[1] 80937828        0 12267133 72679267 72479267

Edit based on the comment by @Sotos: If the column/vector is actually a factor, it has to be cast to character first for my answer above to work, because as.numeric on a factor returns the internal level codes rather than the labels.
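
The same idea can be applied directly to the data frame column from the question. The following is only a sketch: to_numeric_or is a hypothetical helper name, not a function from base R or any package, and it assumes that genuine NA values in the input should also receive the fallback.

```r
# Hypothetical helper: convert to numeric and replace non-numeric entries with a fallback
to_numeric_or <- function(v, fallback = 0) {
  # as.character() makes this safe for factors as well as character vectors
  out <- suppressWarnings(as.numeric(as.character(v)))
  # Entries that could not be parsed (and any original NA) get the fallback
  out[is.na(out)] <- fallback
  out
}

x <- data.frame(referenceNumber = c("80937828", "gdy", "12267133", "72679267", "72479267"))
x$referenceNumber <- to_numeric_or(x$referenceNumber)
str(x)
```

Since the column is meant to be a join key, it may be worth calling to_numeric_or(x$referenceNumber, fallback = NA_real_) instead, so that all non-numeric entries do not end up sharing the key 0.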

Upvotes: 1
