Run

Reputation: 57286

How to convert certain columns only to numeric?

How can I convert certain columns only in a data frame to numeric?

For instance, I have this data frame:

weatherDF <- structure(list(airport = c("EGLL", "EGLL"), xdate = c("2016-07-28", 
"2016-07-31"), ws = c("6", "5"), wd = c("237", "299"), humidity = c("68", 
"55")), .Names = c("airport", "xdate", "ws", "wd", "humidity"
), row.names = 1:2, class = "data.frame")

I just want to convert ws, wd, and humidity to numeric, not airport and xdate.

If I do this:

columns <- sapply(weatherDF, is.character)
weatherDF[, columns] <- lapply(weatherDF[, columns, drop = FALSE], function(x) as.numeric(as.character(x)))

this also converts airport and xdate to numeric, and I get this warning:

Warning messages:
1: In FUN(X[[i]], ...) : NAs introduced by coercion
2: In FUN(X[[i]], ...) : NAs introduced by coercion

And now my data frame has become:

structure(list(airport = c(NA_real_, NA_real_), xdate = c(NA_real_, 
NA_real_), ws = c(6, 5), wd = c(237, 299), humidity = c(68, 55
)), .Names = c("airport", "xdate", "ws", "wd", "humidity"), row.names = 1:2, class = "data.frame")

Any ideas how I can convert them properly?

Upvotes: 16

Views: 28426

Answers (4)

sbha

Reputation: 10432

Using dplyr:

library(dplyr)
df %>% 
  mutate_at(vars(ws, wd, humidity), as.numeric)

# A tibble: 2 x 5
  airport xdate         ws    wd humidity
  <chr>   <chr>      <dbl> <dbl>    <dbl>
1 EGLL    2016-07-28    6.  237.      68.
2 EGLL    2016-07-31    5.  299.      55.
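
In dplyr 1.0.0 and later, mutate_at() is superseded by across(). A minimal sketch of the equivalent call, assuming the question's data is stored in weatherDF (rebuilt here only so the snippet runs on its own):

```r
library(dplyr)

# Rebuild the data frame from the question (every column is character).
weatherDF <- data.frame(
  airport  = c("EGLL", "EGLL"),
  xdate    = c("2016-07-28", "2016-07-31"),
  ws       = c("6", "5"),
  wd       = c("237", "299"),
  humidity = c("68", "55"),
  stringsAsFactors = FALSE
)

# Convert only the named columns; airport and xdate are left untouched.
converted <- weatherDF %>%
  mutate(across(c(ws, wd, humidity), as.numeric))

# Inspect the resulting column classes.
sapply(converted, class)
```

The result is assigned to a new name (converted, chosen here for illustration) because mutate() returns a modified copy rather than changing weatherDF in place.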

Upvotes: 16

Batanichek

Reputation: 7871

1) All of your columns are character, so this check selects every column:

columns <- sapply(weatherDF, is.character)

airport    xdate       ws       wd humidity 
    TRUE     TRUE     TRUE     TRUE     TRUE

2) Why not simply select the columns by position:

weatherDF[, 3:ncol(weatherDF)] <- lapply(3:ncol(weatherDF), function(x) as.numeric(weatherDF[[x]]))

or

columns <- c("ws", "wd", "humidity")
weatherDF[, columns] <- lapply(columns, function(x) as.numeric(weatherDF[[x]]))

If you don't know which columns are numeric, you can try to detect them using tryCatch, like this:

weatherDF[, 1:ncol(weatherDF)] <- lapply(1:ncol(weatherDF), function(x) {
  # Try the conversion; if it raises a warning, keep the original column.
  tryCatch({
    as.numeric(weatherDF[[x]])
  }, warning = function(w) {
    weatherDF[[x]]
  })
})
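
The tryCatch version treats any warning as a sign that the column is not numeric. A sketch of the same idea with the rule made explicit: convert a column only when the conversion introduces no new NAs. The helper name to_numeric_if_possible is hypothetical, and the data frame is rebuilt from the question so the snippet runs on its own:

```r
# Rebuild the data frame from the question (every column is character).
weatherDF <- data.frame(
  airport  = c("EGLL", "EGLL"),
  xdate    = c("2016-07-28", "2016-07-31"),
  ws       = c("6", "5"),
  wd       = c("237", "299"),
  humidity = c("68", "55"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: convert a character vector to numeric only if
# doing so introduces no new NAs; otherwise return it unchanged.
to_numeric_if_possible <- function(x) {
  if (!is.character(x)) return(x)
  converted <- suppressWarnings(as.numeric(x))
  # NAs already present in x are fine; new ones mean non-numeric text.
  if (any(is.na(converted) & !is.na(x))) x else converted
}

# weatherDF[] keeps the data frame structure while replacing its columns.
weatherDF[] <- lapply(weatherDF, to_numeric_if_possible)
str(weatherDF)
```

Unlike the warning-based check, this leaves values that were already missing alone, so a numeric column with a few NA entries should still be converted.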

Upvotes: 17

RHertel

Reputation: 23818

num.cols <- c('ws','wd','humidity')
weatherDF[num.cols] <- sapply(weatherDF[num.cols], as.numeric)
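
Note that sapply builds an intermediate matrix here, whereas lapply returns a list of columns, which is the shape a data frame already has. A minimal sketch of the lapply variant with a check of the resulting classes, assuming the data frame from the question (rebuilt here so the snippet runs on its own):

```r
# Rebuild the data frame from the question (every column is character).
weatherDF <- data.frame(
  airport  = c("EGLL", "EGLL"),
  xdate    = c("2016-07-28", "2016-07-31"),
  ws       = c("6", "5"),
  wd       = c("237", "299"),
  humidity = c("68", "55"),
  stringsAsFactors = FALSE
)

num.cols <- c("ws", "wd", "humidity")

# Convert each selected column separately; the other columns are not touched.
weatherDF[num.cols] <- lapply(weatherDF[num.cols], as.numeric)

# Confirm which columns changed type.
sapply(weatherDF, class)
```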

Upvotes: 8

Keith Hughitt

Reputation: 4970

The all.is.numeric function from the Hmisc package does a good job determining whether a given column can be cast to numeric.

Using this, you could do:

numeric_cols <- sapply(weatherDF, Hmisc::all.is.numeric)

if (sum(numeric_cols) > 1)  {
    weatherDF[,numeric_cols] <- data.matrix(weatherDF[,numeric_cols])
} else {
    weatherDF[,numeric_cols] <- as.numeric(weatherDF[,numeric_cols])
}
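
If adding a dependency on Hmisc is not an option, base R's type.convert may be worth a look: in recent R versions it accepts a whole data frame and picks a type for each column. This is a different technique from all.is.numeric, shown only as a sketch on the question's data (rebuilt here so the snippet runs on its own):

```r
# Rebuild the data frame from the question (every column is character).
weatherDF <- data.frame(
  airport  = c("EGLL", "EGLL"),
  xdate    = c("2016-07-28", "2016-07-31"),
  ws       = c("6", "5"),
  wd       = c("237", "299"),
  humidity = c("68", "55"),
  stringsAsFactors = FALSE
)

# as.is = TRUE keeps non-numeric text as character instead of factor.
converted <- type.convert(weatherDF, as.is = TRUE)

# Inspect the type chosen for each column.
sapply(converted, class)
```

One design point to be aware of: whole-number strings come back as integer rather than double, so wrap the affected columns in as.numeric afterwards if double is required.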

Upvotes: 1
