XM_Z
XM_Z

Reputation: 41

cannot convert character into numeric in R

I copy and pasted the weather information from the following website "weather underground" for some data analysis and the data looks like below:

https://www.wunderground.com/dashboard/pws/KCACHINO13/table/2018-04-10/2018-04-10/daily

enter image description here

As you can see, the temperature and other information all have the text with it so I cannot conduct any calculation. In the excel, I used substitute(xx,"F","") to remove the F from the "Temperature" column, but then I wanted to convert Farenheit to Celcius using convert(xx,"F","C"), I could not get the outcome. I think there is something wrong with the data itself. I formatted the cell into number or copy and paste the value to a new column, but neither of them worked.

Then I import the data.frame into R and try to do some data formating using R. I checked the class of the Temperature column, which shows "character":

class(a$Temperature)
#"character"

a$Temperature <- gsub("F","",a$Temperature)
# this command remmoved "F"

as.numeric(a$Temperature)
#Warning message: NAs introduced by coercion 

as.numeric(unlist(a$Temperature))
#still the same warning message

From the excel, I created the new column removing "F" from temperature, and used this in R to convert "character" to "numeric", I still got the warning message. I don't know how to deal with this problem. Could someone help me with this? Thank you!

As recommended below, I am pasting the output from

dput(head(a))

#structure(list(Time = structure(c(-2209075140, -2209074840, -2209074540, 
-2209074240, -2209073940, -2209073640), tzone = "UTC", class = c("POSIXct", 
"POSIXt")), Temperature = c("60.0 ", "59.9 ", "59.8 ", "59.7 ", 
"59.6 ", "59.5 "), `T(F)` = c("60.0 ", "59.9 ", "59.8 ", "59.7 ", 
"59.6 ", "59.5 "), `Dew Point` = c("48.2 F", "48.1 F", "48.4 F", 
"48.3 F", "48.2 F", "48.1 F"), Humidity = c("65 %", "65 %", "66 %", 
"66 %", "66 %", "66 %"), Wind = c("WSW", "WSW", "WSW", "WSW", 
"WSW", "WSW"), Speed = c("0.0 mph", "0.0 mph", "0.0 mph", "0.0 mph", 
"0.0 mph", "0.0 mph"), Gust = c("0.0 mph", "0.0 mph", "0.0 mph", 
"0.0 mph", "0.0 mph", "0.0 mph"), Pressure = c("29.88 in", "29.88 in", 
"29.88 in", "29.88 in", "29.88 in", "29.88 in"), `Precip. Rate.` = c("0.00 in", 
"0.00 in", "0.00 in", "0.00 in", "0.00 in", "0.00 in"), `Precip. Accum.` = c("0.00 in", 
"0.00 in", "0.00 in", "0.00 in", "0.00 in", "0.00 in"), UV = c(0, 
0, 0, 0, 0, 0), Solar = c("0 w/m²", "0 w/m²", "0 w/m²", "0 w/m²", 
"0 w/m²", "0 w/m²")), row.names = c(NA, -6L), class = c("tbl_df", 
"tbl", "data.frame"))

Upvotes: 0

Views: 959

Answers (2)

Joel L&#228;nsisalmi
Joel L&#228;nsisalmi

Reputation: 11

Maybe this, helps. In this function there is nested few different functions, such as changing from character variable to numeric. Also gsub, which changes comma to empty space. You should just change comma to letter what you are changing. Never tried if it works with letters, but this might be one solution. Here is the code:

data666 <- apply(data, 2, function(x) as.numeric(as.character(gsub(",", "", x))))

Apply function apllies functions in for whole dataset. 2 means that it does it column by column. If you want to chaange it row by ro you have to change 2 to 1.

Upvotes: 0

Mike V
Mike V

Reputation: 1364

If you want to convert only Temperature column, here is an option you may consider.

Data

df <- structure(list(Time = c("12:04 AM", "12:09 AM", "12:14 AM", "12:19 AM", 
"12:24 AM", "12:29 AM"), Temperature = c("69.4 F", "69.2 F", 
"68.8 F", "68.5 F", "68.3 F", "68.0 F"), Dew.Point = c("45.9 F", 
"46.0 F", "45.8 F", "45.7 F", "45.7 F", "45.7 F"), Humidity = c("43 %", 
"43 %", "44 %", "44 %", "44 %", "45 %"), Wind = c("NE", "NE", 
"NE", "NE", "NE", "NE"), Speed = c("0.0 mph", "0.0 mph", "0.0 mph", 
"0.0 mph", "0.0 mph", "0.0 mph"), Gust = c("0.0 mph", "0.0 mph", 
"0.0 mph", "0.0 mph", "0.0 mph", "0.0 mph"), Pressure = c("29.93 in", 
"29.94 in", "29.94 in", "29.95 in", "29.95 in", "29.95 in"), 
    Precip..Rate. = c("0.00 in", "0.00 in", "0.00 in", "0.00 in", 
    "0.00 in", "0.00 in"), Precip..Accum. = c("0.00 in", "0.00 in", 
    "0.00 in", "0.00 in", "0.00 in", "0.00 in"), UV = c(0L, 0L, 
    0L, 0L, 0L, 0L), Solar = c("0 w/m²", "0 w/m²", "0 w/m²", 
    "0 w/m²", "0 w/m²", "0 w/m²")), class = "data.frame", row.names = c(NA, 
-6L))

Code

library(dplyr)
library(stringr)
df2 <- df %>% 
  mutate(Temperature2 = as.numeric(str_extract(Temperature, "[\\d\\.]+"))) %>% 
  relocate(Temperature2, .after = Temperature)

df2[, 2:3]
#   Temperature Temperature2
# 1      69.4 F         69.4
# 2      69.2 F         69.2
# 3      68.8 F         68.8
# 4      68.5 F         68.5
# 5      68.3 F         68.3
# 6      68.0 F         68.0
str(df2$Temperature2)
# num [1:6] 69.4 69.2 68.8 68.5 68.3 68

Upvotes: 2

Related Questions