Reputation: 2033
I have a big time series dataset in which the numeric results are stored in General format in MS-Excel. I tried using gsub(",", "", dummy), but it did not work. The dataset does not contain any "," or any other visible special character besides the decimal point, yet R picks up the datatype as character. Values are either positive or negative, there is one NA, and the values have different numbers of decimal places.
How can I convert to numeric without having to deal with N/As after the conversion? One thing to note, though, is that when converted to numeric, some of the values are displayed in scientific notation, like 12.1e+03, while others show four decimal places.
dummy = c("12.1", "42000", "1.2145", "12.25", N/A, "323.369", "-1.235", "335", "0")
# Convert to numeric
dummy = gsub(",", "", dummy )
dummy = as.numeric(dummy )
Error
Warning message:
NAs introduced by coercion "
Upvotes: 1
Views: 1948
Reputation: 403
Changing N/A to NA solves this issue:
# N/A to NA
dummy = c("12.1", "42000", "1.2145", "12.25", NA, "323.369", "-1.235", "335")
# Convert to numeric
dummy = gsub(",", "", dummy)
dummy = as.numeric(dummy)
To do so for your entire dataset, you can use:
# Across columns (for matrices)
data <- apply(data, 2, function(x) {
  ifelse(x == "N/A", NA, x)
})
# Then convert characters to numeric (for matrices)
data <- apply(data, 2, as.numeric)
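If you prefer, the two matrix steps can be combined into a single pass over the columns (a sketch, assuming every column should end up numeric):
# Replace "N/A" and coerce to numeric in one pass over the columns
data <- apply(data, 2, function(x) as.numeric(ifelse(x == "N/A", NA, x)))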
# Across columns (for data frames)
data <- lapply(data, function(x) {
  ifelse(x == "N/A", NA, x)
})
# Then convert characters to numeric (for data frames)
data <- lapply(data, as.numeric)
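Note that lapply() returns a plain list; to keep the data.frame class, assign back into the existing object with data[] <-. A minimal sketch with a made-up two-column data frame:
# Hypothetical example data frame containing "N/A" strings
df <- data.frame(a = c("12.1", "N/A", "42000"),
                 b = c("-1.235", "335", "N/A"),
                 stringsAsFactors = FALSE)
# Assigning into df[] keeps the data.frame structure
df[] <- lapply(df, function(x) as.numeric(ifelse(x == "N/A", NA, x)))
str(df)  # both columns are now numeric, and "N/A" became NA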
Update: note the *apply differences for object types in R -- thanks to user20650 for pointing this out.
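As for the scientific-notation display mentioned in the question: it is only a printing preference; the underlying numeric values are stored exactly. A small sketch (the scipen value of 999 is an arbitrary choice) to get plain decimal output:
# Discourage scientific notation when printing
options(scipen = 999)
dummy
# Or format values for display without changing the stored numbers
format(dummy, scientific = FALSE)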
Upvotes: 2