Reputation: 99
I have a character dataframe column that needs to be adjusted to a numeric.
I have tried substr, and then conversion to numeric, but the substr command does not recognize the quotes. I have also tried gsub, but it does recognize the length of the character and leaves the quotes intact.
The character column contains decimal comma's and quotes like this:
test <- as.data.frame(c("\"2,60\"", "\"1,30\"", "\"850,00\"", "\"1000,00\"", "\"57,25\"", "\"98,00\""))
colnames(test) <- "v1"
I want to change the column to numeric with decimal points like this:
test.num <- as.data.frame(c(2.60, 1.30, 850.00, 1000.00, 57.25, 98.00))
colnames(test.num) <- "v1"
I hope someone can offer some guidance.
Upvotes: 3
Views: 227
Reputation: 79208
Read the data back in using ,
as the decimal operator:
read.table(text = as.character(test$v1), dec=',')
V1
1 2.60
2 1.30
3 850.00
4 1000.00
5 57.25
6 98.00
Upvotes: 3
Reputation: 72758
Using gsub
from base R you could replace all [[:punct:]]
uation with nothing and divide the result by 100
(always two digits after the ,
assumed).
as.numeric(gsub("\\D", "", test$v1)) / 100
# [1] 2.60 1.30 850.00 1000.00 57.25 98.00
If the assumption does not hold, you could replace the ","
with a "."
.
as.numeric(gsub(",", ".", gsub("\"", "", test2$v1)))
# [1] 2.60 1.30 850.00 1000.00 57.25 98.00
test <- structure(list(v1 = structure(c(3L, 1L, 5L, 2L, 4L, 6L), .Label = c("\"1,30\"",
"\"1000,00\"", "\"2,60\"", "\"57,25\"", "\"850,00\"", "\"98,00\""
), class = "factor")), row.names = c(NA, -6L), class = "data.frame")
test2 <- structure(list(v1 = structure(c(3L, 1L, 5L, 2L, 4L, 6L), .Label = c("\"1,3\"",
"\"1000,0\"", "\"2,60\"", "\"57,25\"", "\"850,00\"", "\"98,00\""
), class = "factor")), row.names = c(NA, -6L), class = "data.frame")
Upvotes: 3
Reputation: 33488
Using chartr()
:
as.numeric(chartr('",', ' .', test$v1))
# [1] 2.60 1.30 850.00 1000.00 57.25 98.00
Upvotes: 5
Reputation: 173557
Here's one option using a function from readr
readr::parse_number(as.character(test$v1),locale = locale(decimal_mark = ","))
Upvotes: 4