Reputation: 61154
This should be pretty easy, but the results after using suggestions from other SO posts leave me baffled. And, of course, I'd like to avoid using a for loop.
Reproducible example
library(stringr)
input <- "<77Â 500 miles</dd>"
mynumbers <- str_extract_all(input, "[0-9]")
The variable mynumbers is a list containing a single character vector of five elements:
> mynumbers
[[1]]
[1] "7" "7" "5" "0" "0"
But this is what I'm after:
> mynumbers
[1] 77500
This post suggests using paste(), and I guess this should work fine given the correct sep and collapse arguments, but I have got to be missing something essential here. I have also tried to use unlist(). Here is what I've tried so far:
1 - using paste()
> paste(mynumbers)
[1] "c(\"7\", \"7\", \"5\", \"0\", \"0\")"
2 - using paste()
> paste(mynumbers, sep = " ")
[1] "c(\"7\", \"7\", \"5\", \"0\", \"0\")"
3 - using paste()
> paste (mynumbers, sep = " ", collapse = NULL)
[1] "c(\"7\", \"7\", \"5\", \"0\", \"0\")"
4 - using paste()
> paste (mynumbers, sep = "", collapse = NULL)
[1] "c(\"7\", \"7\", \"5\", \"0\", \"0\")"
5 - using unlist()
> as.numeric(unlist(mynumbers))
[1] 7 7 5 0 0
I'm hoping some of you have a few suggestions. I guess there's an elegant solution using regex somehow, but I'm also very interested in the paste / unlist problem that is specific to R. Thanks!
Upvotes: 6
Views: 1253
Reputation: 53
An alternative using the stringr library:
str_remove_all(input, pattern = "\\D+") %>% as.numeric()
[1] 77500
Upvotes: 2
Reputation: 887571
str_extract_all returns a list. We need to convert it to a vector and then paste. To extract the list element we use [[, and as there is only a single element, mynumbers[[1]] will get the vector. Then, do the paste with collapse and convert the result with as.numeric.
as.numeric(paste(mynumbers[[1]], collapse = ""))
#[1] 77500
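To see why the paste() attempts in the question returned a single deparsed string, here is a minimal sketch in base R. The list is built by hand as a stand-in for the str_extract_all() output, and the names deparsed, digits, joined and result are only illustrative.

```r
# Stand-in for the str_extract_all() result: a list holding one character vector
mynumbers <- list(c("7", "7", "5", "0", "0"))

# paste() runs as.character() on the list, which deparses the inner vector
# into one string; sep has nothing to join because only one argument is given
deparsed <- paste(mynumbers, sep = "")

# Pull out the vector first, then collapse its elements into one string
digits <- mynumbers[[1]]               # unlist(mynumbers) also works here
joined <- paste(digits, collapse = "")
result <- as.numeric(joined)
print(result)
#[1] 77500
```

In short, sep separates the arguments passed to paste(), while collapse joins the elements of the resulting vector, so collapse is the argument that matters here.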
We can also match one or more non-digit characters (\\D+), replace them with "" using gsub, and convert the result to numeric.
as.numeric(gsub("\\D+", "", input))
#[1] 77500
Upvotes: 10