Reputation: 331
I am given a string which was a list of numbers:
s <- "[14,7,5,3,4,0,1,7,2,3,1,18,13,4,23,7,8,8,11,18,15,6,2,10,2,4,8,5,11,5,1,5,2,4,3,1,6,8,5,5,3,1,1,4,5,2,9,3,4,11,11,14,3,12,2,6,0,0,15,1,18,5,3,6,6,6]"
Please guide me on how to convert it back to a regular list of numbers.
I have tried using strsplit and as.data.frame, but the code seems very long.
I want something efficient and creative.
Upvotes: 3
Views: 159
Reputation: 1051
Here's a base R solution.
This line extracts only the runs of digits and stores them in a list.
numbers <- regmatches(s, gregexpr("[[:digit:]]+", s))
Then unlist the result and convert it to numeric.
numbers <- as.numeric(unlist(numbers))
Result
[1] 14 7 5 3 4 0 1 7 2 3 1 18 13 4 23 7 8 8 11 18 15 6 2 10 2 4 8 5 11 5 1 5 2 4
[35] 3 1 6 8 5 5 3 1 1 4 5 2 9 3 4 11 11 14 3 12 2 6 0 0 15 1 18 5 3 6 6 6
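To see what each step returns, here is a minimal sketch on a shortened input. The name `s_small` and its contents are made up for illustration; the calls are the same base R functions used above.

```r
# Hypothetical shortened input, used only for illustration
s_small <- "[14,7,0,23]"

# gregexpr() returns the match positions, one list element per input string
m <- gregexpr("[[:digit:]]+", s_small)

# regmatches() uses those positions to pull out the matched substrings
parts <- regmatches(s_small, m)
str(parts)  # a list of length 1 holding a character vector

# Flatten the list and convert the strings to numbers
nums <- as.numeric(unlist(parts))
nums
```

Note that this pattern matches digits only, so it would drop minus signs and split decimals; that is fine for the input in the question, which contains only non-negative integers.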
Upvotes: 1
Reputation: 269396
1) JSON The input shown in the question is in JSON format, so use either the jsonlite or rjson package and it will do the needed string processing for you.
library(jsonlite)
fromJSON(s)
giving:
[1] 14 7 5 3 4 0 1 7 2 3 1 18 13 4 23 7 8 8 11 18 15 6 2 10 2
[26] 4 8 5 11 5 1 5 2 4 3 1 6 8 5 5 3 1 1 4 5 2 9 3 4 11
[51] 11 14 3 12 2 6 0 0 15 1 18 5 3 6 6 6
2) strapply If you did want to use string processing anyway, then one option is strapply from the gsubfn package, extracting all sequences of digits with "\\d+" and converting them to numeric, giving the same output as above.
library(gsubfn)
strapply(s, "\\d+", as.numeric, simplify = c)
3) scan Or, without any packages or regular expressions:
scan(text = chartr("[]", "  ", s), sep = ",", quiet = TRUE)
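As a rough sketch, the scan approach can be wrapped in a small function. The name `parse_bracketed` is hypothetical and not part of the answer above; the body assumes the input is a single string of comma-separated numbers in square brackets.

```r
# Hypothetical helper name; not part of the original answer
parse_bracketed <- function(x) {
  # chartr() needs `old` and `new` to have the same number of characters,
  # so "[" and "]" are each mapped to a space
  cleaned <- chartr("[]", "  ", x)
  # scan() splits on commas and returns a numeric (double) vector by default
  scan(text = cleaned, sep = ",", quiet = TRUE)
}

res <- parse_bracketed("[14,7,0,23]")
res
```

scan always strips surrounding white space from numeric fields, which is why the spaces left behind by chartr do no harm.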
Upvotes: 3
Reputation: 886938
One option is to extract the numbers from the string using stri_extract_all (stringi package). The output of stri_extract_all is a list of length 1 holding a character vector. Usually, if there are multiple elements of 's' (here it is a single string), we use unlist to convert the result to a single vector and then wrap it with as.integer. As there is only a single list element here, we can extract that element with [[.
library(stringi)
as.integer(stri_extract_all(s, regex = "\\d+")[[1]])
If we split up the code, stri_extract_all, as mentioned, returns a list of length 1.
stri_extract_all(s, regex = "\\d+")
#[[1]]
#[1] "14" "7" "5" "3" "4" "0" "1" "7" "2" "3" "1" "18" "13" "4" "23" "7" "8" "8" "11" "18" "15" "6" "2" "10"
#[25] "2" "4" "8" "5" "11" "5" "1" "5" "2" "4" "3" "1" "6" "8" "5" "5" "3" "1" "1" "4" "5" "2" "9" "3"
#[49] "4" "11" "11" "14" "3" "12" "2" "6" "0" "0" "15" "1" "18" "5" "3" "6" "6" "6"
Extract the list element containing the vector:
stri_extract_all(s, regex = "\\d+")[[1]]
#[1] "14" "7" "5" "3" "4" "0" "1" "7" "2" "3" "1" "18" "13" "4" "23" "7" "8" "8" "11" "18" "15" "6" "2" "10"
#[25] "2" "4" "8" "5" "11" "5" "1" "5" "2" "4" "3" "1" "6" "8" "5" "5" "3" "1" "1" "4" "5" "2" "9" "3"
#[49] "4" "11" "11" "14" "3" "12" "2" "6" "0" "0" "15" "1" "18" "5" "3" "6" "6" "6"
and then convert the vector of character elements to integer with as.integer.
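The difference between unlist and [[ matters once there is more than one input string. The sketch below illustrates this with base R's regmatches and gregexpr instead of stringi, so that it runs without extra packages; the vector `ss` is a made-up example.

```r
# Hypothetical inputs: two bracketed strings instead of one
ss <- c("[1,2,3]", "[40,50]")

# One list element per input string
parts <- regmatches(ss, gregexpr("[0-9]+", ss))

# Flatten everything into one integer vector
all_ints <- as.integer(unlist(parts))

# Or keep one integer vector per input string
per_string <- lapply(parts, as.integer)

# With a single input string, [[1]] and unlist() give the same result
single <- regmatches(ss[1], gregexpr("[0-9]+", ss[1]))
identical(as.integer(single[[1]]), as.integer(unlist(single)))
```

With several strings, [[1]] would return only the numbers from the first one, so unlist or lapply is the safer choice there.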
Upvotes: 1