Reputation: 244
I have loaded an array in R in which each position holds a Python list that was stored as a string, e.g. the first position of my tmp array is "[(1,2),(3,4),(5,6)]".
How can I parse this data in R to obtain a vector that, in each position, contains the corresponding list?
Upvotes: 0
Views: 40
Reputation: 160647
I think by far the better approach (as @Sirius suggested) is to export from Python using a more portable format such as JSON.
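As a minimal sketch of that export step, assuming the data are plain Python lists of tuples of numbers (the names records and to_ndjson are hypothetical, not from the question), the standard json module writes each tuple as a JSON array, one record per line:

```python
import json

# Hypothetical sample data, mirroring the strings shown in the question.
records = [
    [(1, 2), (3, 4), (5, 6)],
    [(1, 2), (3, 4), (5, 7)],
]

def to_ndjson(rows):
    """Serialize each row as one compact JSON document per line (NDJSON)."""
    # json.dumps turns tuples into JSON arrays, so no bracket rewriting is needed.
    return "\n".join(json.dumps(row, separators=(",", ":")) for row in rows)

ndjson_text = to_ndjson(records)
print(ndjson_text)
```

Text exported this way is already valid JSON on each line, so it can be handed to jsonlite in R without any string substitution.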
Lacking that, if the strings always contain only Python lists and tuples, then you can gsub the () to [] and parse the result as JSON:
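# Replace each character in 'from' with the character at the same position in 'to'
tr <- function(x, from = "()", to = "[]") {
  # Pair up the characters: "(" with "[" and ")" with "]"
  chrs <- Map(c, strsplit(from, "")[[1]], strsplit(to, "")[[1]])
  # Apply the literal (fixed = TRUE) substitutions one after another
  Reduce(function(txt, chr) gsub(chr[1], chr[2], txt, fixed = TRUE), chrs, init = x)
}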
tr("[(1,2),(3,4),(5,6)]")
# [1] "[[1,2],[3,4],[5,6]]"
jsonlite::parse_json(tr("[(1,2),(3,4),(5,6)]"))
# [[1]]
# [[1]][[1]]
# [1] 1
# [[1]][[2]]
# [1] 2
# [[2]]
# [[2]][[1]]
# [1] 3
# [[2]][[2]]
# [1] 4
# [[3]]
# [[3]][[1]]
# [1] 5
# [[3]][[2]]
# [1] 6
The tr function works with vectors of strings as well:
tr(c("[(1,2),(3,4),(5,6)]", "[(1,2),(3,4),(5,7)]"))
# [1] "[[1,2],[3,4],[5,6]]" "[[1,2],[3,4],[5,7]]"
But to parse such a vector with jsonlite, you will need jsonlite::stream_in instead, since technically the input is NDJSON (newline-delimited JSON), and you will also need to control simplification with simplifyVector = FALSE:
vec <- c("[(1,2),(3,4),(5,6)]", "[(1,2),(3,4),(5,7)]")
tr(vec)
# [1] "[[1,2],[3,4],[5,6]]" "[[1,2],[3,4],[5,7]]"
out <- jsonlite::stream_in(textConnection(tr(vec)), simplifyVector = FALSE)
# Imported 2 records. Simplifying...
str(out)
# List of 2
# $ :List of 3
# ..$ :List of 2
# .. ..$ : int 1
# .. ..$ : int 2
# ..$ :List of 2
# .. ..$ : int 3
# .. ..$ : int 4
# ..$ :List of 2
# .. ..$ : int 5
# .. ..$ : int 6
# $ :List of 3
# ..$ :List of 2
# .. ..$ : int 1
# .. ..$ : int 2
# ..$ :List of 2
# .. ..$ : int 3
# .. ..$ : int 4
# ..$ :List of 2
# .. ..$ : int 5
# .. ..$ : int 7
Upvotes: 1
Reputation: 887571
In R, this corresponds to a nested list. One option is to use reticulate:
library(reticulate)
tmp <- "[(1,2),(3,4),(5,6)]"
py_run_string(paste0('out=', tmp))$out
Output:
#[[1]]
#[[1]][[1]]
#[1] 1
#[[1]][[2]]
#[1] 2
#[[2]]
#[[2]][[1]]
#[1] 3
#[[2]][[2]]
#[1] 4
#[[3]]
#[[3]][[1]]
#[1] 5
#[[3]][[2]]
#[1] 6
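Note that py_run_string executes the string as Python code, so it is best reserved for trusted input. As a sketch of what the parse amounts to on the Python side, and assuming the strings contain only literals, the standard library's ast.literal_eval evaluates the same text without running arbitrary code (the variable names here are illustrative):

```python
import ast

tmp = "[(1,2),(3,4),(5,6)]"

# literal_eval accepts only Python literals (lists, tuples, numbers, strings, ...).
parsed = ast.literal_eval(tmp)
print(parsed)

# Anything that is not a pure literal, such as a function call, is rejected.
rejected = False
try:
    ast.literal_eval("__import__('os').getcwd()")
except (ValueError, SyntaxError):
    rejected = True
print("non-literal input rejected:", rejected)
```

The result is a Python list of tuples, which is the structure that shows up as a nested list once converted to R.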
Upvotes: 0