plr

Reputation: 244

Read a Python list stored as a string into R

I have loaded an array in R that contains, in each position, a Python list that was stored as a string, e.g. the first position of my tmp array is "[(1,2),(3,4),(5,6)]".

How can I parse this data in R so that each element of the result holds the corresponding list?

Upvotes: 0

Views: 40

Answers (2)

r2evans

Reputation: 160647

I think by far the better approach (as @Sirius suggested) is to export from Python using a more portable format such as JSON.
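As a rough illustration of that export step, here is a minimal Python sketch. The `rows` data and the `to_ndjson` helper are hypothetical names made up for this example, not part of the question. It relies on `json.dumps` writing tuples as JSON arrays, and emits one JSON document per line.

```python
import json

# Hypothetical data standing in for whatever the Python side holds.
rows = [[(1, 2), (3, 4), (5, 6)], [(1, 2), (3, 4), (5, 7)]]

def to_ndjson(records):
    """Serialize each record as one compact JSON document per line."""
    # json.dumps writes tuples as JSON arrays, so (1, 2) becomes [1,2].
    return "\n".join(json.dumps(rec, separators=(",", ":")) for rec in records)

print(to_ndjson(rows))
# [[1,2],[3,4],[5,6]]
# [[1,2],[3,4],[5,7]]
```

Text written this way is already valid JSON, so no bracket substitution would be needed on the R side.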

Lacking that, if the strings only ever contain Python lists and tuples (of numbers, as in the example), you can replace the () with [] using gsub and parse the result as JSON:

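# Translate each character in `from` to the character at the same position in `to`.
tr <- function(x, from="()", to="[]") {
  # Pair up the characters: c("(", "["), c(")", "]")
  chrs <- Map(c, strsplit(from, "")[[1]], strsplit(to, "")[[1]])
  # Apply one literal (fixed = TRUE) substitution per pair, in sequence
  Reduce(function(txt, chr) gsub(chr[1], chr[2], txt, fixed = TRUE), chrs, init = x)
}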

tr("[(1,2),(3,4),(5,6)]")
# [1] "[[1,2],[3,4],[5,6]]"

jsonlite::parse_json(tr("[(1,2),(3,4),(5,6)]"))
# [[1]]
# [[1]][[1]]
# [1] 1
# [[1]][[2]]
# [1] 2
# [[2]]
# [[2]][[1]]
# [1] 3
# [[2]][[2]]
# [1] 4
# [[3]]
# [[3]][[1]]
# [1] 5
# [[3]][[2]]
# [1] 6

The tr function works with vectors of strings as well:

tr(c("[(1,2),(3,4),(5,6)]", "[(1,2),(3,4),(5,7)]"))
# [1] "[[1,2],[3,4],[5,6]]" "[[1,2],[3,4],[5,7]]"

To parse such a vector with jsonlite, however, you will need stream_in instead, since the input is effectively NDJSON (newline-delimited JSON), and you need simplifyVector = FALSE to keep the nested lists from being simplified:

vec <- c("[(1,2),(3,4),(5,6)]", "[(1,2),(3,4),(5,7)]")
tr(vec)
# [1] "[[1,2],[3,4],[5,6]]" "[[1,2],[3,4],[5,7]]"
out <- jsonlite::stream_in(textConnection(tr(vec)), simplifyVector = FALSE)
#  Imported 2 records. Simplifying...
str(out)
# List of 2
#  $ :List of 3
#   ..$ :List of 2
#   .. ..$ : int 1
#   .. ..$ : int 2
#   ..$ :List of 2
#   .. ..$ : int 3
#   .. ..$ : int 4
#   ..$ :List of 2
#   .. ..$ : int 5
#   .. ..$ : int 6
#  $ :List of 3
#   ..$ :List of 2
#   .. ..$ : int 1
#   .. ..$ : int 2
#   ..$ :List of 2
#   .. ..$ : int 3
#   .. ..$ : int 4
#   ..$ :List of 2
#   .. ..$ : int 5
#   .. ..$ : int 7

Upvotes: 1

akrun

Reputation: 887571

In R terms, the string represents a nested list. One option is to evaluate it with reticulate:

library(reticulate)
tmp <- "[(1,2),(3,4),(5,6)]"
py_run_string(paste0('out=', tmp))$out

Output:

#[[1]]
#[[1]][[1]]
#[1] 1

#[[1]][[2]]
#[1] 2


#[[2]]
#[[2]][[1]]
#[1] 3

#[[2]][[2]]
#[1] 4


#[[3]]
#[[3]][[1]]
#[1] 5

#[[3]][[2]]
#[1] 6

Upvotes: 0
