S3AN556

Reputation: 79

R recursively parse parent list attributes

I'm having a hard time returning a list with all the nested attributes of the list structure below. I believe I need to apply a function recursively until I reach the bottom of each list.

Data

identifier = list(
  "000000001"
)

attr(identifier, ",scheme") <- "http://example.com"

member <- list(
  "A Member"
)

attr(member, ",dimension") <- "MemberAxis"

segment <- list(
  Imember = member,
  Imember = member,
  Imember = member
)

attr(segment, "$names") <- c("member", "member", "member")

node_start = list(
  identifier = identifier,
  segment = segment
)

attr(node_start, "$names") <- c("identifier","member")


# the list object I'd like to parse all nested attributes from
xml_list <- list(
  node_start = node_start
)

# I usually start by ignoring the first node and assigning node_start
# first node has no attributes and is part of a list of similarly nested lists
node_start <- xml_list[[1]]

The data is XML that has been read and then converted to a list with xml2::as_list(). The number of nested nodes varies with the file being read, which is why I believe I need a recursive function.

I've tried rapply, which reaches the bottom node but cannot look back at the parent node's attributes.

Rapply

node_recursion <- rapply(node_start, attributes, how = "list", classes = "list")
node_recursion <- rapply(node_start, function(y){
  message(y)
  attributes(y)[[1]]
}, how = "list", classes = "list")
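A minimal sketch of why the parent attributes are out of reach, assuming a cut-down, one-member version of the data above: rapply() only passes non-list leaves to its function, so the lists that carry the attributes are never handed over. The `seen` vector is a hypothetical helper used only to record what the function receives.

```r
# Cut-down version of the data above (assumption: one member is enough to show the effect)
member <- list("A Member")
attr(member, ",dimension") <- "MemberAxis"
segment <- list(member = member)

# `seen` is a hypothetical helper that records the class of each value passed to f
seen <- character(0)
invisible(rapply(segment, function(y) {
  seen <<- c(seen, class(y))
  y
}, how = "list"))

seen  # only "character": the leaf string, never the enclosing lists
```

Because the function never receives `member` or `segment` themselves, calling attributes() inside it cannot return ",dimension".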

If it's a list, keep applying a function

I've tried a number of different loops but I'm having a hard time making my loops recursive.

node_recursion <- function(node_start){
  n <- 1
  l <- length(node_start)
  
  for (i in 1:l) {
    if(is.list(node_start[[l]])){
      attributes(node_start[[l]])
      n <- n + 1
      message("added n")
    }else{
      message("hit return")
      return(node_recursion(node_start[[l]][[n]])) 
    } 
  }
}

node_recursion(node_start)
# contains another layer of data not available above
node_recursion <- function(node, l){
  
  for (i in 1:l) {
    
    s <- node_attribute[[1]][[x]][[1]][[i]]
    z <- 1
    while (is.list(s)) {
      tryCatch(
        expr = {
          
          n <- attributes(s[[z]]) %>% as.data.frame()
          z <- z + 1
          
        }, error = function(e){
          
          message(paste("layer", z, "break"))
          break
          
        })
    }
    
  }
  
}

Hard to find good examples to work off

Most of the documentation on recursive functions that I can find on Google uses basic factorial examples, which I'm having a hard time adapting to my use case.

Desired Output

Ideally I'd just like to unlist the entire node while keeping attribute names.

list(
  scheme = "http://example.com",
  dimension = "MemberAxis",
  dimension = "MemberAxis",
  ...etc
)

My Solution

It works, but I'm not sure how else to go about this. It returns all attributes along with their names.

attributes_list <- list()
z <- 1
for (i in node_start) {
  nested_attributes <- lapply(node_start, attributes)
  node_start <- unlist(node_start, recursive = FALSE)
  attributes_list[[z]] <- nested_attributes
  z <- z + 1
}
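The loop above makes one pass per top-level element, which happens to match the nesting depth of this example. A hedged sketch of the same level-by-level idea that instead repeats until no nested lists remain is shown here; collect_by_level is a hypothetical name, and the data is rebuilt from the question without the "$names" attributes.

```r
# Data rebuilt from the question (assumption: "$names" attributes omitted for brevity)
identifier <- list("000000001")
attr(identifier, ",scheme") <- "http://example.com"
member <- list("A Member")
attr(member, ",dimension") <- "MemberAxis"
segment <- list(Imember = member, Imember = member, Imember = member)
node_start <- list(identifier = identifier, segment = segment)

# Hypothetical helper: record the attributes of every element at each
# nesting level, then flatten by one level, until no element is a list
collect_by_level <- function(node) {
  out <- list()
  while (length(node) > 0 && any(vapply(node, is.list, logical(1)))) {
    out[[length(out) + 1]] <- lapply(node, attributes)
    node <- unlist(node, recursive = FALSE)
  }
  out
}

attributes_list <- collect_by_level(node_start)
length(attributes_list)  # one entry per nesting level that contained lists
```

Each entry of attributes_list holds the attributes found at one depth, so the number of passes follows the depth of the data rather than the number of top-level elements.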

Upvotes: 1

Views: 105

Answers (1)

Onyambu

Reputation: 79228

One way you could solve this:

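# If the first element is not a list, x is a bottom-level node: return its attributes.
# Otherwise recurse into each child and flatten the results by one level.
# fun is called by name because ?Recall warns that Recall() will not work
# correctly when passed as a function argument to the apply family.
fun <- function(x){
  if(!is.list(x[[1]])) attributes(x)
  else purrr::flatten(lapply(x, fun))
}

fun(xml_list)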
$`,scheme`
[1] "http://example.com"

$`,dimension`
[1] "MemberAxis"

$`,dimension`
[1] "MemberAxis"

$`,dimension`
[1] "MemberAxis"

Note that if you do not want the leading , in the names, you should remove it when building your list: you created xml_list with calls of the form attr(name, ',att_name').
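For reference, the same idea can be sketched in base R without purrr. This is only an illustration under stated assumptions, not the method above: collect_attrs is a hypothetical name, the data is rebuilt from the question without the "$names" attributes, and the final sub() call strips the leading comma from the attribute names.

```r
# Data rebuilt from the question (assumption: "$names" attributes omitted for brevity)
identifier <- list("000000001")
attr(identifier, ",scheme") <- "http://example.com"
member <- list("A Member")
attr(member, ",dimension") <- "MemberAxis"
segment <- list(Imember = member, Imember = member, Imember = member)
node_start <- list(identifier = identifier, segment = segment)
xml_list <- list(node_start = node_start)

# Hypothetical helper: collect the attributes of every bottom-level list
collect_attrs <- function(x) {
  # Non-list values and empty lists contribute nothing
  if (!is.list(x) || length(x) == 0) return(list())
  if (!is.list(x[[1]])) {
    # Bottom-level node: keep its attributes, dropping any "names" attribute
    a <- attributes(x)
    return(a[setdiff(names(a), "names")])
  }
  # Recurse into each child; unname() stops c() from prefixing the
  # results with the child names, and do.call(c, ...) joins them
  do.call(c, unname(lapply(x, collect_attrs)))
}

res <- collect_attrs(xml_list)
names(res) <- sub("^,", "", names(res))  # drop the leading comma
names(res)
```

Unlike the version above, this sketch returns an empty list for empty or non-list input instead of raising an error, which may or may not be what you want for malformed files.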

Upvotes: 1
