Bristle

Reputation: 77

Subset (list of lists) nested Lists

I am trying to subset thead/tbody without directly calling rowlist$td$list$item$table$thead or rowlist[["td"]][["list"]][["item"]][["table"]][["thead"]]. The expression unlist(rowlist, use.names = FALSE)[grepl("tbody", names(unlist(rowlist)))] serves my purpose, except that I need the result as multiple rows (e.g. the two tr's in tbody). I could split it, but that seems counterintuitive. I know there should be a better way to work with HTML/XML, but this is what I've got for now.

str(rowlist)
List of 1
 $ td:List of 1
  ..$ list:List of 1
  .. ..$ item:List of 1
  .. .. ..$ table:List of 2
  .. .. .. ..$ thead:List of 1
  .. .. .. .. ..$ tr:List of 7
  .. .. .. .. .. ..$ th:List of 1
  .. .. .. .. .. .. ..$ : chr "Test"
  .. .. .. .. .. ..$ th:List of 1
  .. .. .. .. .. .. ..$ : chr "Outcome"
  .. .. .. .. .. ..$ th:List of 1
  .. .. .. .. .. .. ..$ : chr "Subset"
  .. .. .. .. .. ..$ th:List of 1
  .. .. .. .. .. .. ..$ : chr "Cups"
  .. .. .. .. .. ..$ th:List of 1
  .. .. .. .. .. .. ..$ : chr "Bowls"
  .. .. .. .. .. ..$ th:List of 1
  .. .. .. .. .. .. ..$ : chr "Plates"
  .. .. .. .. .. ..$ th:List of 1
  .. .. .. .. .. .. ..$ : chr "Jars"
  .. .. .. ..$ tbody:List of 2
  .. .. .. .. ..$ tr:List of 7
  .. .. .. .. .. ..$ td:List of 1
  .. .. .. .. .. .. ..$ : chr "test1"
  .. .. .. .. .. ..$ td:List of 1
  .. .. .. .. .. .. ..$ : chr "High"
  .. .. .. .. .. ..$ td:List of 1
  .. .. .. .. .. .. ..$ : chr "Low"
  .. .. .. .. .. ..$ td:List of 1
  .. .. .. .. .. .. ..$ : chr "Gold"
  .. .. .. .. .. ..$ td:List of 1
  .. .. .. .. .. .. ..$ : chr "Blue"
  .. .. .. .. .. ..$ td:List of 1
  .. .. .. .. .. .. ..$ : chr "Green"
  .. .. .. .. .. ..$ td:List of 1
  .. .. .. .. .. .. ..$ : chr "red"
  .. .. .. .. .. ..- attr(*, "ID")= chr "id_511"
  .. .. .. .. ..$ tr:List of 7
  .. .. .. .. .. ..$ td:List of 1
  .. .. .. .. .. .. ..$ : chr "test2"
  .. .. .. .. .. ..$ td:List of 1
  .. .. .. .. .. .. ..$ : chr "Low"
  .. .. .. .. .. ..$ td:List of 1
  .. .. .. .. .. .. ..$ : chr "High"
  .. .. .. .. .. ..$ td:List of 1
  .. .. .. .. .. .. ..$ : chr "Pink"
  .. .. .. .. .. ..$ td:List of 1
  .. .. .. .. .. .. ..$ : chr "Blue"
  .. .. .. .. .. ..$ td:List of 1
  .. .. .. .. .. .. ..$ : chr "Purple"
  .. .. .. .. .. ..$ td: list()
  .. .. .. .. .. ..- attr(*, "ID")= chr "id_512"
  .. ..- attr(*, "styleCode")= chr "none"

The list looks like this:

rowlist<-list(td = structure(list(list = structure(list(item = list(table = list(
  thead = list(tr = list(
    th = list("Test"), th = list("Outcome"), th = list("Set"), th = list("Cups"), th = list("Bowls"), th = list( "Plates"), th = list("Jars"))), 
  tbody = list(tr = structure(
    list(td = list("test1"), td = list("High"), td = list("Low"), td = list("Gold"), td = list("Blue"), td = list("Green"), td = list("Red")), ID = "id_511"), 
    tr = structure(
      list(td = list("test2"), td = list("Low"), td = list("High"), td = list("Pink"), td = list("Blue"), td = list("Purple"), td = list()), ID = "id_512"))))), styleCode = "none")), colspan = "20"))

Upvotes: 2

Views: 143

Answers (1)

Joris C.

Reputation: 6234

If the object has to be handled as a nested list, one approach is to use rrapply() from the rrapply package (an extension of base rapply()):

library(rrapply)  ## v1.2.1

out <- rrapply(
  rowlist,
  classes = "list",
  condition = function(x, .xname) .xname %in% c("thead", "tbody"),
  how = "flatten"
)

str(out, list.len = 2)
#> List of 2
#>  $ thead:List of 1
#>   ..$ tr:List of 7
#>   .. ..$ th:List of 1
#>   .. .. ..$ : chr "Test"
#>   .. ..$ th:List of 1
#>   .. .. ..$ : chr "Outcome"
#>   .. .. [list output truncated]
#>  $ tbody:List of 2
#>   ..$ tr:List of 7
#>   .. ..$ td:List of 1
#>   .. .. ..$ : chr "test1"
#>   .. ..$ td:List of 1
#>   .. .. ..$ : chr "High"
#>   .. .. [list output truncated]
#>   .. ..- attr(*, "ID")= chr "id_511"
#>   ..$ tr:List of 7
#>   .. ..$ td:List of 1
#>   .. .. ..$ : chr "test2"
#>   .. ..$ td:List of 1
#>   .. .. ..$ : chr "Low"
#>   .. .. [list output truncated]
#>   .. ..- attr(*, "ID")= chr "id_512"

Here, the condition function selects only nodes named thead or tbody; how = "flatten" returns the matched nodes in a flat list (how = "prune" would instead drop the other nodes while keeping the original list structure); and classes = "list" ensures that intermediate list nodes are evaluated rather than skipped (as they would be with base rapply()).
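For comparison, the same kind of name-based selection can be sketched in base R without any extra package, followed by one possible way to get the tbody rows as separate rows of a matrix. This is only an illustrative sketch: find_nodes() and row_values() are hypothetical helper names (they are not part of base R or rrapply), and the rowlist below is a trimmed-down stand-in with the same shape as the one in the question, using three columns instead of seven.

```r
# Trimmed-down stand-in for the question's `rowlist` (assumed shape, 3 columns).
rowlist <- list(td = list(list = list(item = list(table = list(
  thead = list(tr = list(th = list("Test"), th = list("Outcome"), th = list("Jars"))),
  tbody = list(
    tr = structure(list(td = list("test1"), td = list("High"), td = list("Red")), ID = "id_511"),
    tr = structure(list(td = list("test2"), td = list("Low"), td = list()), ID = "id_512")
  )
)))))

# Hypothetical helper: walk the nested list and collect every element whose
# name is in `targets`, without descending further into matched elements.
find_nodes <- function(x, targets) {
  out <- list()
  if (!is.list(x)) return(out)
  nms <- names(x)
  if (is.null(nms)) nms <- rep("", length(x))
  for (i in seq_along(x)) {
    if (nms[i] %in% targets) {
      out <- c(out, setNames(list(x[[i]]), nms[i]))
    } else {
      out <- c(out, find_nodes(x[[i]], targets))
    }
  }
  out
}

# Hypothetical helper: turn one tr node into a character vector,
# using NA for empty cells such as `td = list()`.
row_values <- function(tr) {
  vapply(
    tr,
    function(cell) if (length(cell)) as.character(cell[[1]]) else NA_character_,
    character(1),
    USE.NAMES = FALSE
  )
}

nodes  <- find_nodes(rowlist, c("thead", "tbody"))
header <- row_values(nodes$thead$tr)

# One matrix row per tr in tbody; unname() avoids duplicated "tr" row names.
body <- do.call(rbind, unname(lapply(nodes$tbody, row_values)))
colnames(body) <- header
body
```

Because elements are extracted with [[, attributes such as ID stay attached to each tr (e.g. attr(nodes$tbody[[2]], "ID")), so they can still be used as row identifiers if needed.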

Upvotes: 1
