Michael Whitaker

Reputation: 121

Ragged list or data frame to JSON

I am trying to create a ragged list in R that corresponds to the D3 tree structure of flare.json. My data is in a data.frame:

path <- data.frame(P1=c("direct","direct","organic","direct"),
P2=c("direct","direct","end","end"),
P3=c("direct","organic","",""),
P4=c("end","end","",""), size=c(5,12,23,45))

path
       P1     P2      P3  P4 size
1  direct direct  direct end    5
2  direct direct organic end   12
3 organic    end               23
4  direct    end               45

but it could also be a list or reshaped if necessary:

path <- list()
path[[1]] <- list(name=c("direct","direct","direct","end"),size=5)
path[[2]] <- list(name=c("direct","direct","organic","end"), size=12)
path[[3]] <- list(name=c("organic", "end"), size=23)
path[[4]] <- list(name=c("direct", "end"), size=45)

The desired output is:

rl <- list()
rl <- list(name="root", children=list())
rl$children[1] <- list(list(name="direct", children=list()))
rl$children[[1]]$children[1] <- list(list(name="direct", children=list()))
rl$children[[1]]$children[[1]]$children[1] <- list(list(name="direct", children=list()))
rl$children[[1]]$children[[1]]$children[[1]]$children[1] <- list(list(name="end", size=5))

rl$children[[1]]$children[[1]]$children[2] <- list(list(name="organic", children=list()))
rl$children[[1]]$children[[1]]$children[[2]]$children[1] <- list(list(name="end",    size=12))

rl$children[[1]]$children[2] <- list(list(name="end", size=23))

rl$children[2] = list(list(name="organic", children=list()))
rl$children[[2]]$children[1] <- list(list(name="end", size=45))

So when I print to json it's:

require(RJSONIO)
cat(toJSON(rl, pretty=TRUE))

 {
"name" : "root",
"children" : [
    {
        "name" : "direct",
        "children" : [
            {
                "name" : "direct",
                "children" : [
                    {
                        "name" : "direct",
                        "children" : [
                            {
                                "name" : "end",
                                "size" : 5
                            }
                        ]
                    },
                    {
                        "name" : "organic",
                        "children" : [
                            {
                                "name" : "end",
                                "size" : 12
                            }
                        ]
                    }
                ]
            },
            {
                "name" : "end",
                "size" : 23
            }
        ]
    },
    {
        "name" : "organic",
        "children" : [
            {
                "name" : "end",
                "size" : 45
            }
        ]
    }
]
}

I am having a hard time wrapping my head around the recursive steps needed to create this list structure in R. In JS I can easily walk the nodes and, at each one, decide whether to add a new node or keep moving down the tree, using push as needed, e.g. new = {"name": node, "children": []}; for a branch or new = {"name": node, "size": size}; for a leaf, as in this example. I tried to split the data.frame as in this example:

 makeList<-function(x){
   if(ncol(x)>2){
      listSplit<-split(x,x[1],drop=T)
      lapply(names(listSplit),function(y){list(name=y,children=makeList(listSplit[[y]]))})
   } else {
      lapply(seq(nrow(x[1])),function(y){list(name=x[,1][y],size=x[,2][y])})
   }
 }

 jsonOut<-toJSON(list(name="root",children=makeList(path)))

but it gives me an error

 Error: evaluation nested too deeply: infinite recursion / options(expressions=)?
 Error during wrapup: evaluation nested too deeply: infinite recursion / options(expressions=)?

Upvotes: 0

Views: 766

Answers (2)

AmeliaBR

Reputation: 27544

The function given in the linked Q&A is essentially what you need, but it fails on your data set because some rows have empty values in the later columns. (The copy in your question also passes x rather than x[-1] to split(), so the first column is never dropped and the recursion never terminates, which is what produces the error you posted.) Instead of blindly recursing until you run out of columns, check for your "end" value and use it to switch to making leaves:

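makeList <- function(x) {
    # Split the remaining columns by the value in the first column
    listSplit <- split(x[-1], x[1], drop = TRUE)
    lapply(names(listSplit), function(y) {
        if (y == "end") {
            # "end" marks a leaf: emit name and size instead of recursing.
            # With one row per path, rows has a single row here.
            rows <- listSplit[[y]]
            l <- list()
            for (i in seq_len(nrow(rows))) {
                l <- c(l, list(name = y, size = rows[i, "size"]))
            }
            l
        } else {
            # Any other value is a branch: recurse on the remaining columns
            list(name = y, children = makeList(listSplit[[y]]))
        }
    })
}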
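
For reference, here is a minimal usage sketch of the same idea on the data from the question. It assumes the original path data frame (empty strings rather than NA) built with stringsAsFactors = FALSE, and it assumes every path is unique, so the leaf case simply takes the first row. The function is repeated in condensed form only so that the snippet runs on its own.

```r
# Data from the question; stringsAsFactors = FALSE is an assumption
# made here so the columns are plain character vectors.
path <- data.frame(
  P1 = c("direct", "direct", "organic", "direct"),
  P2 = c("direct", "direct", "end", "end"),
  P3 = c("direct", "organic", "", ""),
  P4 = c("end", "end", "", ""),
  size = c(5, 12, 23, 45),
  stringsAsFactors = FALSE
)

# Condensed variant of the function above (assumes unique paths,
# so an "end" group always holds exactly one row).
makeList <- function(x) {
  listSplit <- split(x[-1], x[1], drop = TRUE)
  lapply(names(listSplit), function(y) {
    rows <- listSplit[[y]]
    if (y == "end") {
      list(name = y, size = rows[1, "size"])        # leaf
    } else {
      list(name = y, children = makeList(rows))     # branch
    }
  })
}

tree <- list(name = "root", children = makeList(path))

# Inspect the nesting without needing a JSON package
str(tree, max.level = 3)
```

The resulting tree can then be passed to toJSON() as in the question. Because recursion stops at "end", the trailing empty columns are never used as split keys.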

Upvotes: 1

BrodieG

Reputation: 52637

I believe this does what you want, though it has some limitations. In particular, it assumes that every branch in your network is unique (i.e. there can't be two rows in your data frame that are equal in every column other than size):

df.split <- function(p.df) {
  p.lst.tmp <- unname(split(p.df, p.df[, 1]))
  p.lst <- lapply(
    p.lst.tmp, 
    function(x) {
      if(ncol(x) == 2L && nrow(x) == 1L) {
        return(list(name=x[1, 1], size=unname(x[, 2])))
      } else if (isTRUE(is.na(unname(x[ ,2])))) {
        return(list(name=x[1, 1], size=unname(x[, ncol(x)])))
      }
      list(name=x[1, 1], children=df.split(x[, -1, drop=FALSE]))
    }
  )
  p.lst
}
all.equal(rl, df.split(path)[[1]])
# [1] TRUE

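Note that your rl has the sizes of the two short paths swapped relative to path: rl gives the organic/end leaf a size of 45, while path gives it 23. I had to fix rl to get the result above. I also modified your path data.frame slightly, adding a root column, replacing the empty strings with NA, and turning off stringsAsFactors: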

path <- data.frame(
  root=rep("root", 4),
  P1=c("direct","direct","organic","direct"),
  P2=c("direct","direct","end","end"),
  P3=c("direct","organic",NA,NA),
  P4=c("end","end",NA,NA), 
  size=c(5,12,23,45), 
  stringsAsFactors=FALSE
)

WARNING: I haven't tested this with other structures, so it's possible it will hit corner cases that you'll need to debug.
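
One way to look for such corner cases is to flatten the nested list back into one row per leaf and compare the result with the input paths. The following is only a sketch in base R: flatten_tree is a hypothetical helper name, not part of either answer, and it assumes the flare-style shape used here (branches carry children, leaves carry size).

```r
# Flatten a flare-style nested list back into one row per leaf.
# flatten_tree is a hypothetical helper introduced for illustration.
flatten_tree <- function(node, prefix = character(0)) {
  here <- c(prefix, node$name)
  if (is.null(node$children)) {
    # Leaf: report the full path and its size
    return(data.frame(path = paste(here, collapse = "/"),
                      size = node$size,
                      stringsAsFactors = FALSE))
  }
  # Branch: flatten each child and stack the results
  do.call(rbind, lapply(node$children, flatten_tree, prefix = here))
}

# Hand-built tree for the two short paths in the question's data
tree <- list(name = "root", children = list(
  list(name = "direct",  children = list(list(name = "end", size = 45))),
  list(name = "organic", children = list(list(name = "end", size = 23)))
))

flat <- flatten_tree(tree)
print(flat)
```

Comparing flat against the paths and sizes in the original data frame should show any leaf that was dropped, duplicated, or attached to the wrong branch.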

Upvotes: 0
