Reputation: 726
I have a list of lists (qlist) in which each element contains a tags list of varying length (see the example below). I'd like to convert selected elements of it (tags and question_id, skipping creation_date) to a data.frame with one row per tag: the tag in the first column and the corresponding question_id in the second.
qlist <- list()
qlist[[1]] <- list(tags = list( "r", "parallel-processing"), creation_date = "1459613802",
question_id = "36375667")
qlist[[2]] <- list(tags = list( "r"), creation_date = "1459613803", question_id = "36375668")
I've managed to do so with the following code:
library(plyr)
df_qst_tags <- ldply(qlist, function(x) {
  as.data.frame(cbind(tag = unlist(x$tags), question_id = x$question_id))
}, .progress = "win")
The result is as expected: tags in the first column, with the corresponding question_id in the second column.
> df_qst_tags
                  tag question_id
1                   r    36375667
2 parallel-processing    36375667
3                   r    36375668
Unfortunately, my qlist is very large and this code is too slow. How can I rewrite the solution in a more efficient way?
Upvotes: 1
Views: 912
Reputation: 46856
Extract the tags and find their geometry (the number of tags per question):
> tags = lapply(qlist, "[[", "tags")
> lengths(tags)
[1] 2 1
You'll unlist(tags) to get a vector of individual tags. Now extract the other elements, e.g., question_id, and replicate each one according to the tags geometry, along the lines of
data.frame(tag = unlist(tags, use.names = FALSE),
           question_id = rep(
               vapply(qlist, "[[", character(1), "question_id"),
               lengths(tags)))
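Putting the pieces together, here is a minimal end-to-end sketch using the sample qlist from the question. The intermediate names n_tags and ids are only illustrative, and stringsAsFactors = FALSE is an assumption on my part (it keeps both columns as character regardless of R version); drop it if you want factors.

```r
# Sample input copied from the question
qlist <- list(
  list(tags = list("r", "parallel-processing"),
       creation_date = "1459613802", question_id = "36375667"),
  list(tags = list("r"),
       creation_date = "1459613803", question_id = "36375668")
)

# One list of tags per question, and how many tags each question has
tags <- lapply(qlist, "[[", "tags")
n_tags <- lengths(tags)

# One question_id per question, checked to be a length-1 character value
ids <- vapply(qlist, "[[", character(1), "question_id")

# Flatten the tags and repeat each id once per tag of its question
df_qst_tags <- data.frame(
  tag = unlist(tags, use.names = FALSE),
  question_id = rep(ids, n_tags),
  stringsAsFactors = FALSE  # assumption: keep columns as character
)
print(df_qst_tags)
```

The point of this approach is that the data.frame is built once from whole vectors, rather than once per element of qlist and then row-bound together, which is where most of the time goes in the ldply version.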
Upvotes: 3