Reputation: 7127
I am trying to process a number of lists and I am losing the names
of some of the list elements.
The list looks like:
> myLists2
[[1]]
NULL
[[2]]
[[2]][[1]]
title company date_range location
"Founder | Co-CEO" "someCompany" "ene. de 2018 \023 actualidad" "Europe"
description li_company_url
"some description 1" "https://www.google.com"
[[2]][[2]]
title company date_range location
"Another title" "someCompany2" "ene. de 2019 \023 actualidad" "USA"
description li_company_url
"Another Description" "https://www.yahoo.com"
[[2]][[3]]
title company date_range location
"Another title 3" "Another company 3" "sept. de 2018 \023 actualidad" "Europe"
description li_company_url
"Another description 3" "https://www.stackexchange.com"
Where if I run names(myLists2[[2]][[1]])
I get the following:
[1] "title" "company" "date_range" "location" "description" "li_company_url"
The number of names can vary slightly between lists, and I would like to create a new column in a data.frame that holds these names.
Running:
hh <- myLists2[[2]] %>% data.frame() %>% rownames_to_column("tag")
Gives me a nice data frame in which rownames_to_column() saves the row names as a column. However, it throws an error when the list elements have different lengths:
Error in (function (..., row.names = NULL, check.rows = FALSE, check.names = TRUE, : arguments imply differing number of rows: 5, 6
One solution I found was to use bind_rows(). Running:
myLists2[[2]] %>% bind_rows()
Gives me a tibble, but I lose the names from the lists. Running:
myLists2[[2]] %>% bind_rows(.id = "myID")
Does not seem to solve the issue either since it just gives me a new column from 1 to 3.
My question is: how can I use bind_rows() (which is not sensitive to differing lengths) and also keep the names from the lists as a column?
Data:
myLists2 <- list(NULL, list(c(title = "Founder | Co-CEO", company = "someCompany",
date_range = "ene. de 2018 \023 actualidad", location = "Europe",
description = "some description 1", li_company_url = "https://www.google.com"
), c(title = "Another title", company = "someCompany2", date_range = "ene. de 2019 \023 actualidad",
location = "USA", description = "Another Description", li_company_url = "https://www.yahoo.com"
), c(title = "Another title 3", company = "Another company 3",
date_range = "sept. de 2018 \023 actualidad", location = "Europe",
description = "Another description 3", li_company_url = "https://www.stackexchange.com"
)))
EDIT: (Adding a new list)
myNewList <- list(list(c(title = "Founder | Co-CEO", company = "some company",
date_range = "ene. de 2018 \023 actualidad", location = "Europe",
description = "some description",
li_company_url = "https://www.google.com"
), c(title = "some thing here", company = "some company",
date_range = "ene. de 2019 \023 actualidad", location = "USA",
description = "another description",
li_company_url = "https://www.yahoo.com")
), list(c(title = "CEO", company = "another company",
date_range = "2012 \023 actualidad", description = "some other description",
li_company_url = ""), c(title = "job title",
company = "company name", date_range = "ene. de 2005 \023 actualidad",
location = "Europe", description = "company description",
li_company_url = "https://www.yahoo.com"),
c(title = "job title 2", company = "company name", date_range = "1995 \023 actualidad",
description = "description",
li_company_url = ""), c(title = "job title",
company = "company name", date_range = "1992 \023 1995",
location = "USA", description = "soem company description",
li_company_url = ""), c(title = "company title", company = "company name",
date_range = "1990 \023 1992", description = "Another description",
li_company_url = "")), NULL)
These show the problems I am running into:
map(myNewList, ~data.frame(.x))
map(myNewList[1], ~data.frame(.x)) # runs okay and I keep the names
map(myNewList[2], ~data.frame(.x)) # errors
map(myNewList, ~bind_rows(.x)) # runs okay but I lose the names
Upvotes: 3
Views: 1322
Reputation: 3266
Another possibility using only purrr, dplyr and tibble:
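library(purrr)
library(dplyr)
library(tibble)
# For each non-NULL element, turn every named vector into a data frame
# with its names in a "tag" column, then full-join them on "tag"
myNewList %>%
  map_if(~!is.null(.),
         function(mylist) map(mylist,
                              ~data.frame(.x) %>%
                                rownames_to_column("tag")) %>%
           reduce(full_join, by = "tag"))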
[[1]]
tag .x.x .x.y
1 title Founder | Co-CEO some thing here
2 company some company some company
3 date_range ene. de 2018 \023 actualidad ene. de 2019 \023 actualidad
4 location Europe USA
5 description some description another description
6 li_company_url https://www.google.com https://www.yahoo.com
[[2]]
tag .x.x .x.y .x.x.x .x.y.y .x
1 title CEO job title job title 2 job title company title
2 company another company company name company name company name company name
3 date_range 2012 \023 actualidad ene. de 2005 \023 actualidad 1995 \023 actualidad 1992 \023 1995 1990 \023 1992
4 description some other description company description description soem company description Another description
5 li_company_url https://www.yahoo.com
6 location <NA> Europe <NA> USA <NA>
[[3]]
NULL
Or removing empty lists:
inter_list <- map(myNewList, function(mylist) map(mylist, ~data.frame(.x) %>% rownames_to_column("tag")))
nullw <- which(map_lgl(inter_list, ~length(.x)==0))
if(length(nullw)!=0) inter_list <- inter_list[-nullw]
map(inter_list, ~reduce(.x, full_join, by = "tag"))
[[1]]
tag .x.x .x.y
1 title Founder | Co-CEO some thing here
2 company some company some company
3 date_range ene. de 2018 \023 actualidad ene. de 2019 \023 actualidad
4 location Europe USA
5 description some description another description
6 li_company_url https://www.google.com https://www.yahoo.com
[[2]]
tag .x.x .x.y .x.x.x .x.y.y .x
1 title CEO job title job title 2 job title company title
2 company another company company name company name company name company name
3 date_range 2012 \023 actualidad ene. de 2005 \023 actualidad 1995 \023 actualidad 1992 \023 1995 1990 \023 1992
4 description some other description company description description soem company description Another description
5 li_company_url https://www.yahoo.com
6 location <NA> Europe <NA> USA <NA>
Upvotes: 0
Reputation: 887431
We could use map_if with data.table::transpose after doing the bind_rows:
library(purrr)
library(dplyr)
library(tibble)
library(data.table)
map_if(myNewList, .p = ~ length(.) > 0,
.f = ~bind_rows(.x) %>%
data.table::transpose(., keep.names = 'title') %>%
column_to_rownames('title'),
.else = ~ NA_character_)
-output
#[[1]]
# V1 V2
#title Founder | Co-CEO some thing here
#company some company some company
#date_range ene. de 2018 \023 actualidad ene. de 2019 \023 actualidad
#location Europe USA
#description some description another description
#li_company_url https://www.google.com https://www.yahoo.com
#[[2]]
# V1 V2 V3 V4
#title CEO job title job title 2 job title
#company another company company name company name company name
#date_range 2012 \023 actualidad ene. de 2005 \023 actualidad 1995 \023 actualidad 1992 \023 1995
#description some other description company description description soem company description
#li_company_url https://www.yahoo.com
#location <NA> Europe <NA> USA
# V5
#title company title
#company company name
#date_range 1990 \023 1992
#description Another description
#li_company_url
#location <NA>
#[[3]]
#[1] NA
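If the names should end up as a regular column rather than as row names, one possible variation is to reshape the bind_rows() result with tidyr. This is only a sketch: tidyr is not used in the original answer, and the `jobs` list, the `job` id column and the `tag`/`value` column names are made up for illustration.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data: two named vectors with different sets of names
jobs <- list(
  c(title = "CEO", company = "A"),
  c(title = "CTO", company = "B", location = "USA")
)

# bind_rows() turns each vector into a row; the names become column names.
# pivot_longer() then moves those column names into a "tag" column.
long <- jobs %>%
  bind_rows(.id = "job") %>%
  pivot_longer(-job, names_to = "tag", values_to = "value")

# Optionally spread back out to one column per vector (V1, V2, ...),
# keeping the names in the "tag" column
wide <- long %>%
  pivot_wider(names_from = job, values_from = value, names_prefix = "V")
```

Here `wide` has one row per name and one column per original vector, with NA where a vector lacked that name. To apply this to each element of myNewList, the pipeline could be wrapped in the same map_if() call as above.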
Upvotes: 1
Reputation: 39613
After trying many options, I found a rough-and-ready method to get what you want. It uses the rbind.fill() function from plyr, so be careful when loading the package, as it conflicts with dplyr. The main idea (which uses a loop) is to transform each vector into a one-row data frame, bind those rows together so that missing values are filled with NA (which is why rbind.fill() is used), and then transpose the result. The advantage is that the loop lets you handle the NULL elements with a conditional. Here is the code with the new data you shared:
library(plyr)
# Create a list to store the results
List <- list()
# Loop over the elements of myNewList
for (i in seq_along(myNewList)) {
  v <- length(myNewList[[i]])
  # NULL (empty) elements are stored as NA
  if (v == 0) {
    List[[i]] <- NA
  } else {
    # First turn each named vector into a one-row data frame,
    # which makes the vectors easy to bind
    O1 <- lapply(myNewList[[i]], function(x) as.data.frame(t(x)))
    # Bind all rows with rbind.fill, which fills missing variables with NA
    O2 <- do.call(rbind.fill, O1)
    # Finally transpose so that the names become the row names
    O3 <- as.data.frame(t(O2))
    # Save in List
    List[[i]] <- O3
  }
}
Output:
List
[[1]]
V1 V2
title Founder | Co-CEO some thing here
company some company some company
date_range ene. de 2018 \023 actualidad ene. de 2019 \023 actualidad
location Europe USA
description some description another description
li_company_url https://www.google.com https://www.yahoo.com
[[2]]
V1 V2 V3
title CEO job title job title 2
company another company company name company name
date_range 2012 \023 actualidad ene. de 2005 \023 actualidad 1995 \023 actualidad
description some other description company description description
li_company_url https://www.yahoo.com
location <NA> Europe <NA>
V4 V5
title job title company title
company company name company name
date_range 1992 \023 1995 1990 \023 1992
description soem company description Another description
li_company_url
location USA <NA>
[[3]]
[1] NA
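The same idea (align the vectors on their names and fill the gaps with NA) can also be sketched in base R, without plyr, while keeping the names as a column. The helper name `align_by_name` and the `tag` column are hypothetical, and the toy `jobs` list is only there to illustrate the behaviour.

```r
# Hypothetical helper: align named character vectors on the union of their
# names and return a data frame with the names in a "tag" column
align_by_name <- function(vectors) {
  # Mirror the loop above: empty (NULL) elements become NA
  if (length(vectors) == 0) return(NA)
  # Union of all names, in order of first appearance
  all_names <- unique(unlist(lapply(vectors, names)))
  # Indexing by name yields NA for names a vector does not have
  cols <- lapply(vectors, function(v) unname(v[all_names]))
  mat <- do.call(cbind, cols)  # character matrix, one column per vector
  colnames(mat) <- paste0("V", seq_along(cols))
  data.frame(tag = all_names, mat, stringsAsFactors = FALSE)
}

# Toy data with differing sets of names
jobs <- list(
  c(title = "CEO", company = "A", li_company_url = ""),
  c(title = "CTO", company = "B", location = "USA",
    li_company_url = "https://www.yahoo.com")
)
res <- align_by_name(jobs)
```

Applied to the data in the question, this would be called as lapply(myNewList, align_by_name).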
Upvotes: 1