Reputation: 1036
I have data in 2 columns (Region and GroupedCateg). Refer the dataframe below. I want to convert it into nested lists. I tried using group_by and do() function of dplyr and then convert into list but it didn't work.
df <- read.table(header = T,
text = '
Region GroupedCateg
Beja Alentejo
Evora Alentejo
Portalegre Alentejo
Faro Algarve
Aveiro Central
"Castelo Branco" Central
Coimbra Central
Leiria Central
Santarem Central
Acores Islands
Madeira Islands
Lisboa Lisbon
Setubal Lisbon
Braga North
Braganca North
"Viana do Castelo" North
"Vila Real" North
Porto Porto
')
Desired Output in lists. Region would be in names. Respective GroupedCateg in nested list
list(
list(
name = "Alentejo",
categories = list("Beja", "Evora", "Portalegre")
),
list(
name = "Algarve",
categories = list("Faro")
),
list(
name = "Central",
categories = list("Aveiro", "Castelo Branco", "Coimbra", "Leiria", "Santarem" )
),
list(
name = "North",
categories = list("Braga", "Braganca", "Viana do Castelo", "Vila Real")
),
list(
name = "Lisbon",
categories = list("Lisboa", "Setubal")
),
list(
name = "Islands",
categories = list("Acores", "Madeira")
),
list(
name = "Porto",
categories = list("Porto")
)
)
Upvotes: 1
Views: 167
Reputation: 56149
Using base split on column, then lapply to re-format as desired:
x <- split(df$Region, df$GroupedCateg)
res <- lapply(names(x), function(i){
list(name = i,
categories = as.list(x[[ i ]]))
})
Upvotes: 2
Reputation: 35554
You can use pmap()
in purrr
.
library(dplyr)
library(purrr)
x <- df %>%
group_by(GroupedCateg) %>%
summarise(Region = list(Region)) %>%
pmap(~ list(name = .x, categories = as.list(.y)))
The corresponding base
R version:
y <- apply(aggregate(Region ~ GroupedCateg, df, c),
1, function(y) list(name = y[[1]], categories = as.list(y[[2]])))
all.equal(x, y)
# [1] TRUE
Output
[[1]]
[[1]]$name
[1] "Alentejo"
[[1]]$categories
[[1]]$categories[[1]]
[1] "Beja"
[[1]]$categories[[2]]
[1] "Evora"
[[1]]$categories[[3]]
[1] "Portalegre"
[[2]]
[[2]]$name
[1] "Algarve"
[[2]]$categories
[[2]]$categories[[1]]
[1] "Faro"
[[3]]
[[3]]$name
[1] "Central"
[[3]]$categories
[[3]]$categories[[1]]
[1] "Aveiro"
[[3]]$categories[[2]]
[1] "Castelo Branco"
[[3]]$categories[[3]]
[1] "Coimbra"
[[3]]$categories[[4]]
[1] "Leiria"
[[3]]$categories[[5]]
[1] "Santarem"
[[4]]
[[4]]$name
[1] "Islands"
[[4]]$categories
[[4]]$categories[[1]]
[1] "Acores"
[[4]]$categories[[2]]
[1] "Madeira"
[[5]]
[[5]]$name
[1] "Lisbon"
[[5]]$categories
[[5]]$categories[[1]]
[1] "Lisboa"
[[5]]$categories[[2]]
[1] "Setubal"
[[6]]
[[6]]$name
[1] "North"
[[6]]$categories
[[6]]$categories[[1]]
[1] "Braga"
[[6]]$categories[[2]]
[1] "Braganca"
[[6]]$categories[[3]]
[1] "Viana do Castelo"
[[6]]$categories[[4]]
[1] "Vila Real"
[[7]]
[[7]]$name
[1] "Porto"
[[7]]$categories
[[7]]$categories[[1]]
[1] "Porto"
Upvotes: 3
Reputation: 4358
In Base-R
apply(aggregate(Region~GroupedCateg,df,c),1, function(x) list(name=x[1], category=as.list(x[2]$Region)))
[[1]]
[[1]]$name
[[1]]$name$GroupedCateg
[1] "Alentejo"
[[1]]$category
[[1]]$category[[1]]
[1] "Beja"
[[1]]$category[[2]]
[1] "Evora"
[[1]]$category[[3]]
[1] "Portalegre"
[[2]]
[[2]]$name
[[2]]$name$GroupedCateg
[1] "Algarve"
[[2]]$category
[[2]]$category[[1]]
[1] "Faro"
[[3]]
[[3]]$name
[[3]]$name$GroupedCateg
[1] "Central"
[[3]]$category
[[3]]$category[[1]]
[1] "Aveiro"
[[3]]$category[[2]]
[1] "Castelo Branco"
[[3]]$category[[3]]
[1] "Coimbra"
[[3]]$category[[4]]
[1] "Leiria"
[[3]]$category[[5]]
[1] "Santarem"
[[4]]
[[4]]$name
[[4]]$name$GroupedCateg
[1] "Islands"
[[4]]$category
[[4]]$category[[1]]
[1] "Acores"
[[4]]$category[[2]]
[1] "Madeira"
[[5]]
[[5]]$name
[[5]]$name$GroupedCateg
[1] "Lisbon"
[[5]]$category
[[5]]$category[[1]]
[1] "Lisboa"
[[5]]$category[[2]]
[1] "Setubal"
[[6]]
[[6]]$name
[[6]]$name$GroupedCateg
[1] "North"
[[6]]$category
[[6]]$category[[1]]
[1] "Braga"
[[6]]$category[[2]]
[1] "Braganca"
[[6]]$category[[3]]
[1] "Viana do Castelo"
[[6]]$category[[4]]
[1] "Vila Real"
[[7]]
[[7]]$name
[[7]]$name$GroupedCateg
[1] "Porto"
[[7]]$category
[[7]]$category[[1]]
[1] "Porto"
Upvotes: 1