Ochetski

Reputation: 115

How to map a dataframe with a list column (to be mapped) and an id in R

I want to map over a list column inside a data.frame, bind the resulting rows, and repeat an ID column alongside them, all in one step.

I was able to do each part manually, but not as a one-step process. Is there a function, or a specific way to use the mapping functions, that does this?

Data header:

data <- structure(list(mecanicas = list(structure(list(name = c("Campaign / Battle Card Driven", 
"Cooperative Play", "Grid Movement", "Hand Management", "Modular Board", 
"Role Playing"), objecttype = c("property", "property", "property", 
"property", "property", "property"), objectid = c("2018", "2023", 
"2676", "2040", "2011", "2028"), primarylink = c(0L, 0L, 0L, 
0L, 0L, 0L), itemstate = c("approved", "approved", "approved", 
"approved", "approved", "approved"), href = c("/boardgamemechanic/2018/campaign-battle-card-driven", 
"/boardgamemechanic/2023/cooperative-play", "/boardgamemechanic/2676/grid-movement", 
"/boardgamemechanic/2040/hand-management", "/boardgamemechanic/2011/modular-board", 
"/boardgamemechanic/2028/role-playing")), class = "data.frame", row.names = c(NA, 
6L)), structure(list(name = c("Action Point Allowance System", 
"Cooperative Play", "Hand Management", "Point to Point Movement", 
"Set Collection", "Trading"), objecttype = c("property", "property", 
"property", "property", "property", "property"), objectid = c("2001", 
"2023", "2040", "2078", "2004", "2008"), primarylink = c(0L, 
0L, 0L, 0L, 0L, 0L), itemstate = c("approved", "approved", "approved", 
"approved", "approved", "approved"), href = c("/boardgamemechanic/2001/action-point-allowance-system", 
"/boardgamemechanic/2023/cooperative-play", "/boardgamemechanic/2040/hand-management", 
"/boardgamemechanic/2078/point-point-movement", "/boardgamemechanic/2004/set-collection", 
"/boardgamemechanic/2008/trading")), class = "data.frame", row.names = c(NA, 
6L)), structure(list(name = c("Action Point Allowance System", 
"Auction/Bidding", "Card Drafting"), objecttype = c("property", 
"property", "property"), objectid = c("2001", "2012", "2041"), 
    primarylink = c(0L, 0L, 0L), itemstate = c("approved", "approved", 
    "approved"), href = c("/boardgamemechanic/2001/action-point-allowance-system", 
    "/boardgamemechanic/2012/auctionbidding", "/boardgamemechanic/2041/card-drafting"
    )), class = "data.frame", row.names = c(NA, 3L)), list()), 
    title = c("Gloomhaven", "Pandemic Legacy: Season 1", "Through the Ages: A New Story of Civilization", 
    "KLASK")), row.names = c(NA, -4L), class = c("tbl_df", "tbl", 
"data.frame"))

Structure:


  mecanicas            title                                        
  <list>               <chr>                                        
1 <data.frame [6 x 6]> Gloomhaven                                   
2 <data.frame [6 x 6]> Pandemic Legacy: Season 1                    
3 <data.frame [3 x 6]> Through the Ages: A New Story of Civilization
4 <list [0]>           KLASK                                        

The way I do it and want to simplify:

library('tidyverse')

### map and bind the rows
mechanics_binded <- map_dfr(data$mecanicas, bind_rows) 

### then count the mechanics for repetition
n_mecs <- lapply(data[['mecanicas']], nrow) %>% as.character() %>% as.numeric()

## (some lists can be empty, e.g. KLASK in the data sample; nrow() gives NULL for them, which ends up as NA)
n_mecs[is.na(n_mecs)] <- 0 


titles <- rep(data$title, n_mecs)

mechanics_binded$titles <- titles 

mechanics <- mechanics_binded[, c('name', 'titles')]

mechanics 

Desired result:

                            name                     title
1  Campaign / Battle Card Driven                Gloomhaven
2               Cooperative Play                Gloomhaven
3                  Grid Movement                Gloomhaven
4                Hand Management                Gloomhaven
5                  Modular Board                Gloomhaven
6                   Role Playing                Gloomhaven
7  Action Point Allowance System Pandemic Legacy: Season 1
8               Cooperative Play Pandemic Legacy: Season 1
9                Hand Management Pandemic Legacy: Season 1
10       Point to Point Movement Pandemic Legacy: Season 1

EDIT: The mecanicas column can contain an empty list as well; otherwise it's well structured.

EDIT2: Added one of the edge cases from the first edit, where a list is empty (this causes an error with the tidyverse solution). I was unable to reproduce the other error (with the data.table solution) without the whole dataset, so I'm sharing it here through a Dropbox link: https://www.dropbox.com/s/boh8k0epay4gedh/bgg_mechanics.RData?dl=0

Upvotes: 0

Views: 1473

Answers (3)

Uwe

Reputation: 42544

If I understand correctly, data consists of

  • a list mecanicas of data frames with identical structure (number, name, and type of columns) and
  • a character vector title with the same number of elements as there are data frames in mecanicas

An alternative approach is to use rbindlist() to "flatten" the data structure, i.e., combine the pieces into one large data frame.

library(data.table)
# combine pieces to large data frame, add id col
flat <- rbindlist(data$mecanicas, idcol = "title")
# replace number in id col by title from character vector
flat[, title := data$title[title]][]
# extract desired columns
flat[, .(name, title)]
                             name                                         title
 1: Campaign / Battle Card Driven                                    Gloomhaven
 2:              Cooperative Play                                    Gloomhaven
 3:                 Grid Movement                                    Gloomhaven
 4:               Hand Management                                    Gloomhaven
 5:                 Modular Board                                    Gloomhaven
 6:                  Role Playing                                    Gloomhaven
 7: Action Point Allowance System                     Pandemic Legacy: Season 1
 8:              Cooperative Play                     Pandemic Legacy: Season 1
 9:               Hand Management                     Pandemic Legacy: Season 1
10:       Point to Point Movement                     Pandemic Legacy: Season 1
11:                Set Collection                     Pandemic Legacy: Season 1
12:                       Trading                     Pandemic Legacy: Season 1
13: Action Point Allowance System Through the Ages: A New Story of Civilization
14:               Auction/Bidding Through the Ages: A New Story of Civilization
15:                 Card Drafting Through the Ages: A New Story of Civilization
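For comparison, the same flattening can be sketched in base R. This is only an illustration on made-up toy data (the objects `mecanicas`, `title`, `keep`, and `flat` below are hypothetical stand-ins, not the OP's data), and it assumes that every non-empty element is a data frame with identical columns:

```r
# Toy stand-in for data$mecanicas and data$title (hypothetical values)
mecanicas <- list(
  data.frame(name = c("a", "b"), objectid = c("1", "2"), stringsAsFactors = FALSE),
  list(),  # empty entry, like KLASK in the question
  data.frame(name = "c", objectid = "3", stringsAsFactors = FALSE)
)
title <- c("Game A", "Game B", "Game C")

# keep only the elements that really are data frames
keep <- vapply(mecanicas, is.data.frame, logical(1))

# number of rows per kept data frame, used to repeat the titles
n <- vapply(mecanicas[keep], nrow, integer(1))

# bind the pieces and attach the repeated titles
flat <- do.call(rbind, mecanicas[keep])
flat$title <- rep(title[keep], n)

flat[, c("name", "title")]
```

Titles whose entry is empty (here "Game B") simply do not appear in the result.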

As the OP has reported some errors with the production dataset, here are a few checks to verify that the above assumptions hold:

library(magrittr)
# check that number of columns of data frames is consistent
# (braces keep %>% from also passing the lengths as the first argument of all())
stopifnot(lengths(data$mecanicas) %>% {all(.[1] == .)})
# or, without piping:
(tmp <- lengths(data$mecanicas))
stopifnot(all(tmp[1] == tmp))
# check that number of data frames and titles is consistent
stopifnot(length(data$mecanicas) == length(data$title))
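If one of these checks fails, it may help to find out which elements are responsible. The following is a rough sketch on made-up toy data (the names `mecanicas`, `title`, `not_df`, and `same_cols` are hypothetical), using base R only:

```r
# Toy stand-in for data$mecanicas and data$title (hypothetical values)
mecanicas <- list(
  data.frame(name = c("a", "b"), objectid = c("1", "2"), stringsAsFactors = FALSE),
  list(),  # an empty list instead of a data frame
  data.frame(name = "c", objectid = "3", stringsAsFactors = FALSE)
)
title <- c("Game A", "Game B", "Game C")

# which elements are not data frames at all?
not_df <- !vapply(mecanicas, is.data.frame, logical(1))
title[not_df]  # titles of the offending entries

# among the real data frames, do all share the same column names?
cols <- lapply(mecanicas[!not_df], names)
same_cols <- all(vapply(cols, identical, logical(1), cols[[1]]))
same_cols
```

Applied to the real data, `title[not_df]` would list the games whose mecanicas entry is not a data frame.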

Upvotes: 1

IceCreamToucan

Reputation: 28675

With data.table you can lapply over mecanicas by title, and it will repeat the title for you.

library(data.table)
setDT(data)

data[, lapply(mecanicas, `[[`, 'name'), by = title]

#                                             title                            V1
#  1:                                    Gloomhaven Campaign / Battle Card Driven
#  2:                                    Gloomhaven              Cooperative Play
#  3:                                    Gloomhaven                 Grid Movement
#  4:                                    Gloomhaven               Hand Management
#  5:                                    Gloomhaven                 Modular Board
#  6:                                    Gloomhaven                  Role Playing
#  7:                     Pandemic Legacy: Season 1 Action Point Allowance System
#  8:                     Pandemic Legacy: Season 1              Cooperative Play
#  9:                     Pandemic Legacy: Season 1               Hand Management
# 10:                     Pandemic Legacy: Season 1       Point to Point Movement
# 11:                     Pandemic Legacy: Season 1                Set Collection
# 12:                     Pandemic Legacy: Season 1                       Trading
# 13: Through the Ages: A New Story of Civilization Action Point Allowance System
# 14: Through the Ages: A New Story of Civilization               Auction/Bidding
# 15: Through the Ages: A New Story of Civilization                 Card Drafting

Upvotes: 2

dave-edison

Reputation: 3726

You want tidyr::unnest:

library(tidyverse)

data %>% 
  unnest(mecanicas) %>% 
  select(name, title)
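Regarding the empty-list edge case from the question's edits: one possible workaround, sketched here on made-up toy data and assuming tidyr 1.0.0 or later (where unnest() drops NULL entries by default), is to replace anything that is not a data frame with NULL before unnesting. The `toy` and `result` objects are hypothetical and only serve as illustration:

```r
library(dplyr)
library(purrr)
library(tidyr)

# Toy stand-in for the OP's data (hypothetical values)
toy <- tibble(
  mecanicas = list(
    data.frame(name = c("a", "b"), stringsAsFactors = FALSE),
    list()  # empty entry, like KLASK in the question
  ),
  title = c("Game A", "Game B")
)

result <- toy %>%
  # turn non-data-frame entries into NULL so unnest() can skip them
  mutate(mecanicas = map(mecanicas, ~ if (is.data.frame(.x)) .x else NULL)) %>%
  unnest(mecanicas) %>%
  select(name, title)

result
```

Rows with an empty mecanicas entry are dropped this way; passing keep_empty = TRUE to unnest() should keep them with NA in name instead.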

Upvotes: 2
