Krutik

Reputation: 501

How to add new columns to nested lists using lapply via a string recognition function

I am attempting to use %in% to add particular columns to data frames nested within a list of lists. Below is a toy example of my data.

dput(head(list)):

 list(FEB_games = list(GAME1 = structure(list(GAME1_Class = c("paladin", 
"fighter", "wizard", "sorcerer", "rouge"), GAME1_Race = c("human", 
"elf", "orc", "human", "gnome"), GAME1_Alignment = c("NE", "CG", 
"CE", "NN", "LG"), GAME1_Level = c(6, 7, 6, 7, 7), GAME1_Alive = c("y", 
"y", "y", "y", "y")), row.names = c("m.Stan", "m.Kenny", "m.Cartman", 
"m.Kyle", "m.Butters"), class = "data.frame"), GAME2 = structure(list(
GAME2_Class = c("wizard", "cleric", "monk", "bard"), GAME2_Race = c("half-elf", 
"elf", "human", "dwarf"), GAME2_Alignment = c("CG", "CE", 
"NN", "LG"), GAME2_Level = c(5, 5, 5, 5), GAME2_Alive = c("y", 
"y", "y", "y")), row.names = c("m.Kenny", "m.Cartman", "m.Kyle", 
"m.Butters"), class = "data.frame")), MAR_games = list(GAME3 = structure(list(
GAME3_Class = c("cleric", "barbarian", "warlock", "monk"), 
GAME3_Race = c("elf", "half-elf", "elf", "dwarf"), GAME3_Alignment = c("LG", 
"LG", "CE", "LG"), GAME3_Level = c(1, 1, 1, 1), GAME3_Alive = c("y", 
"y", "y", "y")), row.names = c("l.Stan", "l.Kenny", "l.Cartman", 
"l.Butters"), class = "data.frame"), GAME4 = structure(list(GAME4_Class = c("fighter", 
"wizard", "sorcerer", "rouge"), GAME4_Race = c("half-elf", "elf", 
"human", "dwarf"), GAME4_Alignment = c("CG", "CE", "LN", "LG"
), GAME4_Level = c(5, 5, 5, 5), GAME4_Alive = c("y", "y", "y", 
"y")), row.names = c("l.Kenny", "l.Cartman", "l.Kyle", "l.Butters"), class = "data.frame")))

I have two different sets of columns (data frames) to add: Feb_detentions goes with FEB_games and Mar_detentions goes with MAR_games.

dput(head(Feb_detentions)):

structure(list(Pupil = c("m.Stan", "m.Stan", "m.Kenny", "m.Cartman", 
"m.Kyle", "m.Butters"), Detention = c("y", "y", "y", "n", "n", "y"
)), row.names = c(NA, 6L), class = "data.frame")

dput(head(Mar_detentions)):

structure(list(Pupil = c("l.Stan", "l.Kenny", "l.Cartman", "l.Kyle"), 
Detention = c("n", "y", "y", "n")), row.names = c(NA, 4L), class = "data.frame")

I have been successful in using these steps to add the columns of interest to a single data frame (not nested in a list). Duplicates had to be removed outside the function; I was not able to do this inside it.

Feb_detentions[! duplicated(Feb_detentions$Pupil),] -> Feb_detentions_dup

addDetentions <- function(df, df_namecol, detentions, detention_namecol){
# keep only the rows whose names appear in both data frames
df[df_namecol %in% detention_namecol,] -> df_v1
detentions[detention_namecol %in% df_namecol,] -> det_v1
# cbind pairs rows by position, so both must be in the same order
cbind(df_v1, det_v1) -> df_edit
return(df_edit)
}

addDetentions(df = GAME1, df_namecol = rownames(GAME1),
          detentions = Feb_detentions_dup, detention_namecol = Feb_detentions_dup$Pupil) -> output

dput(head(output)):

structure(list(GAME1_Class = c("paladin", "fighter", "wizard", 
"sorcerer", "rouge"), GAME1_Race = c("human", "elf", "orc", "human", 
"gnome"), GAME1_Alignment = c("NE", "CG", "CE", "NN", "LG"), 
GAME1_Level = c(6, 7, 6, 7, 7), GAME1_Alive = c("y", "y", 
"y", "y", "y"), Pupil = c("m.Stan", "m.Kenny", "m.Cartman", "m.Kyle", 
"m.Butters"), Detention = c("y", "y", "n", "n", "y")), row.names =  c("m.Stan", "m.Kenny", "m.Cartman", "m.Kyle", "m.Butters"), class = "data.frame")

I would like to apply this function (or something else that works) to the entire list. But since there are two different sets of columns to add to two different nested lists within a single list, I am a bit stuck.

lapply(Chars_alive, function(x) {addDetentions(x, rownames(x), Feb_detentions, Feb_detentions$Pupil)})

Any help would be appreciated here.


Upvotes: 2

Views: 78

Answers (1)

akrun

Reputation: 887951

One option is to merge each nested data.frame with the corresponding detentions data.frame. The detentions datasets are placed in a list in the same order as the month elements of the first list, and Map loops through the corresponding elements of both lists.

Map(function(x, y) 
   # x is the first list which is a nested one
   # so loop through it
   lapply(x, function(dat) {
      # create a Pupil column from the row names
      dat$Pupil <- row.names(dat)
      # merge with the corresponding 'detentions' dataset
      merge(dat, y)
      }),
      # first list, created list
      lst1, list(Feb_detentions, Mar_detentions)) 
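
As a self-contained sketch of the Map/merge approach, here is a run on a cut-down copy of the toy data. The helper name add_detentions and the abbreviated values are assumptions for illustration. The duplicated() step is added because Feb_detentions lists m.Stan twice, which would otherwise duplicate that pupil's row in the merge.

```r
# Toy data: a cut-down copy of the question's structure (values abbreviated)
lst1 <- list(
  FEB_games = list(
    GAME1 = data.frame(GAME1_Level = c(6, 7, 6),
                       row.names = c("m.Stan", "m.Kenny", "m.Cartman")),
    GAME2 = data.frame(GAME2_Level = c(5, 5),
                       row.names = c("m.Kenny", "m.Cartman"))
  ),
  MAR_games = list(
    GAME3 = data.frame(GAME3_Level = c(1, 1),
                       row.names = c("l.Stan", "l.Kenny"))
  )
)

Feb_detentions <- data.frame(Pupil = c("m.Stan", "m.Stan", "m.Kenny", "m.Cartman"),
                             Detention = c("y", "y", "y", "n"),
                             stringsAsFactors = FALSE)
Mar_detentions <- data.frame(Pupil = c("l.Stan", "l.Kenny"),
                             Detention = c("n", "y"),
                             stringsAsFactors = FALSE)

# Hypothetical helper: merge one month's games with that month's detentions
add_detentions <- function(games, detentions) {
  # keep the first record per pupil so merge() does not duplicate rows
  detentions <- detentions[!duplicated(detentions$Pupil), ]
  lapply(games, function(dat) {
    dat$Pupil <- row.names(dat)
    merge(dat, detentions, by = "Pupil")
  })
}

# Pair each month of games with its detentions by position
out <- Map(add_detentions, lst1, list(Feb_detentions, Mar_detentions))
```

The result keeps the month and game names of lst1. Note that merge returns the rows sorted by Pupil and replaces the original row names with a Pupil column.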

With tidyverse, this can be done using map2

library(tidyverse)
map2(lst1, list(Feb_detentions, Mar_detentions),
       ~ {
         ydat <- .y
         map(.x, ~ .x %>%
                    rownames_to_column("Pupil") %>% 
                    inner_join(ydat))
         })

Update

If we only need to update the second nested data.frame in each element of 'lst1', extract that element, do the merge, assign the result back, and return the updated list

Map(function(x, y) {
      x[[2]]$Pupil <- row.names(x[[2]])
      # assign the merged result back, otherwise it is discarded
      x[[2]] <- merge(x[[2]], y)
      x
      }, lst1, list(Feb_detentions, Mar_detentions))
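
If the original row names and every pupil need to be kept, a lookup with match may be a possible alternative to merge. This is a different technique from the one above, shown only as a sketch; the helper name lookup_detention and the toy values are assumptions for illustration.

```r
# Toy stand-ins for one game and one month's detentions
GAME1 <- data.frame(GAME1_Level = c(6, 7, 7),
                    row.names = c("m.Stan", "m.Kenny", "m.Butters"))
Feb_detentions <- data.frame(Pupil = c("m.Stan", "m.Stan", "m.Kenny"),
                             Detention = c("y", "y", "n"),
                             stringsAsFactors = FALSE)

# Hypothetical helper: look up each row name in detentions$Pupil
lookup_detention <- function(dat, detentions) {
  # match() returns the position of the first hit, so later duplicates
  # are ignored; pupils with no record get NA
  idx <- match(row.names(dat), detentions$Pupil)
  dat$Detention <- detentions$Detention[idx]
  dat
}

res <- lookup_detention(GAME1, Feb_detentions)
```

Here the row order and row names of GAME1 are unchanged, and m.Butters gets NA because there is no matching record. The same helper could be passed through Map and lapply in place of merge.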

Upvotes: 1
