DJC

Reputation: 1611

Using lapply over a list and adding a column with data frame name

I have a list containing two data frames:

sample_list <- list("tables" = data.frame(weight = sample(1:50, 20, replace = T)),
                    "chairs" = data.frame(height = sample(1:50, 20, replace = T)))

I would like to use lapply to run a function over all the data frames in this list. In the output of each function, I need to create another column with the name of the source data frame (see mutate):

lapply(sample_list, function(x) {
  x %>% 
    filter(x >= 20) %>% 
    mutate(groupName = names(x))
})

For some reason, I can't figure out how to make this work. How do I pass the name of the data frame into mutate? Right now it is returning the name of the first column in that data frame, rather than the name of the data frame itself.

Thanks!

Upvotes: 6

Views: 2244

Answers (4)

GKi

Reputation: 39647

You can use Map with the function data.frame to add the names.

Map(`data.frame`, sample_list, groupName = names(sample_list))
#Map(`[<-`, sample_list, "groupName", value = names(sample_list)) #Alternative

#$tables
#   weight groupName
#1      22    tables
#2      12    tables
#3       9    tables
#4      26    tables
#5      39    tables
#6       6    tables
#7      31    tables
#8       9    tables
#9      39    tables
#10      4    tables
#11     37    tables
#12     30    tables
#13     20    tables
#14     35    tables
#15     31    tables
#16     46    tables
#17     44    tables
#18     30    tables
#19     12    tables
#20     46    tables
#
#$chairs
#   height groupName
#1      12    chairs
#2      17    chairs
#3      35    chairs
#4      40    chairs
#5      23    chairs
#6      21    chairs
#7      48    chairs
#8      24    chairs
#9      20    chairs
#10     41    chairs
#11     43    chairs
#12     45    chairs
#13     47    chairs
#14     13    chairs
#15     35    chairs
#16     32    chairs
#17     26    chairs
#18     34    chairs
#19     33    chairs
#20      8    chairs

If each data frame should also be subset to the rows where the value is >= 20:

lapply(sample_list, function(x) x[x[,1] >= 20,, drop = FALSE])

To do both in one step, I would use the approach already posted by @akrun.
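
As a rough sketch of how the two base R steps above could be chained, assuming the first column is always the one to filter on (the names `with_names` and `filtered` and the `set.seed` call are only for illustration):

```r
set.seed(42)  # illustrative only, makes the random sample reproducible
sample_list <- list(
  tables = data.frame(weight = sample(1:50, 20, replace = TRUE)),
  chairs = data.frame(height = sample(1:50, 20, replace = TRUE))
)

# Step 1: add the list name to each data frame as a groupName column
with_names <- Map(data.frame, sample_list, groupName = names(sample_list))

# Step 2: keep only the rows whose first column is >= 20
filtered <- lapply(with_names, function(x) x[x[[1]] >= 20, , drop = FALSE])
```

Because Map takes its names from the first list argument, `filtered` should still be named `tables` and `chairs`.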

Upvotes: 1

A. Suliman

Reputation: 13125

We can loop through the names of sample_list instead of looping through the list itself:

lapply(names(sample_list), function(x) {
    sample_list[[x]] %>% 
        filter_at(vars(1),~. >= 20) %>% 
        mutate(groupName = x)
})

Update Sep-2021

A cleaner way, using purrr::map:

purrr::map(names(sample_list), ~sample_list[[.x]] %>% 
             filter_at(vars(1),~. >= 20) %>% 
             mutate(groupName = .x)
)
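
Note that both calls above iterate over a plain character vector, so the result comes back as an unnamed list. If the list names matter, one possible workaround is to iterate over a self-named vector instead. This is a minimal sketch in base R only, with the dplyr filter swapped for plain subsetting; `nm`, `df` and `result` are illustrative names, and the `set.seed` call is only there for reproducibility:

```r
set.seed(42)  # illustrative only
sample_list <- list(
  tables = data.frame(weight = sample(1:50, 20, replace = TRUE)),
  chairs = data.frame(height = sample(1:50, 20, replace = TRUE))
)

# setNames(nm = x) names the vector x by its own values,
# so lapply carries those names over to the result
result <- lapply(setNames(nm = names(sample_list)), function(nm) {
  df <- sample_list[[nm]]
  df <- df[df[[1]] >= 20, , drop = FALSE]   # keep rows with first column >= 20
  df$groupName <- rep(nm, nrow(df))         # rep() also copes with zero rows
  df
})

names(result)
```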

Upvotes: 7

akrun

Reputation: 886948

Using Map from base R, which passes each data frame and its name to the function in parallel:

Map(function(dat, grp) cbind(dat, group_name = grp)[dat[[1]] >= 20, ],
    sample_list, names(sample_list))

Upvotes: 1

Yifu Yan

Reputation: 6106

You can try purrr::imap() to map over both the elements and their names.

# purrr::imap
purrr::imap(sample_list, function(element,name){
    head(mutate(element,groupName = name))
})

# or mapply, but here you need to pass the names of the list explicitly
myfun <- function(element,name){
    head(mutate(element,groupName = name))
}

mapply(myfun,sample_list,names(sample_list),SIMPLIFY = FALSE)

$tables
  weight groupName
1     42    tables
2     24    tables
3     13    tables
4     31    tables
5      9    tables
6     27    tables

$chairs
  height groupName
1     18    chairs
2      6    chairs
3     34    chairs
4     37    chairs
5     36    chairs
6     49    chairs

Upvotes: 2
