Alex Talbott

Reputation: 35

lapply on list of dataframes not working the same as FUN applied to dfs individually

example data

metro_2005_1 <- data.frame(col1 = 1:5, col2 = 6:10)
metro_2006_1 <- data.frame(col1 = 1:3, col2 = 4:6)

I have 20 dataframes, each named in the following format where x is a number 1-9:

metro_20XX_X

I am trying to extract the middle section of the name (the year) into a new column. I wrote a function, addYear, that works when applied to each dataframe individually.

addYear <- function(metro){
   metro_name <- deparse(substitute(metro))
   metro <- metro %>% mutate(Year = substr(metro_name,7,10))
   return(metro)
   }

example <- addYear(metro_2005_1)

str(example)

'data.frame':   5 obs. of  3 variables:
  $ col1: int  1 2 3 4 5
  $ col2: int  6 7 8 9 10
  $ Year: chr  "2005" "2005" "2005" "2005" 

I added all 20 of my dataframes to a list called metro_append_year and tried to apply my addYear function to all of them using lapply. However, when I inspect "result", the Year column is created in each dataframe but contains only empty strings.

metro_append_year <- list(metro_2005_1, metro_2006_1)

result <- lapply(metro_append_year,addYear)

str(result[[1]])
'data.frame':   5 obs. of  3 variables:
 $ col1: int  1 2 3 4 5
 $ col2: int  6 7 8 9 10
 $ Year: chr  "" "" "" ""

Upvotes: 1

Views: 187

Answers (2)

Parfait

Reputation: 107767

Since you are an R newbie, consider a base R solution: mget extracts a list of objects by name, and Map (a wrapper around mapply) iterates elementwise over the list names and the corresponding values. Possibly the way the object name is passed into your dplyr call is the issue.

The within and transform functions mirror dplyr::mutate: you assign the column(s) in place and the modified object is returned:
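To see why the name might get lost, here is a minimal sketch of what deparse(substitute()) returns when a function is called directly versus through lapply. The helper show_name is a hypothetical name used only for illustration, and the behaviour shown assumes a reasonably recent version of R.

```r
# Hypothetical helper: returns the expression its argument was called with
show_name <- function(x) deparse(substitute(x))

metro_2005_1 <- data.frame(col1 = 1:5, col2 = 6:10)

# Called directly, the function sees the object name
direct <- show_name(metro_2005_1)

# Called through lapply, the function sees lapply's internal expression
# (X[[i]] in recent R versions), not the original object name
via_lapply <- lapply(list(metro_2005_1), show_name)[[1]]

print(direct)
print(via_lapply)

# Characters 7 to 10 of that short internal expression do not exist,
# so substr() returns an empty string
print(substr(via_lapply, 7, 10))
```

This is consistent with the empty Year column in the question: the year has to come from a name that is passed in explicitly.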

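# ALL METRO DATA FRAMES (anchored pattern so other "metro" objects are not picked up)
metro_dfs <- mget(ls(pattern = "^metro_\\d{4}"))

# Add a Year column taken from characters 7-10 of each data frame's name
metro_dfs <- Map(function(name, df) within(df, Year <- substr(name, 7, 10)),
                 names(metro_dfs), metro_dfs)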

Alternatively:

metro_dfs <- mapply(function(name, df) transform(df, Year = substr(name, 7, 10)),
                    names(metro_dfs), metro_dfs, SIMPLIFY = FALSE)

Upvotes: 0

akrun

Reputation: 887891

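We could pass the data and the name of the list element as two separate arguments, which makes this easier. Note that the list must be named (see the data section):

library(dplyr)

# Take the data frame and its name as separate arguments
addYear <- function(data, name) {
  data %>%
    mutate(Year = substr(name, 7, 10))
}

lapply(names(metro_append_year), function(nm) addYear(metro_append_year[[nm]], nm))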

data

metro_2005_1 <- data.frame(col1 = 1:5, col2 = 6:10)
metro_2006_1 <- data.frame(col1 = 1:3, col2 = 4:6)
metro_append_year <- mget(ls(pattern = '^metro_\\d{4}'))
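As a rough end-to-end check, the same two-argument idea can be sketched in base R so it runs without dplyr. The names add_year and metro_list are hypothetical, and the list is built by hand here rather than with mget. Map is used instead of lapply over the names so that the result keeps the list names.

```r
metro_2005_1 <- data.frame(col1 = 1:5, col2 = 6:10)
metro_2006_1 <- data.frame(col1 = 1:3, col2 = 4:6)

# A named list: the names carry the information lapply alone would lose
metro_list <- list(metro_2005_1 = metro_2005_1,
                   metro_2006_1 = metro_2006_1)

# Hypothetical base R variant of addYear: takes the data and its name
add_year <- function(data, name) {
  data$Year <- substr(name, 7, 10)  # characters 7-10 hold the year
  data
}

# Map passes each data frame together with its name and keeps the names
result <- Map(add_year, metro_list, names(metro_list))

str(result$metro_2005_1)
```

Keeping the names on the result makes it easy to check each data frame afterwards, for example with result$metro_2006_1$Year.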

Upvotes: 0
