Justin Andrew
Justin Andrew

Reputation: 97

How do I add new columns to a dataframe repeatedly in R with new names?

I have a dataframe called xlong_gwTCE_3

It is formatted like this:

  V1 study_id year mean_gwTCE index_year
  1  0000001 2010          .3       2016 
  2  0000001 2002          .6       2016
  3  0000002 2011          .2       2014
  4  0000002 2005          .4       2014
  5  0000002 2013          .8       2014
  6  0000003 2004          .1       2017

I want to perform an operation on it repeatedly and store the result in a new column.

Here is what I have tried which should make it more clear:

time_intervals <- seq(5,25,by=5)

for (years_back in time_intervals) {

  gwTCE_cumsum <- xlong_gwTCE_3%>% 
    mutate(index_yr_minus_years_back = index_year-years_back) %>% 
    filter(year >= index_yr_minus_years_back & year <= index_year) %>% 
    group_by(study_id) %>% 
    summarize(mean_gwTCE = sum(mean_gwTCE, na.rm=T))
}

The issue is I need the additional columns to have names based on the number of years_back.

For example, the first column created could be named gwTCE_cumsum_5 and the second column created could be named gwTCE_cumsum_10.

Right now it is not creating additional columns each time or renaming them.

I have been trying to use the paste function to try to rename the columns in ways like this

gwTCE_cumsum <- paste("gwTCE_cumsum", years_back, sep="_")

but it has not been working.

The output I am looking for would look something like this:

  V1 study_id year mean_gwTCE index_year gwTCE_cumsum_5 gwTCE_cumsum_10 gwTCE_cumsum_15 
  1  0000001 2010          .3       2016  0              .3               .9
  2  0000002 2011          .2       2014  .2             .6               .6
  3  0000003 2004          .1       2017  .1             .1               .1

Thank you for your help.

Upvotes: 1

Views: 38

Answers (1)

akrun
akrun

Reputation: 886938

We can use := and paste to create new columns

 for (years_back in time_intervals) {

   out <- xlong_gwTCE_3%>% 
     mutate(index_yr_minus_years_back = index_year-years_back) %>% 
      filter(year >= index_yr_minus_years_back & year <= index_year) %>% 
       group_by(study_id) %>% 
       summarize(mean_gwTCE = sum(mean_gwTCE, na.rm=T)) %>%
        mutate(!! str_c("gwTCE_cumsum", years_back, sep="_") := cumsum(mean_gwTCE))
    }

Or using map

library(purrr)
library(dplyr)
library(stringr)
map(time_intervals, ~ xlong_gwTCE_3 %>%
           filter(year >= index_year - .x, year <= index_year) %>%
           group_by(study_id) %>%
           summarise(index_year = index_year[1], 
                       mean_gwTCE = sum(mean_gwTCE, na.rm = TRUE)) %>%
           mutate(!!str_c("gwTCE_cumsum", .x, sep = "_") := cumsum(mean_gwTCE)))

Upvotes: 1

Related Questions