Clara

Reputation: 33

Create multiple columns in R using a formula

I'm fairly new to R and am looking for a simpler way to create multiple columns from a formula.

I have a dataset with a base date followed by scores that were taken weekly (score1 = score from 1 week after the base date). I would like to generate a date for each week, i.e. add X*7 days to the base date for score X. I can do this by creating each date variable one at a time (see below), but since I have over 500 scores, I was wondering whether there is a way to do it that does not take hundreds of lines of code.

Dataset$score1_date <- Dataset$base_date + (1*7)
Dataset$score2_date <- Dataset$base_date + (2*7)
Dataset$score3_date <- Dataset$base_date + (3*7)

Here is an example dataset:

Dataset <- structure(list(id = c(1, 2, 3), base_date = structure(c(18628, 18633, 18641), class = "Date"), score1 = c(4, 5, 5), score2 = c(6, 5, 2), score3 = c(5, 5, 1)), row.names = c(NA, -3L), class = c("tbl_df", "tbl", "data.frame"))

Thank you!

Upvotes: 3

Views: 987

Answers (2)

s20012303

Reputation: 89

You can use a for loop and refer to each new column of the data.frame by name with double brackets (i.e. [[.]]). For example:

# i is the week number; 500 is the number of score columns in the full data
for (i in 1:500) {
  Dataset[[paste0("score", i, "_date")]] <- Dataset$base_date + (i*7)
}
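
If you would rather not hard-code the number of scores, the week numbers can be read from the existing column names instead. The following is only a sketch in base R: it assumes the score columns are named exactly score followed by digits, and it rebuilds the question's example data as a plain data.frame.

```r
# Example data from the question, rebuilt as a plain data.frame
Dataset <- data.frame(
  id = c(1, 2, 3),
  base_date = as.Date(c("2021-01-01", "2021-01-06", "2021-01-14")),
  score1 = c(4, 5, 5),
  score2 = c(6, 5, 2),
  score3 = c(5, 5, 1)
)

# Columns named exactly "score<digits>" (assumed naming scheme);
# the anchored pattern skips any *_date columns already present
score_cols <- grep("^score[0-9]+$", names(Dataset), value = TRUE)

# Week number taken from each column name, e.g. "score2" -> 2
weeks <- as.integer(sub("^score", "", score_cols))

# Add one date column per score column: base_date + week * 7 days
for (k in seq_along(score_cols)) {
  Dataset[[paste0(score_cols[k], "_date")]] <- Dataset$base_date + weeks[k] * 7
}
```

With this, the loop covers however many score columns the data actually has, whether that is 3 or 500.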

Upvotes: 1

akrun

Reputation: 887028

We can use lapply to loop over the multiplier index (i.e. 1:3 in the OP's post), multiply it by 7 and add the result to base_date, then assign the resulting list of vectors to new columns whose names are built by pasting 'score', the index, and '_date' together:

Dataset[paste0('score', 1:3, '_date')] <- lapply(1:3, 
          function(i) Dataset$base_date + i*7)   

Or, using dplyr, loop across the 'score' columns, extract the numeric part of each column name (cur_column()) with parse_number, multiply it by 7 and add it to 'base_date'. The .names argument appends '_date' to each original name, so the results are stored as new columns:

library(dplyr)
Dataset <- Dataset %>% 
   mutate(across(starts_with('score'), ~ base_date + 
     (readr::parse_number(cur_column())) * 7, .names = '{.col}_date'))

Output:

Dataset
# A tibble: 3 x 8
#     id base_date  score1 score2 score3 score1_date score2_date score3_date
#  <dbl> <date>      <dbl>  <dbl>  <dbl> <date>      <date>      <date>     
#1     1 2021-01-01      4      6      5 2021-01-08  2021-01-15  2021-01-22 
#2     2 2021-01-06      5      5      5 2021-01-13  2021-01-20  2021-01-27 
#3     3 2021-01-14      5      2      1 2021-01-21  2021-01-28  2021-02-04 
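
One caveat with the dplyr version: starts_with('score') also matches the new score1_date, score2_date, ... columns, so running the same code a second time would add score1_date_date and so on. A possible variation, sketched below, selects only columns named exactly score followed by digits. The helper name add_dates is made up for this illustration, and the week number is taken with base R's sub() rather than readr::parse_number.

```r
library(dplyr)

# Example data from the question, rebuilt as a plain data.frame
Dataset <- data.frame(
  id = c(1, 2, 3),
  base_date = as.Date(c("2021-01-01", "2021-01-06", "2021-01-14")),
  score1 = c(4, 5, 5),
  score2 = c(6, 5, 2),
  score3 = c(5, 5, 1)
)

# add_dates is a hypothetical helper name, not part of dplyr
add_dates <- function(df) {
  df %>%
    mutate(across(
      # only "score<digits>", so existing *_date columns are ignored
      matches("^score[0-9]+$"),
      # week number comes from the current column name
      ~ base_date + as.integer(sub("^score", "", cur_column())) * 7,
      .names = "{.col}_date"
    ))
}

once  <- add_dates(Dataset)
twice <- add_dates(once)  # rerun overwrites the same three date columns
```

Here once and twice have the same eight columns, because the second run recomputes the existing date columns instead of creating new ones.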

Upvotes: 2
