UseR10085

Reputation: 8200

Creating all possible variable combinations in R

I have a daily dataset of 4 parameters, which I have converted into monthly data using the following code:

library(zoo)
library(hydroTSM)
library(lubridate)
library(tidyverse)

set.seed(123)
# Build the daily date sequence once and reuse its length for every column
dates <- seq(from = as.Date("1983-1-1"), to = as.Date("2018-12-31"), by = "day")
df <- data.frame("date" = dates,
                 "Parameter1" = runif(length(dates), 15, 35),
                 "Parameter2" = runif(length(dates), 11, 29),
                 "Parameter3" = runif(length(dates), 50, 90),
                 "Parameter4" = runif(length(dates), 0, 27))

Monthly_data <- daily2monthly(df, FUN=mean, na.rm=TRUE)

After that, I reshaped it so that each month becomes a column, using the following code:

#Function to convert month abbreviation to a numeric month
mo2Num <- function(x) match(tolower(x), tolower(month.abb))

Monthly_data %>% 
  dplyr::as_tibble(rownames = "date") %>% 
  separate("date", c("Month", "Year"), sep = "-", convert = TRUE) %>% 
  mutate(Month = mo2Num(Month)) %>% 
  tidyr::pivot_longer(cols = -c(Month, Year)) %>% 
  pivot_wider(names_from = Month, values_from = value, names_prefix = "Mon",
              names_sep = "_") %>% 
  arrange(name)

Now I want to create all pairwise parameter combinations: Parameter1 * Parameter2, Parameter1 * Parameter3, Parameter1 * Parameter4, Parameter2 * Parameter3, Parameter2 * Parameter4 and Parameter3 * Parameter4. For each pair, for example Parameter1 * Parameter2, the monthly values of the two parameters should be multiplied together, and the resulting rows should be appended to the pivoted monthly data above with rbind. How can I achieve this?

Upvotes: 0

Views: 210

Answers (1)

Ronak Shah

Reputation: 389265

You can use a base R approach with combn. It assumes that every parameter has data for every year, so that the rows of each pair line up. Here df1 is the data frame produced by the pipeline above that ends with arrange(name).

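data <- combn(unique(df1$name), 2, function(x) {
               # Rows belonging to each parameter of the pair
               t1 <- subset(df1, name == x[1])
               t2 <- subset(df1, name == x[2])
               # Drop the first two columns (Year, name) and multiply the month columns
               t3 <- t1[-(1:2)] * t2[-(1:2)]
               t3$name <- paste0(x, collapse = "_")
               # Reattach the Year column (column 1 of t1)
               cbind(t3, t1[1])
               }, simplify = FALSE)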
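
To check which pairs combn will visit before running the multiplication, a minimal sketch like the following can help. The vector params is a stand-in for unique(df1$name), and pair_names is a hypothetical name used only for illustration.

```r
# Stand-in for unique(df1$name); the four names come from the question.
params <- c("Parameter1", "Parameter2", "Parameter3", "Parameter4")

# combn() applies the function to every 2-element combination;
# here each pair is simply collapsed into a label such as "Parameter1_Parameter2".
pair_names <- combn(params, 2, paste0, collapse = "_")

print(pair_names)
length(pair_names)  # choose(4, 2) pairs
```

These labels are the values that end up in the name column of the product rows.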

You can then rbind the result to the original data.

new_data <- rbind(df1, do.call(rbind, data))
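
If some parameter is missing a year, the element-wise multiplication above no longer lines up. One possible variation, sketched here on toy data, aligns each pair on Year with merge before multiplying. The toy df1, the helper pair_product and the names month_cols and pairs are assumptions made for illustration and are not part of the original answer.

```r
# Toy stand-in for df1: Year, name, then one column per month (two months only).
# Parameter "C" deliberately has no row for 2002.
df1 <- data.frame(
  Year = c(2001, 2002, 2001, 2002, 2001),
  name = c("A", "A", "B", "B", "C"),
  Mon1 = c(1, 2, 3, 4, 5),
  Mon2 = c(10, 20, 30, 40, 50)
)

month_cols <- setdiff(names(df1), c("Year", "name"))

# Hypothetical helper: multiply the month columns of two parameters,
# keeping only the years present for both.
pair_product <- function(pair) {
  t1 <- df1[df1$name == pair[1], c("Year", month_cols)]
  t2 <- df1[df1$name == pair[2], c("Year", month_cols)]
  # Inner join on Year; shared month columns get the .x / .y suffixes
  both <- merge(t1, t2, by = "Year", suffixes = c(".x", ".y"))
  prod <- both[paste0(month_cols, ".x")] * both[paste0(month_cols, ".y")]
  names(prod) <- month_cols
  data.frame(Year = both$Year,
             name = rep(paste(pair, collapse = "_"), nrow(both)),
             prod)
}

pairs <- combn(unique(as.character(df1$name)), 2, pair_product, simplify = FALSE)
new_data <- rbind(df1, do.call(rbind, pairs))
print(new_data)
```

Years that exist for only one parameter of a pair are dropped from that pair's product rather than raising an error.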

Upvotes: 1
