Reputation: 35
Help I would like to request: I am trying to automate a sequence of tasks in a loop. Basically it is running a series of regressions, storing the results and exporting (writing) it into a csv file.
Issue: I have tried many times to run a “For” loop. I couldn’t figure out how to create and name new variables for each regression based on the current_date column
Description of data: Link to dataset It is a dataset of US states in a tabular format (somewhat similar to Tidy Data). Each Row or observation refers to a specific state at a specific date. Each column refers to an attribute of the state Task: Main problem: Automate the process of running a series of regressions, storing the results by date and exporting (writing) it into a csv file.
Sequence of tasks:
….. Finally, export all the results in a CSV file
library(dplyr)
library(broom)
for(i in 1:length(current_date)){
lmtest <- lm( outcome_var [[i]] ~ cumulative_var1_at_current_date + key_cat_var1, data = testdata)
coeff_results<-tidy(lmtest)
model_results<- glance(lmtest)
write.csv(coeff_results, “C:\\Users\\....combined_results.csv”)
}
Upvotes: 0
Views: 531
Reputation: 365
Using a loop in this context does not make sense to me. Also, the date doesn't seem to affect the output of the regression. Do you want something like that:
library(dplyr)
library(broom)
current_date <- Sys.Date()
lmtest <- lm( outcome_var ~ cumulative_var1_at_current_date + key_cat_var1, data = testdata)
coeff_results <- tidy(lmtest)
model_results <- glance(lmtest)
write.csv(coeff_results, paste0("coeff-results-", current_date, ".csv"))
write.csv(model_results, paste0("model-results", current_date, ".csv"))
If not please give an example of what is the current_date
, and how your data are affected by it.
After Clarification on the purpose of the current date:
So basically, you need to loop through the unique dates in the current_date
column. Then, you have to keep the data that correspond to each day. Here is a solution.
testdata$current_date <- as.Date(testdata$current_date, format = "%d/%m/%y")
all_dates <- unique(testdata$current_date)
for(i in 1:length(all_dates)) {
lmtest <- filter(testdata, current_date %in% all_dates[i]) %>%
lm( outcome_var ~ cumulative_var1_at_current_date + key_cat_var1, data = .)
coeff_results <- tidy(lmtest)
model_results <- glance(lmtest)
write.csv(coeff_results, paste0("coeff-results-", all_dates[i], ".csv"))
write.csv(model_results, paste0("model-results", all_dates[i], ".csv"))
}
Probably there is a more elegant way through subset in lm, but I personally prefer dplyr::filter
because it gives you more control.
Upvotes: 1