user11325078

How can I program a loop in R?

How can I program a loop so that all eight tables are calculated one after the other?

The code:

dt_M1_I <- M1_I
dt_M1_I <- data.table(dt_M1_I)
dt_M1_I[,I:=as.numeric(gsub(",",".",I))]
dt_M1_I[,day:=substr(t,1,10)]
dt_M1_I[,hour:=substr(t,12,16)]
dt_M1_I_median <- dt_M1_I[,list(median_I=median(I,na.rm = TRUE)),by=.(day,hour)]

This should be calculated for:

M1_I
M2_I
M3_I
M4_I
M1_U
M2_U
M3_U
M4_U

Thank you very much for your help!

Upvotes: 0

Views: 59

Answers (2)

Konrad Rudolph

Reputation: 546073

Whenever you have several variables of the same kind, especially when you find yourself numbering them, as you did, step back and replace them with a single list variable. I do not recommend doing what the other answer suggested.

That is, instead of M1_I to M4_I and M1_U to M4_U, have two variables m_i and m_u (using lower case in variable names is conventional), which are each lists of four data.tables.
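If the tables already exist as separate variables, one possible way to collect them is sketched below. It relies on base R's mget(), which looks up several objects by name and returns them as a named list. The toy data frames only stand in for the real tables; their columns (t and I) and values are assumptions for illustration.

```r
# Toy stand-ins for the real tables (assumed layout: columns t and I).
M1_I <- data.frame(t = "2019-04-01 10:00:00", I = "1,5")
M2_I <- data.frame(t = "2019-04-01 10:00:00", I = "2,5")
M3_I <- data.frame(t = "2019-04-01 10:00:00", I = "3,5")
M4_I <- data.frame(t = "2019-04-01 10:00:00", I = "4,5")

# Build the variable names "M1_I" ... "M4_I" and fetch the objects
# they refer to as one named list.
m_i <- mget(paste0("M", 1:4, "_I"))

# The list elements keep the original variable names.
names(m_i)
```

The same call with "_U" in place of "_I" would give m_u. Ideally the tables are read into a list in the first place, so that the numbered variables never exist.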

Alternatively, you might want to use a single variable, m, which contains nested lists of data.tables (m = list(list(i = …, u = …), …)).

Assuming the first, you can then iterate over them as follows:

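library(data.table)

give_this_a_meaningful_name = function (df) {
    # data.table() creates a new table, so the caller's data is not modified.
    dt <- data.table(df)
    # Convert decimal commas to decimal points, then parse as numbers.
    dt[, I := as.numeric(gsub(",", ".", I))]
    # Split the timestamp string into a date part and an HH:MM part.
    dt[, day := substr(t, 1, 10)]
    dt[, hour := substr(t, 12, 16)]
    # The last expression is the return value: one median per day and hour.
    dt[, list(median_I = median(I, na.rm = TRUE)), by = .(day, hour)]
}

m_i_median = lapply(m_i, give_this_a_meaningful_name)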

(Note also the introduction of consistent spacing around operators; good readability is paramount for writing bug-free code.)
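To make the approach concrete, here is a minimal end-to-end sketch on made-up data. The function name median_per_hour and the toy values are hypothetical. The function body repeats the steps from the function above, and the final rbindlist() call is an optional extra that stacks the per-table results into a single table.

```r
library(data.table)

# Hypothetical name; same steps as the function above.
median_per_hour <- function(df) {
  dt <- data.table(df)
  dt[, I := as.numeric(gsub(",", ".", I))]
  dt[, day := substr(t, 1, 10)]
  dt[, hour := substr(t, 12, 16)]
  dt[, list(median_I = median(I, na.rm = TRUE)), by = .(day, hour)]
}

# Toy input (assumed layout: timestamp string t, decimal-comma string I).
m_i <- list(
  M1_I = data.frame(t = c("2019-04-01 10:00:05", "2019-04-01 10:00:35"),
                    I = c("1,0", "3,0")),
  M2_I = data.frame(t = "2019-04-01 10:01:00", I = "2,5")
)

# One result table per input table, in a list with the same names.
m_i_median <- lapply(m_i, median_per_hour)

# Optionally stack the results; idcol stores each list name in a column.
all_medians <- rbindlist(m_i_median, idcol = "table")
print(all_medians)
```

Because the results stay in a named list, a single table is still available as m_i_median$M1_I or m_i_median[["M1_I"]].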

Upvotes: 4

morgan121

Reputation: 2253

You can use a combination of a for loop and the get/assign functions like this:

# create a vector of the data.frame names
dts <- c('M1_I', 'M2_I', 'M3_I', 'M4_I', 'M1_U', 'M2_U', 'M3_U', 'M4_U')

# iterate over each dataframe
for (dt in dts){

  # get the actual dataframe (not the string name of it)
  tmp <- get(dt)
  tmp <- data.table(tmp)
  tmp[, I:=as.numeric(gsub(",",".",I))]
  tmp[, day:=substr(t,1,10)]
  tmp[, hour:=substr(t,12,16)]
  tmp <- tmp[,list(median_I=median(I,na.rm = TRUE)),by=.(day,hour)]

  # assign the modified dataframe to the name you want (the paste adds the 'dt_' to the front)
  assign(paste0('dt_', dt), tmp)

}

Upvotes: 0
