supreeth2812

Reputation: 117

R - How to run nested loops over list of variables and their values

I am trying to run a function that takes a filter column name and a filter value as input.

get_rules <- function(data_set, filter_col, filter_value) {
    # ... do something ...
    return (list(df1, df2))
}

I have to run this for a set of columns with different values.

year = c(2018,2019,2020)
region = c('AMER', 'APAC', 'EMEA')

I want to run this function in a loop over every value in each vector. For that, I need to pass both the variable name and each of its values, so I am trying a nested loop.

columns = list(year, region )

df_a <- data.frame()
df_b <- data.frame()

for (i in columns){
  print("out loop ")
  print(i)
  for (j in i){
    print("in loop ")
    print(i)
    print(j)
    #df_loop <- user_func(df, i, j)
    #df_a <- rbind(df_a, df_loop[1])
    #df_b <- rbind(df_b, df_loop[2])
  }
}

>> Output is
[1] "out loop "
[1] 2018 2019 2020
[1] "in loop "
[1] 2018 2019 2020
[1] 2018
[1] "in loop "
[1] 2018 2019 2020
[1] 2019
[1] "in loop "
[1] 2018 2019 2020
[1] 2020
[1] "out loop "
[1] "AMER" "APAC" "EMEA"
[1] "in loop "
[1] "AMER" "APAC" "EMEA"
[1] "AMER"
[1] "in loop "
[1] "AMER" "APAC" "EMEA"
[1] "APAC"
[1] "in loop "
[1] "AMER" "APAC" "EMEA"
[1] "EMEA"

I am primarily a Python user, and this is pretty straightforward in Python, but I am unable to write it in R.

>> Output required is 
[1] "out loop "
[1] "year"
[1] "in loop "
[1] 2018
[1] "in loop "
[1] 2019
[1] "in loop "
[1] 2020
[1] "out loop "
[1] "region"
[1] "in loop "
[1] "AMER"
[1] "in loop "
[1] "APAC"
[1] "in loop "
[1] "EMEA"

Upvotes: 0

Views: 942

Answers (2)

oszkar

Reputation: 992

Your solution will work, slightly modified, if you use a data.frame instead of a list and iterate over the data.frame's column names in the outer loop (alternatively, you can keep the list, add names to it, and iterate over the list's names):

year = c(2018,2019,2020)
region = c('AMER', 'APAC', 'EMEA')
columns = data.frame(year, region )

for (i in names(columns)) {
  print("out loop")
  print(i)
  for (j in columns[[i]]) {
    print("in loop")
    print(j)
  }
}
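The named-list alternative mentioned in the parenthesis can be sketched as follows. One practical difference, as far as I know: data.frame(year, region) stops with an error when the vectors have lengths that cannot be recycled (for example 3 and 2), while a list accepts vectors of any length. The filter values below are made up for illustration and are not the asker's data.

```r
# Hypothetical filter values; the lengths deliberately differ (3 vs 2)
filters <- list(
  year = c(2018, 2019, 2020),
  region = c("AMER", "APAC")
)

visited <- character(0)
for (col in names(filters)) {      # col is the name: "year", then "region"
  for (val in filters[[col]]) {    # val is one value stored under that name
    # paste() converts numeric values to text for display
    visited <- c(visited, paste(col, val, sep = " = "))
  }
}
print(visited)
```

The loop body is the same as in the data.frame version; only the container changes.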

Upvotes: 1

Annet

Reputation: 866

Your required output does not match the given input (that is, columns = list(year, region)).

According to your required output, you want the outer loop to print "region", but neither "region" nor "year" is stored in your list/df; only the values of year and region are in there. This is caused by the way you create your list. It is not clear to me whether your actual data is unnamed or does have proper names. Either way, you cannot print something that is not in the data. To solve this issue, I added the names:

names(columns) <- c("year","region")

or by simply doing that when creating the list:

columns = list(year = year, region = region)
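A quick check, as a minimal sketch, of why the names have to be supplied: list(year, region) stores only the values, so there are no names to loop over. The variable names unnamed and named are just illustrative.

```r
year <- c(2018, 2019, 2020)
region <- c("AMER", "APAC", "EMEA")

unnamed <- list(year, region)                # values only
named <- list(year = year, region = region)  # values plus names

print(names(unnamed))  # NULL: the variable names are not stored
print(names(named))    # "year" "region"
```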

When you now let i iterate over the names of columns, it will be either "year" or "region", which can be printed the way you specified in the required output. However, since i is now the name (rather than the values, as in your example), you can no longer loop j over i. Well, technically you can, but it doesn't make sense because j would be equal to i. Instead, you want j to loop over the values stored in columns under the name i. With both changes applied to your for loops, you get:

for (i in names(columns)) {
    print("out loop ")
    print(i)
    for (j in columns[[i]]) {
        print("in loop ")
        print(j)
    }
}

This will give:

[1] "out loop "
[1] "year"
[1] "in loop "
[1] 2018
[1] "in loop "
[1] 2019
[1] "in loop "
[1] 2020
[1] "out loop "
[1] "region"
[1] "in loop "
[1] "AMER"
[1] "in loop "
[1] "APAC"
[1] "in loop "
[1] "EMEA"

I do not know why you had print("in loop "), print(i) and print(j) in the inner loop of your original example, as that does not correspond with your required output (and it also makes the print("out loop ") and print(i) a bit redundant, IMO). You could still do that, but it would give:

[1] "out loop "
[1] "year"
[1] "in loop "
[1] "year"
[1] 2018
[1] "in loop "
[1] "year"
[1] 2019
[1] "in loop "
[1] "year"
[1] 2020
[1] "out loop "
[1] "region"
[1] "in loop "
[1] "region"
[1] "AMER"
[1] "in loop "
[1] "region"
[1] "APAC"
[1] "in loop "
[1] "region"
[1] "EMEA"

Some personal preferences, though: I do not know exactly why you want to print this, but for loops are relatively slow, especially when you nest them or when your data becomes larger. You could apply the function with an apply-family function such as lapply, or with map from purrr, and write the results into a new column or assign them to an object in the environment.
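As a rough sketch of that apply-based approach, the nested loop can be replaced by nested lapply calls whose results are combined once at the end, instead of growing df_a and df_b with rbind inside the loop. The body of get_rules and the data frame df below are hypothetical stand-ins, since the question does not show them; only the calling pattern is the point.

```r
# Hypothetical stand-in for the asker's get_rules(); the real body is not shown
get_rules <- function(data_set, filter_col, filter_value) {
  subset_rows <- data_set[data_set[[filter_col]] == filter_value, , drop = FALSE]
  summary_row <- data.frame(
    filter_col = filter_col,
    filter_value = as.character(filter_value),
    n_rows = nrow(subset_rows)
  )
  list(subset_rows, summary_row)
}

# Made-up example data
df <- data.frame(
  year = c(2018, 2018, 2019, 2020),
  region = c("AMER", "APAC", "EMEA", "AMER"),
  sales = c(10, 20, 30, 40)
)

columns <- list(
  year = c(2018, 2019, 2020),
  region = c("AMER", "APAC", "EMEA")
)

# Outer lapply walks the names, inner lapply walks the values under each name.
# unlist(recursive = FALSE) flattens one level, giving one entry per pair.
results <- unlist(
  lapply(names(columns), function(col) {
    lapply(columns[[col]], function(val) get_rules(df, col, val))
  }),
  recursive = FALSE
)

# Combine the first and second element of every result in one step each
df_a <- do.call(rbind, lapply(results, `[[`, 1))
df_b <- do.call(rbind, lapply(results, `[[`, 2))

print(df_b)
```

Collecting the pieces in a list and calling rbind once avoids repeatedly copying a growing data frame, which is usually where loop-based code loses most of its time.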

Upvotes: 1
