Nick

Reputation: 1

How to loop over data frames by name in R

I have trouble looping over data frames by name and have no idea how to fix it. I am using census data from multiple years and have to apply the same operations to multiple datasets.

Here is a simplified example of what I want to do. I create a data frame called df1 and make two copies of it, called df2 and df3. Let's say that for each data frame I want to create a third variable, v3, where v3 = v1 + v2.

The loop that I wrote doesn't work, and I don't know how to loop over data frames correctly by name.

v1<-c(1:10)
v2<-c(1:10)
df1<-data.frame(v1,v2)
df2<-df1
df3<-df1
x<-c("df1","df2","df3")
for (i in x) {v3<-v1+v2}

Upvotes: 0

Views: 62

Answers (1)

David

Reputation: 10232

What you are looking for are the get() and assign() functions. For example, you can use them like this:

df1 <- data.frame(v1 = 1:10, v2 = 1:10)
df2 <- df1
df3 <- df1

x <- c("df1","df2","df3")

for (i in x) {
  # load the dataset "i" to the tmp-variable
  tmp <- get(i)
  # do something
  tmp$v3 <- tmp$v1 + tmp$v2
  # assign the tmp variable to the value of "i" again
  assign(i, tmp)
}

# let's check the results
df1
#>    v1 v2 v3
#> 1   1  1  2
#> 2   2  2  4
#> 3   3  3  6
#> 4   4  4  8
#> 5   5  5 10
#> 6   6  6 12
#> 7   7  7 14
#> 8   8  8 16
#> 9   9  9 18
#> 10 10 10 20

df2
#>    v1 v2 v3
#> 1   1  1  2
#> 2   2  2  4
#> 3   3  3  6
#> 4   4  4  8
#> 5   5  5 10
#> 6   6  6 12
#> 7   7  7 14
#> 8   8  8 16
#> 9   9  9 18
#> 10 10 10 20

df3
#>    v1 v2 v3
#> 1   1  1  2
#> 2   2  2  4
#> 3   3  3  6
#> 4   4  4  8
#> 5   5  5 10
#> 6   6  6 12
#> 7   7  7 14
#> 8   8  8 16
#> 9   9  9 18
#> 10 10 10 20
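
To see why the loop in the question has no effect on the data frames, it may help to look at what i actually holds on each iteration. The following is a minimal sketch using only base R and the same df1 as above. In the original loop, v3 <- v1 + v2 just adds the two standalone vectors and overwrites a single vector called v3 each time; it never touches df1, df2 or df3.

```r
df1 <- data.frame(v1 = 1:10, v2 = 1:10)

i <- "df1"

# i is only a character string, not the data frame itself
class(i)
#> [1] "character"
is.data.frame(i)
#> [1] FALSE

# get() looks up the object stored under that name
is.data.frame(get(i))
#> [1] TRUE

# assign() creates or overwrites the object whose name is stored in i
tmp <- get(i)
tmp$v3 <- tmp$v1 + tmp$v2
assign(i, tmp)
names(df1)
#> [1] "v1" "v2" "v3"
```
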

Improvement

Having said that, you probably don't want to do it this way. Instead, keep the data frames in a list and use the apply family of functions.

I tend to use lapply a lot. In your case it would look like this:

# create some data again
df <- data.frame(v1 = 1:10, v2 = 1:10)

# create three data frames in a list
# here you would, for example, load the data frames from your source into the list
df_list <- lapply(1:3, function(x) df)

str(df_list)
#> List of 3
#>  $ :'data.frame':    10 obs. of  2 variables:
#>   ..$ v1: int [1:10] 1 2 3 4 5 6 7 8 9 10
#>   ..$ v2: int [1:10] 1 2 3 4 5 6 7 8 9 10
#>  $ :'data.frame':    10 obs. of  2 variables:
#>   ..$ v1: int [1:10] 1 2 3 4 5 6 7 8 9 10
#>   ..$ v2: int [1:10] 1 2 3 4 5 6 7 8 9 10
#>  $ :'data.frame':    10 obs. of  2 variables:
#>   ..$ v1: int [1:10] 1 2 3 4 5 6 7 8 9 10
#>   ..$ v2: int [1:10] 1 2 3 4 5 6 7 8 9 10

# do some operations:
df_list2 <- lapply(df_list, function(d) {
  # do something
  d$v3 <- d$v1 + 100 * d$v2
  return(d)
})

df_list2
#> [[1]]
#>    v1 v2   v3
#> 1   1  1  101
#> 2   2  2  202
#> 3   3  3  303
#> 4   4  4  404
#> 5   5  5  505
#> 6   6  6  606
#> 7   7  7  707
#> 8   8  8  808
#> 9   9  9  909
#> 10 10 10 1010
#> 
#> [[2]]
#>    v1 v2   v3
#> 1   1  1  101
#> 2   2  2  202
#> 3   3  3  303
#> 4   4  4  404
#> 5   5  5  505
#> 6   6  6  606
#> 7   7  7  707
#> 8   8  8  808
#> 9   9  9  909
#> 10 10 10 1010
#> 
#> [[3]]
#>    v1 v2   v3
#> 1   1  1  101
#> 2   2  2  202
#> 3   3  3  303
#> 4   4  4  404
#> 5   5  5  505
#> 6   6  6  606
#> 7   7  7  707
#> 8   8  8  808
#> 9   9  9  909
#> 10 10 10 1010
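
If the data frames already exist as separate objects, as in the question, one possible way to bridge the two approaches is to collect them into a named list first. The sketch below assumes that df1, df2 and df3 are present in the current environment under the names stored in x. It uses the base R functions mget() to build the list and list2env() to copy the results back, and that last step is optional if you are happy to keep working with the list.

```r
df1 <- data.frame(v1 = 1:10, v2 = 1:10)
df2 <- df1
df3 <- df1

x <- c("df1", "df2", "df3")

# collect the existing data frames into a list named after the objects
df_list <- mget(x)
names(df_list)
#> [1] "df1" "df2" "df3"

# apply the same operation to every data frame in the list
df_list <- lapply(df_list, function(d) {
  d$v3 <- d$v1 + d$v2
  d
})

# optional: copy the list elements back to objects named df1, df2, df3
invisible(list2env(df_list, envir = environment()))
names(df2)
#> [1] "v1" "v2" "v3"
```

Keeping the names on the list means you can still refer to a single year as df_list$df2 or df_list[["df2"]] without going back to get() and assign().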

Upvotes: 2
