Break down data frame with for loop

Question

I have a dataset similar to this:

   city var value
    a   var1    0.19
    b   var1    0.67
    c   var1    0.19
    a   var2    0.14
    b   var2    0.38
    c   var2    0.27
    a   var3    0.59
    b   var3    0.42
    c   var3    0.27
    a   var4    0.28
    b   var4    0.37
    c   var4    0.91

I need to create separate data frames, each pairing city b with one of the other cities (e.g. city b with city a, city b with city c, etc.). It is important that city b appears first within each variable, because of some algebra operations I do later on.

Example city b with city a:

city    var value
b   var1    0.67
a   var1    0.19
b   var2    0.38
a   var2    0.14
b   var3    0.42
a   var3    0.59
b   var4    0.37
a   var4    0.28

Example city b with city c:

city    var value
b   var1    0.67
c   var1    0.19
b   var2    0.38
c   var2    0.27
b   var3    0.42
c   var3    0.27
b   var4    0.37
c   var4    0.91

I have tried the following (one of my first loops ever), but it didn't work:

for (i in unique(df$city)) {
  paste0("cityb",i) <- (df[df$city %in% c("cityb", "i"), ])
}

Do you know why it is not working? Any help or advice is hugely appreciated.

Andrie · Accepted Answer

Here you go:

Step 1: recreate your data

dat <- read.table(text="
city var value
a   var1    0.19
b   var1    0.67
c   var1    0.19
a   var2    0.14
b   var2    0.38
c   var2    0.27
a   var3    0.59
b   var3    0.42
c   var3    0.27
a   var4    0.28
b   var4    0.37
c   var4    0.91
", header=TRUE)

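Step 2: Convert city to a factor and relevel it so that b is the first level. You will use this in step 3 to ensure the cities are ordered in the correct sequence. (Since R 4.0.0, read.table() no longer converts strings to factors by default, so the explicit factor() call is needed.)

dat$city <- relevel(factor(dat$city), "b")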
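To see what the releveling does in isolation, here is a minimal sketch. The vector `city` below is a toy stand-in for dat$city and is not part of the original data:

```r
# Toy character vector standing in for dat$city (illustrative only)
city <- c("a", "b", "c", "a", "b", "c")

# factor() sorts the levels alphabetically; relevel() moves "b" to the front
f <- relevel(factor(city), ref = "b")

levels(f)   # the level order now starts with "b"
order(f)    # indices of the "b" entries come before those of "a" and "c"
```

Because order() sorts a factor by its level order rather than alphabetically, putting b first in the levels is what makes b sort ahead of the other cities.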

Step 3: Use lapply to create a list of data frames. The function you pass to lapply creates the subset (using logic similar to what you tried in your question) and then sorts it by var and city with order():

lapply(
  setdiff(levels(dat$city), "b"),
  function(i){
    ret <- dat[dat$city %in% c("b", i), ]
    ret[order(ret$var, ret$city), ]
  })
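If you would rather look up each data frame by name than by position, one option is to store the list and name its elements. This is only a sketch: the names `others` and `pair_list` and the "b_" prefix are illustrative choices, not something from the question.

```r
# Recreate the example data (same values as in the question)
dat <- data.frame(
  city  = rep(c("a", "b", "c"), times = 4),
  var   = rep(paste0("var", 1:4), each = 3),
  value = c(0.19, 0.67, 0.19, 0.14, 0.38, 0.27,
            0.59, 0.42, 0.27, 0.28, 0.37, 0.91)
)
dat$city <- relevel(factor(dat$city), "b")

# Every city except the reference city "b"
others <- setdiff(levels(dat$city), "b")

# Same subsetting and sorting as in step 3, but keep the result
pair_list <- lapply(others, function(i) {
  ret <- dat[dat$city %in% c("b", i), ]
  ret[order(ret$var, ret$city), ]
})

# Hypothetical naming scheme: "b_a", "b_c", ...
names(pair_list) <- paste0("b_", others)

pair_list$b_a   # city b paired with city a
```

Keeping the data frames in one named list, instead of creating separate variables, makes it easy to apply the later algebra to all pairs with another lapply call.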

The result:

[[1]]
   city  var value
2     b var1  0.67
1     a var1  0.19
5     b var2  0.38
4     a var2  0.14
8     b var3  0.42
7     a var3  0.59
11    b var4  0.37
10    a var4  0.28

[[2]]
   city  var value
2     b var1  0.67
3     c var1  0.19
5     b var2  0.38
6     c var2  0.27
8     b var3  0.42
9     c var3  0.27
11    b var4  0.37
12    c var4  0.91
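As for why the loop in the question fails: the left-hand side of `<-` has to be a name, so a paste0() call cannot be assigned to; "i" in quotes is the literal string "i" rather than the loop variable; and the data contain "b", not "cityb". If you prefer to stay with a for loop, a minimal sketch that avoids these problems stores each subset in a list. The names `result` and `pair` are illustrative, and the sort uses a logical key instead of the releveled factor from step 2:

```r
# Recreate the example data (same values as in the question)
dat <- data.frame(
  city  = rep(c("a", "b", "c"), times = 4),
  var   = rep(paste0("var", 1:4), each = 3),
  value = c(0.19, 0.67, 0.19, 0.14, 0.38, 0.27,
            0.59, 0.42, 0.27, 0.28, 0.37, 0.91)
)

result <- list()
for (i in setdiff(unique(as.character(dat$city)), "b")) {
  # i is used unquoted, so it holds the current city ("a", then "c")
  pair <- dat[dat$city %in% c("b", i), ]
  # Sort by var; within each var, FALSE sorts before TRUE, so "b" comes first
  pair <- pair[order(pair$var, pair$city != "b"), ]
  # Assign into a named list element instead of a pasted variable name
  result[[paste0("cityb", i)]] <- pair
}

names(result)
```

Each element can then be retrieved with, for example, result[["cityba"]].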
