User981636

Reputation: 3621

Break down data frame with for loop

I have a dataset similar to this:

   city var value
    a   var1    0.19
    b   var1    0.67
    c   var1    0.19
    a   var2    0.14
    b   var2    0.38
    c   var2    0.27
    a   var3    0.59
    b   var3    0.42
    c   var3    0.27
    a   var4    0.28
    b   var4    0.37
    c   var4    0.91

I need to create a separate data frame pairing city b with each of the other cities (e.g. city b with city a, city b with city c, etc.). It is important that city b appears first within each variable, because of some algebra operations I do later on.

Example city b with city a:

city    var value
b   var1    0.67
a   var1    0.19
b   var2    0.38
a   var2    0.14
b   var3    0.42
a   var3    0.59
b   var4    0.37
a   var4    0.28

Example city b with city c:

city    var value
b   var1    0.67
c   var1    0.19
b   var2    0.38
c   var2    0.27
b   var3    0.42
c   var3    0.27
b   var4    0.37
c   var4    0.91

I have tried the following (one of my first loops ever), but it didn't work:

for (i in unique(df$city)) {
  paste0("cityb",i) <- (df[df$city %in% c("cityb", "i"), ])
}

Do you know why it is not working? Any help or advice is hugely appreciated.

Upvotes: 1

Views: 133

Answers (2)

akrun

Reputation: 886948

If the city column is not a factor, you could do the following (a slight modification of Andrie's code):

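lst <- setNames(lapply(letters[c(1, 3)], function(i) {
  # stack the "b" rows on top of the rows for city i
  x1 <- rbind(dat[dat$city == "b", ], dat[dat$city == i, ])
  # interleave the two blocks: rows 1, 5, 2, 6, 3, 7, 4, 8
  indx <- seq(1, nrow(x1), by = 4) + rep(0:3, each = 2)
  x1[indx, ]
}), paste0("dat", 1:2))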

list2env(lst, envir=.GlobalEnv)
#<environment: R_GlobalEnv>

 str(dat1)
 #'data.frame': 8 obs. of  3 variables:
 #$ city : chr  "b" "a" "b" "a" ...
 #$ var  : chr  "var1" "var1" "var2" "var2" ...
 #$ value: num  0.67 0.19 0.38 0.14 0.42 0.59 0.37 0.28
 str(dat2)
 # 'data.frame':    8 obs. of  3 variables:
 # $ city : chr  "b" "c" "b" "c" ...
 # $ var  : chr  "var1" "var1" "var2" "var2" ...
 # $ value: num  0.67 0.19 0.38 0.27 0.42 0.27 0.37 0.91
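
To see why the index vector interleaves the rows, here is a small sketch of the arithmetic on its own. It assumes four rows per city, as in the sample data; the names n_vars, starts and offsets are only for illustration.

```r
# x1 stacks the 4 "b" rows (positions 1-4) on top of the
# 4 rows of the other city (positions 5-8).
n_vars  <- 4                                 # assumed: rows per city
starts  <- seq(1, 2 * n_vars, by = n_vars)   # 1 5: first row of each block
offsets <- rep(0:(n_vars - 1), each = 2)     # 0 0 1 1 2 2 3 3
indx    <- starts + offsets                  # starts is recycled to length 8
indx
# [1] 1 5 2 6 3 7 4 8
```

Note that this only works if both cities have exactly the same number of rows, in the same var order.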

Upvotes: 0

Andrie

Reputation: 179398

Here you go:

Step 1: recreate your data

dat <- read.table(text="
city var value
a   var1    0.19
b   var1    0.67
c   var1    0.19
a   var2    0.14
b   var2    0.38
c   var2    0.27
a   var3    0.59
b   var3    0.42
c   var3    0.27
a   var4    0.28
b   var4    0.37
c   var4    0.91
", header=TRUE)

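Step 2: Make city a factor and relevel it so that b is the first level. You will use this in step 3 to ensure the cities are ordered in the correct sequence. (From R 4.0.0 onwards, read.table() no longer converts strings to factors by default, so the explicit factor() call is needed; it is harmless if city is already a factor.)

dat$city <- relevel(factor(dat$city), "b")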
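
A quick way to check what relevel() does to the sort order is to try it on a toy factor. This is just an illustrative sketch; the vector city below is made up and is not the column from dat.

```r
city <- factor(c("a", "b", "c", "a", "b", "c"))
levels(city)
# [1] "a" "b" "c"

city <- relevel(city, "b")   # move "b" to the front of the levels
levels(city)
# [1] "b" "a" "c"

# order() on a factor follows the level order, so the "b" entries come first
order(city)
# [1] 2 5 1 4 3 6
```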

Step 3: Use lapply to create a list of data frames. The function you pass to lapply creates the subset (using logic similar to what you tried in your question) and then sorts it by var and city with order(). Because b is now the first level of city, it sorts ahead of the other city within each var:

lapply(
  setdiff(levels(dat$city), "b"),
  function(i){
    ret <- dat[dat$city %in% c("b", i), ]
    ret[order(ret$var, ret$city), ]
  })

The result:

[[1]]
   city  var value
2     b var1  0.67
1     a var1  0.19
5     b var2  0.38
4     a var2  0.14
8     b var3  0.42
7     a var3  0.59
11    b var4  0.37
10    a var4  0.28

[[2]]
   city  var value
2     b var1  0.67
3     c var1  0.19
5     b var2  0.38
6     c var2  0.27
8     b var3  0.42
9     c var3  0.27
11    b var4  0.37
12    c var4  0.91
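
As for why the loop in the question fails: paste0("cityb", i) <- ... is not a valid assignment target in R, "i" in quotes is the literal string rather than the loop variable, and the city is stored as "b", not "cityb". Below is a minimal sketch of a working loop that collects the results in a named list. The names others, pairs and pair_df are hypothetical, and the data are re-created inline so the block runs on its own.

```r
dat <- data.frame(
  city  = rep(c("a", "b", "c"), times = 4),
  var   = rep(paste0("var", 1:4), each = 3),
  value = c(0.19, 0.67, 0.19, 0.14, 0.38, 0.27,
            0.59, 0.42, 0.27, 0.28, 0.37, 0.91),
  stringsAsFactors = FALSE
)

others <- setdiff(unique(dat$city), "b")   # every city except "b"
pairs  <- list()
for (i in others) {
  # i is the loop variable, so it must not be quoted
  pair_df <- dat[dat$city %in% c("b", i), ]
  # sort by var; within each var, FALSE sorts before TRUE, so "b" comes first
  pair_df <- pair_df[order(pair_df$var, pair_df$city != "b"), ]
  pairs[[paste0("b_", i)]] <- pair_df
}

names(pairs)
# [1] "b_a" "b_c"
pairs$b_a$city
# [1] "b" "a" "b" "a" "b" "a" "b" "a"
```

If you really want separate variables, assign(paste0("cityb", i), pair_df) inside the loop would create them, but a list is usually easier to work with afterwards.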

Upvotes: 2
