Reputation: 3621
I have a dataset similar to this:
city var value
a var1 0.19
b var1 0.67
c var1 0.19
a var2 0.14
b var2 0.38
c var2 0.27
a var3 0.59
b var3 0.42
c var3 0.27
a var4 0.28
b var4 0.37
c var4 0.91
And I need to create different data frames of city b with the rest of the cities (e.g. city b with city a, city b with city c, etc.). It is quite important that city b appears first in all variables for some algebra operations I do later on.
Example city b with city a:
city var value
b var1 0.67
a var1 0.19
b var2 0.38
a var2 0.14
b var3 0.42
a var3 0.59
b var4 0.37
a var4 0.28
Example city b with city c:
city var value
b var1 0.67
c var1 0.19
b var2 0.38
c var2 0.27
b var3 0.42
c var3 0.27
b var4 0.37
c var4 0.91
I have tried the following (one of my first loops ever), but it didn't work:
for (i in unique(df$city)) {
paste0("cityb",i) <- (df[df$city %in% c("cityb", "i"), ])
}
Do you know why it is not working? Any help or advice is hugely appreciated.
Upvotes: 1
Views: 133
Reputation: 886948
If the city column is not a factor, you could do the following (a slight modification of Andrie's code):
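# Pair city "b" with each other city ("a" and "c"); name the results dat1, dat2
lst <- setNames(lapply(letters[c(1, 3)], function(i) {
  # Stack the four "b" rows on top of the four rows for city i
  x1 <- rbind(dat[dat$city == "b", ], dat[dat$city == i, ])
  # Interleave the two halves: rows 1, 5, 2, 6, 3, 7, 4, 8
  indx <- seq(1, nrow(x1), by = 4) + rep(0:3, each = 2)
  x1[indx, ]
}), paste0("dat", 1:2))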
list2env(lst, envir=.GlobalEnv)
#<environment: R_GlobalEnv>
str(dat1)
#'data.frame': 8 obs. of 3 variables:
#$ city : chr "b" "a" "b" "a" ...
#$ var : chr "var1" "var1" "var2" "var2" ...
#$ value: num 0.67 0.19 0.38 0.14 0.42 0.59 0.37 0.28
str(dat2)
# 'data.frame': 8 obs. of 3 variables:
# $ city : chr "b" "c" "b" "c" ...
# $ var : chr "var1" "var1" "var2" "var2" ...
# $ value: num 0.67 0.19 0.38 0.27 0.42 0.27 0.37 0.91
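The index above assumes exactly four rows per city. As a rough sketch that avoids hardcoding that count, you could sort by var and use a logical key to put b first within each var. Here pair_with is a hypothetical helper name, the data are rebuilt from the question so the block runs on its own, and it assumes the var names already sort in the order you want:

```r
# Recreate the question's data, keeping city as character
dat <- data.frame(
  city  = rep(c("a", "b", "c"), times = 4),
  var   = rep(paste0("var", 1:4), each = 3),
  value = c(0.19, 0.67, 0.19, 0.14, 0.38, 0.27,
            0.59, 0.42, 0.27, 0.28, 0.37, 0.91),
  stringsAsFactors = FALSE
)

# Hypothetical helper: keep the reference city and one other city,
# then sort by var; within each var the reference rows (FALSE) sort
# before the other city's rows (TRUE)
pair_with <- function(data, ref, other) {
  x1 <- data[data$city %in% c(ref, other), ]
  x1[order(x1$var, x1$city != ref), ]
}

others <- setdiff(unique(dat$city), "b")
lst2 <- setNames(lapply(others, function(i) pair_with(dat, "b", i)),
                 paste0("b_", others))
lst2$b_a
```

This works for any number of cities and variables, as long as sorting var alphabetically gives the order you need.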
Upvotes: 0
Reputation: 179398
Here you go:
Step 1: recreate your data
dat <- read.table(text="
city var value
a var1 0.19
b var1 0.67
c var1 0.19
a var2 0.14
b var2 0.38
c var2 0.27
a var3 0.59
b var3 0.42
c var3 0.27
a var4 0.28
b var4 0.37
c var4 0.91
", header=TRUE)
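Step 2: Relevel city so that b is the first level. You will use this in step 3 to ensure the cities get ordered in the correct sequence. Since R 4.0.0, read.table() returns city as character by default, so convert it to a factor first:
dat$city <- relevel(factor(dat$city), "b")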
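To see what relevel() changes, here is a minimal sketch on a toy factor (the vector is made up for illustration). Only the order of the levels changes, not the values, and order() sorts a factor by level position rather than alphabetically:

```r
# Toy factor with the same three cities as the question
city <- factor(c("a", "b", "c", "a", "b", "c"))
levels(city)                        # alphabetical by default

# Move "b" to the front of the level set; the values are untouched
city_b <- relevel(city, ref = "b")
levels(city_b)                      # "b" now comes first

# order() follows level order, so the positions of the "b" entries come first
order(city_b)
```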
Step 3: Use lapply to create a list of data frames. The function you pass to lapply creates the subset (using logic similar to what you tried in your question) and then sorts it with the order() function:
lapply(
setdiff(levels(dat$city), "b"),
function(i){
ret <- dat[dat$city %in% c("b", i), ]
ret[order(ret$var, ret$city), ]
})
The result:
[[1]]
city var value
2 b var1 0.67
1 a var1 0.19
5 b var2 0.38
4 a var2 0.14
8 b var3 0.42
7 a var3 0.59
11 b var4 0.37
10 a var4 0.28
[[2]]
city var value
2 b var1 0.67
3 c var1 0.19
5 b var2 0.38
6 c var2 0.27
8 b var3 0.42
9 c var3 0.27
11 b var4 0.37
12 c var4 0.91
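As for why the loop in the question fails: paste0(...) on the left of <- is not a valid assignment target, "i" in quotes is the literal string rather than the loop variable, and the city is stored as "b", not "cityb". If you really want separate objects rather than a list, a sketch using assign() could look like this (the names cityb_a and cityb_c are just illustrative, and the data are rebuilt so the block runs on its own):

```r
# Recreate the question's data and make "b" the first factor level
dat <- data.frame(
  city  = rep(c("a", "b", "c"), times = 4),
  var   = rep(paste0("var", 1:4), each = 3),
  value = c(0.19, 0.67, 0.19, 0.14, 0.38, 0.27,
            0.59, 0.42, 0.27, 0.28, 0.37, 0.91),
  stringsAsFactors = FALSE
)
dat$city <- relevel(factor(dat$city), "b")

for (i in setdiff(levels(dat$city), "b")) {
  # i is unquoted here, so it refers to the loop variable
  ret <- dat[dat$city %in% c("b", i), ]
  ret <- ret[order(ret$var, ret$city), ]
  # assign() takes the object name as a string built with paste0()
  assign(paste0("cityb_", i), ret)
}

cityb_a
```

A named list is usually easier to loop over later, so treat assign() as a fallback.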
Upvotes: 2