Reputation: 1293
I need to loop over the levels of a factor in an R data.frame. Inside the loop I need to operate on subsets of the data.frame, each defined by a pair of levels: two consecutive unique levels of that factor.
Here is an example of what I've tried:
require(dplyr)
df <- data.frame(fac = rep(c("A", "B", "C"), 3))
for(i in levels(fac)){
if(i != levels(fac)[length(levels(fac))]){
df %>% filter(fac %in% c(i, i + 1))
}
}
I am trying to include level i and the level that follows it, but obviously the expression i + 1 won't do the trick. How can I get around this? Do I have to make the variable fac numeric, or is there a neater solution available?
EDIT: The output (for this example) should be these two data.frames:
dfAB <- df %>% filter(fac %in% c("A", "B"))
dfBC <- df %>% filter(fac %in% c("B", "C"))
Upvotes: 5
Views: 16987
Reputation: 39154
A solution without a loop.
library(dplyr)
# Create example data frame
df <- data.frame(fac = rep(c("A", "B", "C"), 3),
stringsAsFactors = TRUE)
# Create all pairwise combinations of the unique values of fac
m <- combn(unique(df$fac), m = 2)
# Compare the integer codes of each pair and find the pairs that do not differ by 1
re <- which(as.numeric(m[2, ]) - as.numeric(m[1, ]) != 1)
# Drop those pairs and build a data frame with one column per remaining pair
m2 <- as.data.frame.matrix(m[, -re])
# Filter df by m2
df_final <- lapply(m2, function(col){
df %>% filter(fac %in% col)
})
df_final
# $V1
# fac
# 1 A
# 2 B
# 3 A
# 4 B
# 5 A
# 6 B
#
# $V2
# fac
# 1 B
# 2 C
# 3 B
# 4 C
# 5 B
# 6 C
Upvotes: 1
Reputation: 7724
The problem is that you loop over the levels of fac, which form a character vector, so R cannot add 1 to i.
The following works:
library(dplyr)
df <- data.frame(fac = rep(c("A", "B", "C"), 3))
df <- df %>%
mutate(fac = factor(fac, levels = c("A", "B", "C")))
for(i in seq_along(levels(df$fac))){
if(i != length(levels(df$fac))){
df %>% filter(fac %in% c(levels(fac)[i], levels(fac)[i+1])) %>% print()
}
}
# fac
# 1 A
# 2 B
# 3 A
# 4 B
# 5 A
# 6 B
# fac
# 1 B
# 2 C
# 3 B
# 4 C
# 5 B
# 6 C
The fac column has to be a factor (otherwise levels() returns NULL and the filtering doesn't work). I added the print() inside the loop to print the result, but you probably want to store it somewhere (e.g. in a list).
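Storing the results in a list could look roughly like the sketch below. It uses base R subsetting rather than dplyr so that it runs without extra packages; the names lv and pairs, and the "AB"/"BC" naming scheme, are my own choices for illustration and not part of the question.

```r
# Example data; fac is made a factor explicitly so that levels() is defined
df <- data.frame(
  fac = factor(rep(c("A", "B", "C"), 3), levels = c("A", "B", "C"))
)

lv <- levels(df$fac)

# One list element per pair of consecutive levels: (A, B) and (B, C).
# seq_len(length(lv) - 1) is empty when the factor has only one level.
pairs <- lapply(seq_len(length(lv) - 1), function(i) {
  df[df$fac %in% lv[c(i, i + 1)], , drop = FALSE]
})

# Name each element after its pair of levels (hypothetical naming scheme)
names(pairs) <- paste0(head(lv, -1), tail(lv, -1))
```

pairs$AB and pairs$BC then correspond to the dfAB and dfBC data.frames from the question. With dplyr loaded, the subsetting line could be replaced by df %>% filter(fac %in% lv[c(i, i + 1)]).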
Upvotes: 7