Reputation: 69
This is an extension of the StackOverflow question "Subset Data Based On Elements In List", which answered the problem of how to create a list of new dfs, each constructed by subsetting the original one on a grouping factor variable.
The challenge I am encountering is that I need to create the dfs using more than one grouping variable.
To generalise the problem, I have created this toy dataset, which has the daily amount of rain as the response variable, and the temperature range and cloudiness of that day as classifiers.
rain <- c(2, 0, 4, 25, 3, 9, 4, 0, 4, 0, 8, 35)
temp <- as.factor(c("Warm","Cold","Hot","Cold","Warm","Cold","Cold","Warm","Warm","Hot","Cold", "Cold"))
clouds <- as.factor(c("Some","Lots","None","Lots","None","None","Lots","Some","Some","Lots","None", "Some"))
df <- data.frame(rain, temp, clouds)
With the following code, I can produce three new dataframes grouped on the temp variable, all combined into a single list (df_1A):
temp_levels <- unique(as.character(df$temp))
df_1A <- lapply(temp_levels, function(x){subset(df, temp == x)})
And ditto for three new dataframes grouped by cloudiness (df_1B):
cloud_levels <- unique(as.character(df$clouds))
df_1B <- lapply(cloud_levels, function(x){subset(df, clouds == x)})
However, I have not been able to come up with a simple, elegant way to produce the 9 dataframes, one for each unique combination of temp and cloudiness.
Thanks
Upvotes: 1
Views: 750
Reputation: 388982
You could use split to divide the data based on the unique combinations of temp and clouds.
df_1 <- split(df, list(df$temp, df$clouds))
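One thing to watch for with this toy data: only seven of the nine temp/clouds combinations actually occur, so the call above returns nine elements, two of which (Hot.Some and Warm.Lots) are zero-row data frames. Here is a minimal sketch, assuming the empty groups are unwanted, that uses the drop argument of split. The names df_all and df_nonempty are only illustrative.

```r
rain <- c(2, 0, 4, 25, 3, 9, 4, 0, 4, 0, 8, 35)
temp <- as.factor(c("Warm","Cold","Hot","Cold","Warm","Cold","Cold","Warm","Warm","Hot","Cold","Cold"))
clouds <- as.factor(c("Some","Lots","None","Lots","None","None","Lots","Some","Some","Lots","None","Some"))
df <- data.frame(rain, temp, clouds)

# Default behaviour: one element per combination of factor levels,
# including combinations that never occur in the data
df_all <- split(df, list(df$temp, df$clouds))
length(df_all)

# drop = TRUE keeps only the combinations that are actually present
df_nonempty <- split(df, list(df$temp, df$clouds), drop = TRUE)
length(df_nonempty)
names(df_nonempty)

# Elements are named "<temp>.<clouds>", so one group can be pulled out by name
df_nonempty[["Cold.Lots"]]
```

Whether to keep the empty elements depends on what happens next: keeping them gives a list of predictable length, while dropping them avoids having to guard against zero-row data frames later.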
Upvotes: 3
Reputation: 3923
Your question implies a preference for lapply, but if you don't mind using dplyr there is an elegant solution.
library(dplyr)
df_list <- df %>%
  group_by(temp, clouds) %>%
  group_split()
# df_list
df_list[[1]]
#> # A tibble: 3 x 3
#>    rain temp  clouds
#>   <dbl> <fct> <fct>
#> 1     0 Cold  Lots
#> 2    25 Cold  Lots
#> 3     4 Cold  Lots
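Note that group_split() returns an unnamed list, so the groups can only be reached by position. If names are wanted, one possible approach (a sketch, not the only way) is to take the keys from group_keys(), which come back in the same order as the split pieces, and paste them together. The "temp.clouds" naming scheme and the variable names grouped and keys are my own choice for illustration.

```r
library(dplyr)

rain <- c(2, 0, 4, 25, 3, 9, 4, 0, 4, 0, 8, 35)
temp <- as.factor(c("Warm","Cold","Hot","Cold","Warm","Cold","Cold","Warm","Warm","Hot","Cold","Cold"))
clouds <- as.factor(c("Some","Lots","None","Lots","None","None","Lots","Some","Some","Lots","None","Some"))
df <- data.frame(rain, temp, clouds)

# Group once, then reuse the grouped data for both the pieces and their keys
grouped <- df %>% group_by(temp, clouds)
df_list <- group_split(grouped)
keys <- group_keys(grouped)

# Build names such as "Cold.Lots" from the key columns (illustrative scheme)
names(df_list) <- paste(keys$temp, keys$clouds, sep = ".")

# A group can now be looked up by name instead of by position
df_list[["Cold.Lots"]]
```

With the default grouping behaviour, combinations that never occur in the data are not included, so this list has seven elements for the toy dataset rather than nine.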
Your data
rain <- c(2, 0, 4, 25, 3, 9, 4, 0, 4, 0, 8, 35)
temp <- as.factor(c("Warm","Cold","Hot","Cold","Warm","Cold","Cold","Warm","Warm","Hot","Cold", "Cold"))
clouds <- as.factor(c("Some","Lots","None","Lots","None","None","Lots","Some","Some","Lots","None", "Some"))
df <- data.frame(rain, temp, clouds)
Upvotes: 1