Reputation: 497
I have a data frame like this:
df <- data.frame(x = c(0,0,1,1,2,2,3,3,4,4,5,5), y = c(0,1,1,1,0,0,0,1,1,1,0,0))
How can I split the data into two data frames: one containing the rows of every x group in which both y values are equal to 1, and another containing all remaining rows?
df
x y
1 0 0
2 0 1
3 1 1 # x = 1: all y = 1
4 1 1 #
5 2 0
6 2 0
7 3 0
8 3 1
9 4 1 # x = 4: all y = 1
10 4 1 #
11 5 0
12 5 0
The two resulting data frames would then look like:
df1 <- data.frame(x = c(1,1,4,4), y = c(1,1,1,1))
df1
x y
1 1 1
2 1 1
3 4 1
4 4 1
df2 <- data.frame(x = c(0,0,2,2,3,3,5,5), y = c(0,1,0,0,0,1,0,0))
df2
x y
1 0 0
2 0 1
3 2 0
4 2 0
5 3 0
6 3 1
7 5 0
8 5 0
Upvotes: 5
Views: 331
Reputation: 102700
One base R split variant.
If your y column consists of 0 and 1 only, then you can run the code below (thanks @Henrik):
> split(df, ~ave(y, x) == 1)
$`FALSE`
x y
1 0 0
2 0 1
5 2 0
6 2 0
7 3 0
8 3 1
11 5 0
12 5 0
$`TRUE`
x y
3 1 1
4 1 1
9 4 1
10 4 1
Otherwise, for general values of y, we can try:
> split(df, ~ ave(y == 1, x) == 1)
$`FALSE`
x y
1 0 0
2 0 1
5 2 0
6 2 0
7 3 0
8 3 1
11 5 0
12 5 0
$`TRUE`
x y
3 1 1
4 1 1
9 4 1
10 4 1
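To see why the second form is the safer one, here is a minimal sketch on a made-up data frame (`df_nb` and its values are hypothetical, not from the question) in which one group has y values 0 and 2. The group mean of y is 1 there, so the first test flags it by mistake, while the share of rows with `y == 1` does not.

```r
# Hypothetical data: group x = 0 has y values 0 and 2, whose mean is 1
df_nb <- data.frame(x = c(0, 0, 1, 1), y = c(0, 2, 1, 1))

# Group mean of y: both groups average to 1, so x = 0 is wrongly flagged
by_mean <- ave(df_nb$y, df_nb$x) == 1

# Share of rows with y == 1 per group: only x = 1 reaches a share of 1
by_share <- ave(df_nb$y == 1, df_nb$x) == 1

print(by_mean)
print(by_share)
```

Only `by_share` separates the two groups, which is what the split on `ave(y == 1, x) == 1` relies on.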
Upvotes: 3
Reputation: 16876
Here is a data.table option:
library(data.table)
library(dplyr)
setDT(df)[, group := all(y == 1), by = x] %>%
split(., by = "group", keep.by = FALSE)
Output
$`FALSE`
x y
1: 0 0
2: 0 1
3: 2 0
4: 2 0
5: 3 0
6: 3 1
7: 5 0
8: 5 0
$`TRUE`
x y
1: 1 1
2: 1 1
3: 4 1
4: 4 1
Upvotes: 1
Reputation: 25528
Another possible solution, based on dplyr:
library(dplyr)
df<-data.frame(x=c(0,0,1,1,2,2,3,3,4,4,5,5),y=c(0,1,1,1,0,0,0,1,1,1,0,0))
df1 <- df %>%
group_by(x) %>%
filter(all(y == 1)) %>%
ungroup
df2 <- df %>%
anti_join(df1, by = c("x", "y"))
list(df1, df2 %>% as_tibble)
#> [[1]]
#> # A tibble: 4 × 2
#> x y
#> <dbl> <dbl>
#> 1 1 1
#> 2 1 1
#> 3 4 1
#> 4 4 1
#>
#> [[2]]
#> # A tibble: 8 × 2
#> x y
#> <dbl> <dbl>
#> 1 0 0
#> 2 0 1
#> 3 2 0
#> 4 2 0
#> 5 3 0
#> 6 3 1
#> 7 5 0
#> 8 5 0
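The same filter-then-complement idea can be sketched in base R, in case dplyr is not available. The names `flag`, `keep` and `in_keep` are introduced here for illustration only; the sketch assumes x is numeric, as in the question.

```r
df <- data.frame(x = c(0, 0, 1, 1, 2, 2, 3, 3, 4, 4, 5, 5),
                 y = c(0, 1, 1, 1, 0, 0, 0, 1, 1, 1, 0, 0))

# One flag per x value: TRUE when every y in that group equals 1
flag <- tapply(df$y == 1, df$x, all)

# x values whose groups pass the test (names are characters, so convert back)
keep <- as.numeric(names(flag)[flag])

# Rows belonging to a passing group go to df1, all others to df2
in_keep <- df$x %in% keep
df1 <- df[in_keep, ]
df2 <- df[!in_keep, ]
```

Because the complement is taken on x alone, no join is needed to build `df2`.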
Upvotes: 2
Reputation: 79338
In base R:
split(df, ~ave(y == 1, x, FUN = all))
$`FALSE`
x y
1 0 0
2 0 1
5 2 0
6 2 0
7 3 0
8 3 1
11 5 0
12 5 0
$`TRUE`
x y
3 1 1
4 1 1
9 4 1
10 4 1
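If the formula interface of `split()` is not available (as far as I know it was added in R 4.1.0), a sketch of the same approach with an explicit grouping vector could look like this. The names `all_one` and `parts` are made up for the example.

```r
df <- data.frame(x = c(0, 0, 1, 1, 2, 2, 3, 3, 4, 4, 5, 5),
                 y = c(0, 1, 1, 1, 0, 0, 0, 1, 1, 1, 0, 0))

# TRUE for every row whose x group contains only y == 1
all_one <- ave(df$y == 1, df$x, FUN = all)

# split() turns the logical vector into the list names "FALSE" and "TRUE"
parts <- split(df, all_one)
df1 <- parts[["TRUE"]]
df2 <- parts[["FALSE"]]

# Optional: renumber rows so they match the desired output in the question
rownames(df1) <- NULL
rownames(df2) <- NULL
```

Note that `parts[["TRUE"]]` would be `NULL` if no group satisfied the condition, so real code may want to check for that.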
In tidyverse:
library(tidyverse)
df %>%
group_by(x) %>%
mutate(s = all(y==1))%>%
ungroup() %>%
group_split(s, .keep = FALSE)
[[1]]
# A tibble: 8 x 2
x y
<dbl> <dbl>
1 0 0
2 0 1
3 2 0
4 2 0
5 3 0
6 3 1
7 5 0
8 5 0
[[2]]
# A tibble: 4 x 2
x y
<dbl> <dbl>
1 1 1
2 1 1
3 4 1
4 4 1
Upvotes: 7
Reputation: 2262
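Flag each x group with dplyr, then filter on the flag:
library(dplyr)
# ungroup() so that df1 and df2 are not left grouped by x
df <- df |> group_by(x) |> mutate(all_y_1 = all(y == 1)) |> ungroup()
df1 <- df |> filter(all_y_1) |> select(-all_y_1)
df2 <- df |> filter(!all_y_1) |> select(-all_y_1)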
Upvotes: 1