Reputation: 101
If I have a df:
letter body_part
a head
b head
c NA
d NA
e left_foot
And I want to split it into 2 dfs: one with only the rows where body_part is "head", and the other with everything else, i.e.:
list <- split(df, df$body_part == 'head')
Can I do that without dropping the NA rows? (I know I can do it if I fill the NAs with a string, but is there a way that avoids that step?)
Upvotes: 2
Views: 1050
Reputation: 47350
You can convert the f argument of split() to a factor without excluding the NA values.
df <- read.table(header = TRUE, stringsAsFactors = FALSE, text = "
letter body_part
a head
b head
c NA
d NA
e left_foot")
split(df, factor(df$body_part, exclude = NULL))
#> $head
#> letter body_part
#> 1 a head
#> 2 b head
#>
#> $left_foot
#> letter body_part
#> 5 e left_foot
#>
#> $<NA>
#> letter body_part
#> 3 c <NA>
#> 4 d <NA>
split(df, factor(df$body_part, exclude = NULL) == 'head')
#> $`FALSE`
#> letter body_part
#> 3 c <NA>
#> 4 d <NA>
#> 5 e left_foot
#>
#> $`TRUE`
#> letter body_part
#> 1 a head
#> 2 b head
Created on 2019-10-14 by the reprex package (v0.3.0)
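To see why the factor conversion matters, here is a minimal sketch on a bare vector instead of the data frame. The variable names (default_levels, kept_levels, dropped, kept) are only for illustration and are not from the answer above.

```r
body_part <- c("head", "head", NA, NA, "left_foot")

# By default factor() excludes NA, so NA never becomes a level.
default_levels <- levels(factor(body_part))

# With exclude = NULL, NA is kept as a level of its own.
kept_levels <- levels(factor(body_part, exclude = NULL))

# split() groups by factor level, so elements with no level are dropped.
dropped <- split(seq_along(body_part), body_part == "head")
kept <- split(seq_along(body_part), factor(body_part, exclude = NULL) == "head")

sum(lengths(dropped))  # 3: the two NA positions are missing
sum(lengths(kept))     # 5: every position is assigned to a group
```

The same reasoning should carry over to split() on the data frame, since it groups rows by the levels of f.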
Upvotes: 1
Reputation: 93938
From ?`%in%`:
That ‘%in%’ never returns ‘NA’ makes it particularly useful in ‘if’ conditions.
# s_col is added only to show what the `==` comparison returns, for contrast
> df$s_col <- df$body_part == 'head'
> split(df, df$body_part %in% 'head')
$`FALSE`
letter body_part s_col
3 c <NA> NA
4 d <NA> NA
5 e left_foot FALSE
$`TRUE`
letter body_part s_col
1 a head TRUE
2 b head TRUE
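As a smaller illustration of the difference, here is a sketch on a three-element vector. The names x, eq_result and in_result are made up for this example.

```r
x <- c("head", NA, "left_foot")

eq_result <- x == "head"    # TRUE NA FALSE: comparing NA yields NA
in_result <- x %in% "head"  # TRUE FALSE FALSE: NA simply does not match
```

Because in_result contains no NA, every element falls into either the TRUE or the FALSE group when it is passed to split().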
Upvotes: 5
Reputation: 342
> # compare, then turn the NA results into FALSE so split() keeps those rows
> ind <- df$body_part == 'head'
> ind[is.na(ind)] <- FALSE
> split(df, ind)
$`FALSE`
# A tibble: 3 x 2
letter body_part
<chr> <chr>
1 c <NA>
2 d <NA>
3 e left_foot
$`TRUE`
# A tibble: 2 x 2
letter body_part
<chr> <chr>
1 a head
2 b head
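If this comes up often, the same steps could be wrapped in a small function. This is only a sketch: split_by_match is a hypothetical helper, not something from base R or from the answers here, and the data frame is rebuilt by hand so the example runs on its own.

```r
# Hypothetical helper: split `data` into the rows where `column` equals
# `value` and everything else, keeping rows where the column is NA.
split_by_match <- function(data, column, value) {
  ind <- data[[column]] == value
  ind[is.na(ind)] <- FALSE  # treat NA comparisons as "no match"
  split(data, ind)
}

df <- data.frame(
  letter = c("a", "b", "c", "d", "e"),
  body_part = c("head", "head", NA, NA, "left_foot"),
  stringsAsFactors = FALSE
)

parts <- split_by_match(df, "body_part", "head")
nrow(parts[["FALSE"]])  # 3
nrow(parts[["TRUE"]])   # 2
```

Note that if nothing matches `value`, the result has no TRUE element at all, so code that indexes parts[["TRUE"]] would need to handle NULL.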
Upvotes: 0