Reputation: 939
I would like from this tibble :
mydf <- tribble(
~siren_us, ~eff,
"A", 3,
"A", 2,
"B", 1,
"B", NA,
"C", 3,
"C", 5,
"C", 4,
"D", NA,
"E", 1,
"E", NA
)
get this one :
more precisely, I want to retrieve the rows from mydf
for which siren_us
exists at least twice and which have the eff
column equal to NA.
In my example, I don't want the row "D" in result
result <- tribble(
~siren_us, ~eff,
"B", NA,
"E", NA
)
I think it could be madewith dplyr instructions group_by
and filter
?
Many thanks in advance
Upvotes: 1
Views: 81
Reputation: 26218
Do this to get results as desired. Use &
within the filter statement.
dplyr method
mydf %>% group_by(siren_us) %>%
filter(is.na(eff) & n() >1)
# A tibble: 2 x 2
# Groups: siren_us [2]
siren_us eff
<chr> <dbl>
1 B NA
2 E NA
base R menthod
mydf[is.na(mydf$eff) & as.numeric(ave(mydf$siren_us, mydf$siren_us, FUN = length)) > 1,]
# A tibble: 2 x 2
siren_us eff
<chr> <dbl>
1 B NA
2 E NA
If you want to get all results with any NA, through base R, use this
mydf[ave(mydf$eff, mydf$siren_us, FUN = function(x){sum(is.na(x))}) >= 1 &
ave(mydf$siren_us, mydf$siren_us, FUN = length) > 1,]
# A tibble: 4 x 2
siren_us eff
<chr> <dbl>
1 B 1
2 B NA
3 E 1
4 E NA
Upvotes: 3
Reputation: 4487
Using group_by
& filter
from dplyr
package
library(dplyr)
mydf %>%
group_by(siren_us) %>%
filter(
# group that have at least two records
n() >= 2 &
# and at least one eff is.na
any(is.na(eff)))
Output
# A tibble: 4 x 2
# Groups: siren_us [2]
siren_us eff
<chr> <dbl>
1 B 1
2 B NA
3 E 1
4 E NA
Upvotes: 2