Reputation: 827
I have a data frame with 3 cols.
ID1 <- c(1,1,2,2,3,4)
ID2 <- c(11,NA,12,NA,NA,NA)
Val <- c("A","B","C","D","E","F")
DF <- data.frame(ID1,ID2,Val, stringsAsFactors=FALSE)
Now I need to extract the unique rows that have ID2 as NA, i.e. only the ID1 values for which no non-NA ID2 exists. In this case, the desired output is a data frame with two rows, ID1 = 3 and 4. I tried the subset command below, which returns all four rows with NA. I am looking for a way to get the desired output.
DF2 <- subset(DF , is.na(ID2))
Upvotes: 0
Views: 1040
Reputation: 4940
If by unique rows you mean unique values of ID1, then this code does the trick:
DF[which(!duplicated(DF$ID1) & is.na(DF$ID2)),]
  ID1 ID2 Val
5   3  NA   E
6   4  NA   F
If you prefer using subset, then this code gives the same output:
subset(DF , !duplicated(ID1) & is.na(ID2))
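One caveat: !duplicated(ID1) only flags the first row of each ID1 group, so the result depends on row order. It works here because the non-NA row happens to come first for ID1 = 1 and 2. Below is a minimal order-independent sketch in base R, assuming the goal is to keep only ID1 values that never appear with a non-NA ID2. The names has_id2 and DF_na_only are illustrative, not from the question.

```r
ID1 <- c(1, 1, 2, 2, 3, 4)
ID2 <- c(11, NA, 12, NA, NA, NA)
Val <- c("A", "B", "C", "D", "E", "F")
DF <- data.frame(ID1, ID2, Val, stringsAsFactors = FALSE)

# ID1 values that have at least one non-NA ID2 somewhere in the data
has_id2 <- unique(DF$ID1[!is.na(DF$ID2)])

# Keep rows whose ID1 never appears with a non-NA ID2,
# then drop exact duplicate rows (assumption: "unique" means distinct rows)
DF_na_only <- unique(DF[!(DF$ID1 %in% has_id2), ])
DF_na_only
```

On the example data this should keep only the rows for ID1 = 3 and 4, regardless of how the rows are sorted.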
Upvotes: 1
Reputation: 1850
Define a function that checks whether an ID1 group has only NAs in ID2 and, if so, returns the unique rows of that group.
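library(dplyr)

select_na <- function(df_sub) {
  if (any(!is.na(df_sub$ID2))) {
    # The group has at least one non-NA ID2: return zero rows
    return(df_sub[0, ])
  } else {
    # Every ID2 in the group is NA: return its distinct rows
    return(unique(df_sub))
  }
}

DF %>%
  group_by(ID1) %>%
  do(select_na(.))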
gives exactly what you want.
Upvotes: 0
Reputation: 6768
Try:
library(dplyr)
DF %>%
  group_by(ID1) %>%
  filter(n() == 1 & is.na(ID2))
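The n() == 1 condition works for this data, but it would drop an ID1 whose ID2 values are all NA if that ID1 has more than one row. A hedged variant follows, assuming every all-NA group should be kept and exact duplicate rows collapsed. The name result is illustrative.

```r
library(dplyr)

ID1 <- c(1, 1, 2, 2, 3, 4)
ID2 <- c(11, NA, 12, NA, NA, NA)
Val <- c("A", "B", "C", "D", "E", "F")
DF <- data.frame(ID1, ID2, Val, stringsAsFactors = FALSE)

result <- DF %>%
  group_by(ID1) %>%
  filter(all(is.na(ID2))) %>%  # keep groups where every ID2 is NA
  ungroup() %>%
  distinct()                   # drop exact duplicate rows
result
```

On the example data this should return the same two rows as the filter above, for ID1 = 3 and 4.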
Upvotes: 1