Reputation: 153
I have a large data frame that has NAs at different points. I need to remove the few rows that consist mostly of NA values.
I applied filter using is.na() conditions to remove them. However, they are not yielding fruitful results.
S.No MediaName KeyPress KPIndex Type Secs X Y
001 Dat NA 1 Fixation 18 117 89
002 New NA NA Saccade 33 NA NA
003 Dat NA 2 Fixation 23 117 NA
df <- df%>%filter(df, !is.na(KeyPress) & !is.na(KPIndex) & !is.na(X) & !is.na(Y))
I would like to delete rows based on these conditions using dplyr. I have more rows like this in a large data frame. What is wrong with my code?
Upvotes: 2
Views: 996
Reputation: 1057
I'll put my bid in for the simplest. tidyr's drop_na() (tidyr is part of the tidyverse alongside dplyr) will drop all rows that contain an NA.
data %>% drop_na()
You can specify which columns it should check (or exclude columns with "-"), if that is relevant to your case.
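As a rough sketch of the column selection mentioned above: the data frame below is a cut-down copy of the sample from the question, and the choice of KPIndex and X as the columns to check is only an assumption for illustration. Note that a bare drop_na() on this sample would return zero rows, because KeyPress is NA everywhere.

```r
library(dplyr)
library(tidyr)

# Cut-down copy of the sample data from the question
data <- data.frame(
  S.No = 1:3,
  MediaName = c("Dat", "New", "Dat"),
  KeyPress = c(NA, NA, NA),
  KPIndex = c(1L, NA, 2L),
  X = c(117L, NA, 117L),
  Y = c(89L, NA, NA)
)

# Check only KPIndex and X: a row is dropped if either of them is NA
res_cols <- data %>% drop_na(KPIndex, X)

# Check every column except KeyPress
res_minus <- data %>% drop_na(-KeyPress)
```

Here res_cols keeps rows 1 and 3, while res_minus keeps only row 1, since row 3 still has an NA in Y.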
Upvotes: 0
Reputation: 886938
If there is more than one column, use filter_at:
library(dplyr)
df %>%
filter_at(vars(KeyPress, KPIndex, X, Y), any_vars(!is.na(.)))
Or with rowSums from base R:
nm1 <- c("KeyPress", "KPIndex", "X", "Y")
df[rowSums(!is.na(df[nm1]))!= 0,]
df <- structure(list(S.No = 1:3, MediaName = c("Dat", "New", "Dat"),
KeyPress = c(NA, NA, NA), KPIndex = c(1L, NA, 2L), Type = c("Fixation",
"Saccade", "Fixation"), Secs = c(18L, 33L, 23L), X = c(117L,
NA, 117L), Y = c(89L, NA, NA)), class = "data.frame", row.names = c(NA,
-3L))
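For what it's worth, filter_at is marked as superseded in dplyr 1.0.0 and later. Assuming dplyr 1.0.4 or newer is installed, the same "keep rows where at least one of these columns is not NA" condition can probably be written with if_any. This is only a sketch of an equivalent form, not part of the original answer; the data frame is rebuilt here so the snippet runs on its own.

```r
library(dplyr)

# Same sample data as above, rebuilt so this snippet is self-contained
df <- data.frame(
  S.No = 1:3,
  MediaName = c("Dat", "New", "Dat"),
  KeyPress = c(NA, NA, NA),
  KPIndex = c(1L, NA, 2L),
  Type = c("Fixation", "Saccade", "Fixation"),
  Secs = c(18L, 33L, 23L),
  X = c(117L, NA, 117L),
  Y = c(89L, NA, NA)
)

# Keep a row if ANY of the four columns is non-NA
res_if_any <- df %>%
  filter(if_any(c(KeyPress, KPIndex, X, Y), ~ !is.na(.x)))

# Base R version from this answer, for comparison
nm1 <- c("KeyPress", "KPIndex", "X", "Y")
res_base <- df[rowSums(!is.na(df[nm1])) != 0, ]
```

Both versions drop only row 2, where all four columns are NA.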
Upvotes: 2
Reputation: 28826
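You should use | instead of &. With &, a row is kept only if all four columns are non-NA, and since KeyPress is NA in every row of the sample, that removes everything. With |, a row is kept if at least one of the columns is non-NA: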
library(dplyr)
df1 %>%
filter(!is.na(KeyPress) | !is.na(KPIndex) | !is.na(X) | !is.na(Y))
# S.No MediaName KeyPress KPIndex Type Secs X Y
# 1 1 Dat NA 1 Fixation 18 117 89
# 2 3 Dat NA 2 Fixation 23 117 NA
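If the goal is really "drop rows that have too many NAs" rather than "drop rows where everything is NA", counting the NAs per row may be closer to what you want. A minimal sketch, assuming a cutoff of at most 2 NAs across the four columns; the max_na value and the cols vector are made up for illustration and should be adjusted to your data:

```r
library(dplyr)

# Sample data from the question
df1 <- data.frame(
  S.No = 1:3,
  MediaName = c("Dat", "New", "Dat"),
  KeyPress = c(NA, NA, NA),
  KPIndex = c(1L, NA, 2L),
  Type = c("Fixation", "Saccade", "Fixation"),
  Secs = c(18L, 33L, 23L),
  X = c(117L, NA, 117L),
  Y = c(89L, NA, NA)
)

cols <- c("KeyPress", "KPIndex", "X", "Y")
max_na <- 2  # hypothetical threshold, not from the question

# Number of NAs per row across the chosen columns
na_count <- rowSums(is.na(df1[cols]))

# Keep rows with at most max_na missing values
res_threshold <- df1 %>% filter(na_count <= max_na)
```

On the sample, the per-row NA counts are 1, 4 and 2, so row 2 is removed and rows 1 and 3 stay.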
Upvotes: 2