Reputation: 1829
I have the following data frame lets call it df
, with the following observations:
id type company
1 NA NA
2 NA ADM
3 North Alex
4 South NA
NA North BDA
6 NA CA
I want to retain only the records which do not have NA in column "type" and "company".
id type company
3 North Alex
NA North BDA
I tried:
df_non_na <- df[!is.na(df$company) || !is.na(df$type), ]
But this did not work.
Thanks in advance
Upvotes: 23
Views: 78912
Reputation: 155
The example with dplyr
(version >= 1.0.4) and if_all()
, since filter_at()
is superseded
id <- c(1, 2, 3, 4, NA, 6)
type <- c(NA, NA, "North", "South", "North", NA)
company <- c(NA, "ADM", "Alex", NA, "BDA", "CA")
df <- tibble(id, type, company)
library(dplyr)
df_non_na <- df %>% filter(if_all(c(type,company), ~ !is.na(.)))
Upvotes: 10
Reputation: 410
You'll want to use drop_na()
library(dplyr)
new_df <- df %>%
drop_na(type, company)
Upvotes: 17
Reputation: 781
Using dplyr, you can also use the filter_at
function
library(dplyr)
df_non_na <- df %>% filter_at(vars(type,company),all_vars(!is.na(.)))
all_vars(!is.na(.))
means that all the variables listed need to be not NA.
If you want to keep rows that have at least one value, you could do:
df_non_na <- df %>% filter_at(vars(type,company),any_vars(!is.na(.)))
Upvotes: 45
Reputation: 505
You need AND operator (&), not OR (|) I also strongly suggest the tidyverse approach by using the dplyr function filter() and the pipe operator %>%, from dplyr as well:
library(dplyr)
df_not_na <- df %>% filter(!is.na(company) & !is.na(type))
Upvotes: 7
Reputation: 887851
We can get the logical index for both columns, use &
and subset the rows.
df1[!is.na(df1$type) & !is.na(df1$company),]
# id type company
#3 3 North Alex
#5 NA North BDA
Or use rowSums
on the logical matrix (is.na(df1[-1])
) to subset.
df1[!rowSums(is.na(df1[-1])),]
Upvotes: 22