Reputation: 1
I need to remove all observations where at least one of the variables Loading Date, Year Built, Vessel Type, and Cargo Size contains a missing value.
anyNA(CW_data$`Loading Date`) #result is FALSE, which means there aren't missing values
anyNA(CW_data$`Year Built`) #result is TRUE, there are missing values
anyNA(CW_data$`Vessel Type`)#result is TRUE, there are missing values
anyNA(CW_data$`Cargo Size`)#result is TRUE, there are missing values
CW_data_noNA <- filter(CW_data, is.na('Year Built')==FALSE |
is.na('Vessel Type'==FALSE)|
is.na('Cargo Size')==FALSE |
is.na('Loading Date') == FALSE)
I tried the code above, but the resulting dataset is identical to the original one. Could someone explain what I am doing wrong? Many thanks, LMC
Upvotes: 0
Views: 1914
Reputation: 163
If you want to use filter, you can do it like this:
CW_data_noNA <- CW_data %>%
filter(!is.na(`Year Built`) & !is.na(`Vessel Type`) &
!is.na(`Cargo Size`) & !is.na(`Loading Date`)
)
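When column names contain spaces or other unusual characters, you need to wrap them in backticks (`). In your code, 'Year Built' in single quotes is a plain string, so is.na tests the string itself rather than the column, and no rows are removed. In general, I think it's better to avoid whitespace in column names.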
Regarding the code you provided, is.na already returns a logical vector, so you can write !is.na(x) instead of is.na(x) == FALSE. The pipe %>% also makes the code cleaner.
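To illustrate the difference between quotes and backticks, here is a minimal sketch. The data frame toy and its values are made up for the example; only the column names come from the question.

```r
library(dplyr)

# Hypothetical sample data: the values are invented for illustration.
toy <- data.frame(
  "Year Built" = c(2001, NA, 2010),
  "Cargo Size" = c(50000, 60000, NA),
  check.names = FALSE  # keep the spaces in the column names
)

# Quoted name: is.na() sees a string literal, so the condition is a single
# TRUE that applies to every row and nothing is dropped.
quoted <- filter(toy, !is.na('Year Built'))

# Backticked name: is.na() sees the column, so the row with a missing
# Year Built is removed.
backticked <- filter(toy, !is.na(`Year Built`))

nrow(quoted)
nrow(backticked)
```

Comparing the two row counts shows that only the backticked version actually filters.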
Next time, try providing a reproducible example with your data or some sample data for better understanding.
Upvotes: 0
Reputation: 11981
You can use filter_at:
CW_data_noNA <- filter_at(CW_data, vars('Year Built', 'Vessel Type', 'Cargo Size', 'Loading Date'),
all_vars(!is.na(.)))
If you want to use filter instead, you can do this:
CW_data_noNA <- CW_data %>%
filter(!is.na(`Year Built`), !is.na(`Vessel Type`),
!is.na(`Cargo Size`), !is.na(`Loading Date`))
This keeps all rows where none of the four columns is NA.
Inside filter, conditions separated by commas are always combined with &.
If you instead want to keep those rows where not all four columns are NA simultaneously, use:
CW_data %>%
filter(!is.na(`Year Built`) | !is.na(`Vessel Type`) |
!is.na(`Cargo Size`) | !is.na(`Loading Date`))
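The two cases discussed in this answer (every column non-missing versus at least one column non-missing) can also be written with if_all and if_any, which I believe require dplyr 1.0.4 or later. This is only a sketch: toy and cols are hypothetical names, and the values are invented.

```r
library(dplyr)

# Hypothetical sample data: the values are invented for illustration.
toy <- data.frame(
  "Year Built"  = c(2001, NA, NA),
  "Vessel Type" = c("Tanker", "Bulker", NA),
  check.names = FALSE  # keep the spaces in the column names
)
cols <- c("Year Built", "Vessel Type")

# Keep rows where every listed column is non-missing (the "&" case).
all_present <- filter(toy, if_all(all_of(cols), ~ !is.na(.x)))

# Keep rows where at least one listed column is non-missing (the "|" case).
any_present <- filter(toy, if_any(all_of(cols), ~ !is.na(.x)))

# Base R equivalent of the first case, without dplyr.
base_all <- toy[complete.cases(toy[cols]), ]
```

Listing the column names once in cols avoids repeating !is.na for each column and sidesteps the quoting question, because all_of takes plain strings.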
Upvotes: 2