Reputation: 11

How to remove columns full of only NA values

Here is an example of the output when I run the is.na() function on my data frame.

   start_lat start_lng end_lat end_lng member_casual ride_length day_of_week    X  X.1  X.2
[1,]     FALSE     FALSE   FALSE   FALSE         FALSE       FALSE       FALSE TRUE TRUE TRUE
[2,]     FALSE     FALSE   FALSE   FALSE         FALSE       FALSE       FALSE TRUE TRUE TRUE
[3,]     FALSE     FALSE   FALSE   FALSE         FALSE       FALSE       FALSE TRUE TRUE TRUE

The "X", "X.1", and "X.2" columns were added to my data frame and I don't know where they came from. I tried the na.omit() function, but the columns are not recognized; in other words, they are not valid names. Can someone please help me remove these columns from my data frame?

Upvotes: 1

Answers (2)

Captain Hat

Reputation: 3257

A tidyverse solution

Using dplyr::select()

Make some dummy data:

require(dplyr)

myData <- tibble(a = c(1,2,3,4), b = c("a", "b", "c", "d"),
                 c = c(NA, NA, NA, NA), d = c(NA, "not_na", "not_na", NA))

myData
#> # A tibble: 4 x 4
#>       a b     c     d     
#>   <dbl> <chr> <lgl> <chr> 
#> 1     1 a     NA    <NA>  
#> 2     2 b     NA    not_na
#> 3     3 c     NA    not_na
#> 4     4 d     NA    <NA>

Select only the columns that are not all NA:


myNewData <- select(myData, where(function(x) !all(is.na(x))))

myNewData
#> # A tibble: 4 x 3
#>       a b     d     
#>   <dbl> <chr> <chr> 
#> 1     1 a     <NA>  
#> 2     2 b     not_na
#> 3     3 c     not_na
#> 4     4 d     <NA>

Created on 2022-02-16 by the reprex package (v2.0.1)

Upvotes: 1

Gregor Thomas

Reputation: 146119

## figure out which columns are all NA values
all_na_cols = sapply(your_data, \(x) all(is.na(x)))

## drop them
your_data = your_data[!all_na_cols]

Running na.omit() on a data frame drops every row that has one or more NA values in it, so it is not what you want here: with the all-NA columns still in place, every row contains an NA and would be removed.
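To make the row-versus-column difference concrete, here is a minimal sketch on a made-up data frame. The name `df` and its columns are hypothetical stand-ins, not the asker's actual data.

```r
# Hypothetical data frame: column X is entirely NA,
# like the stray columns in the question
df <- data.frame(a = 1:3, b = c("u", "v", "w"), X = NA)

# na.omit() works row-wise: every row has an NA (in X),
# so no rows survive
rows_left <- nrow(na.omit(df))
rows_left

# Dropping the all-NA columns instead keeps every row
all_na_cols <- sapply(df, function(x) all(is.na(x)))
cleaned <- df[!all_na_cols]
dim(cleaned)
```

The column-wise version leaves the other columns and all rows untouched, which is the behaviour the question asks for.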

> The "X", "X.1", and "X.2" columns are added to my dataframe and I don't know where they came from.

That would worry me a lot. If I were you, I'd go back through the script and run it one line at a time until I found where those columns came from, and then I'd fix the source of the problem there rather than putting a bandage on it here.
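One place worth checking first is the import step. This is only a guess, since the question does not say how the data was loaded, but a CSV whose lines end in extra commas produces exactly this pattern with read.csv(). The sketch below uses made-up CSV text; `csv_text` and the values in it are hypothetical.

```r
# Hypothetical CSV content with three trailing empty fields on every line
csv_text <- "start_lat,ride_length,,,\n41.9,12,,,\n41.8,7,,,"

df <- read.csv(text = csv_text)

# The unnamed header fields get auto-generated names
names(df)

# ...and the empty cells under them are read as NA
stray_all_na <- sapply(df[c("X", "X.1", "X.2")], function(x) all(is.na(x)))
stray_all_na
```

If names(df) already shows the stray columns right after the import, the cleaner fix is probably in the file or the import call rather than later in the script.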

Upvotes: 2
