Reputation: 71

Return name of empty columns from a list

I have a named list of data frames that all contain the same columns, but in some of these data frames some of the columns are empty (all NA). What I'm hoping to return is the name of each data frame in the list, along with the name(s) of its empty column(s).

The reprex below mirrors the process I am using on the full problem:

library(tidyverse)

data("diamonds") 

data1 <- diamonds 

data1$color <- NA

data1$price <- NA

data2 <- diamonds

data2$carat <- NA

data1$Type <- "data1"

data2$Type <- "data2"

data1%>%
  bind_rows(data2) -> dataFull

dataSplit <- split(dataFull, f = dataFull$Type)

for(i in dataSplit){
  
  which(sapply(dataSplit[[i]], function(x) all(is.na(x))))
  
}

My hope is to return something like

data1: price, color

data2: carat

I've tried the very basic for loop included above; for loops are admittedly not my strong suit.

Upvotes: 0

Answers (3)

akrun

Reputation: 887691

Using select() with where():

library(dplyr)
library(purrr)
map(dataSplit, ~ .x %>% 
      select(where(~ all(is.na(.x)))) %>%
      names)
$data1
[1] "color" "price"

$data2
[1] "carat"

Or in base R:

lapply(dataSplit, \(x) names(x)[!colSums(!is.na(x))])
$data1
[1] "color" "price"

$data2
[1] "carat"
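
The base R one-liner is compact, so here is a rough sketch that unpacks it step by step. The toy data frame `x` and the intermediate names (`non_missing`, `n_present`, `empty`) are made up for illustration and are not part of the question's data.

```r
# Hypothetical toy data: column b is entirely NA
x <- data.frame(a = c(1, NA), b = c(NA, NA), c = c("u", "v"))

non_missing <- !is.na(x)            # logical matrix, TRUE where a value is present
n_present   <- colSums(non_missing) # number of non-missing values per column
empty       <- n_present == 0       # equivalent to !colSums(!is.na(x))

names(x)[empty]
#> [1] "b"
```

Negating a count with `!` turns 0 into TRUE and anything else into FALSE, which is why the short form works without the explicit `== 0`.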

Upvotes: 0

br00t

Reputation: 1614

library(tidyverse)

data("diamonds") 

data1 <- diamonds 

data1$color <- NA

data1$price <- NA

data2 <- diamonds

data2$carat <- NA

data1$Type <- "data1"

data2$Type <- "data2"

data1%>%
  bind_rows(data2) -> dataFull

dataSplit <- split(dataFull, f = dataFull$Type)

lapply(dataSplit, function(x) {
  cn <- colnames(x)
  isempty <- apply(x, 2, function(col) is.na(col) |> all())
  cn[ isempty ]
})

$data1
[1] "color" "price"

$data2
[1] "carat"

Upvotes: 2

Allan Cameron

Reputation: 174393

Your sapply idea was right, but you need to use its output to subset the names of each data frame. Also, since you are loading the tidyverse, you may as well use map instead of a loop for brevity:

map(dataSplit, ~ names(.x)[sapply(.x, \(x) all(is.na(x)))])
#> $data1
#> [1] "color" "price"
#> 
#> $data2
#> [1] "carat"
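
If you would rather keep a loop and print the "name: columns" format from the question, one possible sketch is below. It assumes the fix is to iterate over the list's names, since `for (i in dataSplit)` hands you each data frame rather than an index, so `dataSplit[[i]]` is not a valid lookup. The small `dataSplit` here is a hypothetical stand-in for the diamonds-based list, and `out` is just an illustrative variable name.

```r
# Hypothetical stand-in for the question's dataSplit
dataSplit <- list(
  data1 = data.frame(carat = c(0.2, 0.3), color = NA, price = NA),
  data2 = data.frame(carat = NA, color = c("E", "I"), price = c(326L, 334L))
)

out <- character(0)
# Loop over the names so that dataSplit[[nm]] selects one data frame
for (nm in names(dataSplit)) {
  df <- dataSplit[[nm]]
  is_empty <- sapply(df, function(x) all(is.na(x)))  # TRUE for all-NA columns
  out[nm] <- paste0(nm, ": ", paste(names(df)[is_empty], collapse = ", "))
}

cat(out, sep = "\n")
#> data1: color, price
#> data2: carat
```

The columns come out in data frame order (color before price) rather than the order written in the question.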

Upvotes: 4
