user4394417

R - programmatically detect NA columns and return string

I have this vector of eligible columns for my script

cols <- c("country", "phone", "car")

And this dataframe

test <-
  data.frame(
    id = c(1, 2, 3),
    country = c("us", NA, "uk"),
    phone = c(1, 1, NA),
    car = c(NA, 0, 1)
  )

The goal is to add a new column, res, computed only from the columns listed in cols. If all of those columns are NA for a given id, res should be the string "nothing". If only some of them are non-NA, res should list the names of the non-NA columns. If none of them are NA, res should be the string "all".

result <-
  data.frame(
    id = c(1, 2, 3),
    country = c("us", NA, NA),
    phone = c(1, 1, NA),
    car = c(NA, NA, NA),
    res = c("country, phone", "phone", "nothing")
  )

The only way I know to do this is with case_when(), which means spelling out every combination by hand:

test %>%
  mutate(
    res = case_when(
      !is.na(country) & is.na(phone) & is.na(car) ~ "country",
      TRUE ~ "?"
    )
  )

Upvotes: 2

Views: 166

Answers (2)

Ronak Shah

Reputation: 388982

The two data frames you shared (test and result) contain different values, so we will start from result after removing its res column.

library(dplyr)
result$res <- NULL

result %>%
  # Make all columns character so they can be stacked into one value column
  mutate_all(as.character) %>%
  tidyr::pivot_longer(cols = all_of(cols)) %>%
  group_by(id) %>%
  # Names of the non-NA columns per id; an empty selection gives ''
  summarise(res = toString(name[!is.na(value)])) %>%
  type.convert(as.is = TRUE) %>%
  # Bring back the original columns
  left_join(result, by = 'id') %>%
  mutate(res = case_when(res == '' ~ 'nothing',
                         stringr::str_count(res, ',') ==
                           (length(cols) - 1) ~ 'all',
                         TRUE ~ res))


# A tibble: 3 x 5
#     id res            country phone car  
#  <dbl> <chr>          <fct>   <dbl> <lgl>
#1     1 country, phone us          1 NA   
#2     2 phone          NA          1 NA   
#3     3 nothing        NA         NA NA   

We reshape the data to long format and, for each id, collect the names of the columns that hold a non-NA value. We then set res to "all" when every column in cols matched and to "nothing" when none did.
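The two checks in that last step can be tried in isolation. This is only a sketch: count_commas is a hypothetical helper written in base R (the answer itself uses stringr::str_count), and the logical vectors stand in for the result of !is.na(value).

```r
cols <- c("country", "phone", "car")

# toString() joins names with ", "; an empty selection yields ""
some_present <- toString(cols[c(TRUE, TRUE, FALSE)])
none_present <- toString(cols[c(FALSE, FALSE, FALSE)])
all_present  <- toString(cols)

# Hypothetical helper: count the commas in each string using base R only
count_commas <- function(x) {
  lengths(regmatches(x, gregexpr(",", x, fixed = TRUE)))
}

none_present == ""                              # TRUE  -> "nothing"
count_commas(all_present) == length(cols) - 1   # TRUE  -> "all"
count_commas(some_present) == length(cols) - 1  # FALSE -> keep the names
```

Note that counting commas assumes no column name contains a comma itself.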

Upvotes: 0

Miff

Reputation: 7941

You can do this in base R (rather than dplyr) using the code:

result$res <- apply(result[, cols], 1, function(x) paste(cols[!is.na(x)], collapse = ", "))
result$res[result$res == ""] <- "nothing"
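The two lines above do not produce "all" for rows where every column in cols is non-NA. A minimal sketch of one way to cover all three cases in base R, applied to the question's test data, might look like this. The helper name describe_row is hypothetical, not part of the original answer.

```r
cols <- c("country", "phone", "car")

test <- data.frame(
  id = c(1, 2, 3),
  country = c("us", NA, "uk"),
  phone = c(1, 1, NA),
  car = c(NA, 0, 1)
)

# Hypothetical helper: turn one row of TRUE/FALSE "value present" flags
# into "nothing", "all", or a comma-separated list of column names
describe_row <- function(present, cols) {
  if (!any(present)) {
    "nothing"
  } else if (all(present)) {
    "all"
  } else {
    paste(cols[present], collapse = ", ")
  }
}

# Logical matrix: TRUE where the value is not NA
present <- !is.na(test[, cols])
test$res <- unname(apply(present, 1, describe_row, cols = cols))

test$res
# "country, phone" "phone, car" "country, car"
```

Working on the logical matrix rather than on the values avoids the type coercion that apply() performs on a data frame with mixed column types.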

Upvotes: 3
