Tim
Tim

Reputation: 5

Mutate with condition on ALL other columns to add a new column

First question ever for me! Hope I make it clear and simple :D

Subject

I want to add a new column which content if based on a condition based on all other columns content.

Data and wanted transformation

I have the following (kinda messy) data-frame and I'm trying to add a column that should contain "Yes" if one other column contains 0 or "No" if none of the other columns contains "0".

library(tidyverse)
library(readxl)
all_extract <- read_excel("all_extract.xlsx")
View(all_extract)                                                                                                                                 
all_extract
A tibble: 25 x 14
     tag1 `tag1 -1`  tag2 `tag2 -1`  tag3 `tag3 -1`  tag4 `tag4 -1`  tag5 `tag5 -1`  tag6 `tag6 -1`  tag7 `tag7-1`
     <dbl> <lgl>     <dbl> <lgl>     <dbl> <lgl>     <dbl> <dbl>     <dbl> <dbl>     <dbl> <lgl>     <dbl>  <dbl>
1     2    NA        0     NA        1     NA        2     NA         1    -1        1     NA         1     0
2     2    NA        2     NA        1     NA        2     NA         2    NA        1     NA         2     0
3     1    NA        1     NA        1     NA        2     NA         1    NA        2     NA         1     0
4     1    NA        2     NA        1     NA        2     NA         1    NA        1     NA         1     0
5     2    NA        1     NA        1     NA        2     NA         1    NA        2     NA         1     0
6     2    NA        1     NA        1     NA        2     NA         0    NA        1     NA         2     0
7     1    NA        2     NA        1     NA        2     NA         0    NA        1     NA         1     0
8     1    NA        1     NA        2     NA        2     NA         1    NA        1     NA         1     0
9     1    NA        2     NA        1     NA        2     NA         1    NA        2     NA         1     0
10    1    NA        2     NA        1     NA        2     -1         1    NA        2     NA         1     0
# … with 15 more rows

For replication, file can be download with this link.

Attempts and issues

I've tried to do it with dplyr mutate, but it works with one or several named columns, not with all columns; I need to write a script that I can apply to a bunch of files that have not the same number of columns. If my understanding is correct, mutate_all and mutate_if do not do what I want, as unlike mutate they do not add a new column but change the content of the existing one that match the criteria. Yet, they are the only thing I can find when searching for help on the internet.

For instance, I can apply mutate with a condition on one column:

all <- all_extract %>%
    mutate(Final_schwa = case_when(syll_tag2==0 ~ 'Schwa',
        TRUE ~ 'No schwa'))

I have find many solution for multiple columns, none for all columns.

I also tried to use across within mutate, but I don't know how to indicate all columns. Ideally, I thought I could use something like:

mutate(across(starts_with("tag"), "Yes"))

But I cannot managed to get the right syntax for it to work. Also, I'm not even sure across is supposed to be able to do what I want.

Summary

So: is there a way to apply real mutate (for adding a column) with a condition based on all other columns (for the new column to be filled with "Yes/No" content based on the all-columns condition)?

Thanks in advance if you could help!

Upvotes: 0

Views: 167

Answers (2)

Captain Hat
Captain Hat

Reputation: 3257

Use pivot_longer

As Konrad has pointed out, this solution is superior to the one below because it preserves data types - if your data contains multiple types, they would be coerced into a single type by t which could result in data loss.

new_col <- 
all_extract %>%
pivot_longer(everything()) %>% #This will make a 2-col df w/ names: (name, value)
group_by(name) %>% 
summarise(any_zero = any(value == 0))

rbind(all_extract, new_col["any_zero"])

Or transpose the data and summarise(across(everything))

I think this solution is a little more intuitive if you're a visual thinker, but there's a risk of data loss, because t returns a matrix, and matrices can't contain multiple types.

new_col <- 
all_extract %>%
t %>%
as_tibble %>% # (because t() returns a matrix)
summarise(across(everything(), function(x){
any(x == 0) 
})) %>%
t # transposing again will coerce our single row df into a column vector

all_extract$new_col <- new_col

Upvotes: 2

Ronak Shah
Ronak Shah

Reputation: 389235

Do you want to add 'Yes' if there is atleast 1 zero in a row?

You can use rowSums as :

all_extract$new_col <- ifelse(rowSums(all_extract == 0, na.rm = TRUE) > 0, 'yes', 'no')

#Without ifelse
all_extract$new_col <- c('No', 'Yes')[(rowSums(all_extract == 0, na.rm = TRUE) > 0) + 1]

Upvotes: 1

Related Questions