jo_

Reputation: 731

Create a new value based on summing specific columns in R

I have a dataframe:

dat <- data.frame(col1 = sample(0:3, 10, replace = TRUE),
                  col2 = sample(0:3, 10, replace = TRUE),
                  col3 = sample(0:3, 10, replace = TRUE),
                  col4 = sample(0:3, 10, replace = TRUE))

I want to create a new vector var (outside of the dataframe) that is 1 where the sum of col3 and col4 is >= 4 and 0 otherwise. How can I do this? I tried using sum inside an ifelse statement, but it seems to produce character output.

Any leads? Thanks!

Upvotes: 2

Views: 622

Answers (3)

AndrewGB

Reputation: 16836

With dplyr, we can use mutate to create a new column (var) from rowSums and the condition that the sum of col3 and col4 is greater than or equal to 4. The unary + converts the logical result to 0 or 1. pull then extracts var as a vector.

library(tidyverse)

var <- dat %>% 
  mutate(var = +(rowSums(select(., c(col3:col4)), na.rm = TRUE) >= 4)) %>% 
  pull(var)

Output (yours will differ, since dat is sampled without set.seed)

[1] 1 1 1 0 0 1 1 1 0 0

Another option is to use sum with c_across on each row:

var <- dat %>% 
  rowwise() %>% 
  mutate(var = +(sum(c_across(col3:col4), na.rm = TRUE) >= 4)) %>% 
  pull(var)

Upvotes: 1

John Garland

Reputation: 513

More generally, you can go the apply route, which lets you put any further per-row logic you need inside the function:

apply(dat, 1, FUN = function(x) as.integer(sum(x[3:4], na.rm = TRUE) >= 4))
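
As a rough sketch of the "further logic" idea: apply coerces the data frame to a matrix first, so each row reaches the function as a named vector and the columns can be picked by name instead of position. The helper name flag_row, its threshold argument, and the small hand-written data frame below are hypothetical, used only for illustration.

```r
# Small hand-written data frame, used only for illustration
dat <- data.frame(col1 = c(0, 1, 2),
                  col2 = c(3, 2, 1),
                  col3 = c(2, 1, NA),
                  col4 = c(2, 1, 3))

# Hypothetical helper: 1 if col3 + col4 reaches the threshold, else 0
flag_row <- function(x, threshold = 4) {
  total <- sum(x[c("col3", "col4")], na.rm = TRUE)
  as.integer(total >= threshold)
}

# One value per row; extra arguments can be passed through apply
var <- apply(dat, 1, flag_row)
var_loose <- apply(dat, 1, flag_row, threshold = 3)
```

Note that this is usually slower than rowSums on large data, because the function is called once per row.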

Upvotes: 1

akrun

Reputation: 886938

If there are NAs as well, use rowSums with na.rm = TRUE:

vec1 <- as.integer(rowSums(dat[3:4], na.rm = TRUE) >= 4)
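
A minimal sketch of one likely pitfall with sum inside ifelse, assuming a seeded copy of the question's data (the question does not show the attempted code, so this is a guess at the cause). The seed value and the names total and vec_ifelse are illustrative and not from the original post.

```r
set.seed(42)  # assumed seed, only to make the example reproducible
dat <- data.frame(col1 = sample(0:3, 10, replace = TRUE),
                  col2 = sample(0:3, 10, replace = TRUE),
                  col3 = sample(0:3, 10, replace = TRUE),
                  col4 = sample(0:3, 10, replace = TRUE))

# sum() collapses all its inputs into a single number,
# so it cannot drive a per-row test
total <- sum(dat$col3, dat$col4)
length(total)  # 1

# Element-wise addition keeps one value per row
vec_ifelse <- ifelse(dat$col3 + dat$col4 >= 4, 1L, 0L)

# The rowSums version from this answer
vec1 <- as.integer(rowSums(dat[3:4], na.rm = TRUE) >= 4)

# With no NAs in the data, the two approaches should agree
identical(vec_ifelse, vec1)
```

If the columns can contain NAs, the two differ: col3 + col4 yields NA for that row, while rowSums with na.rm = TRUE treats the missing value as 0.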

Upvotes: 0
