Rohit Rajagopal

Reputation: 21

How to create a new variable by comparing two other variables

I need to create a variable named comorbidity based on the values of two other variables, and add it to a data frame.

I have a data frame named diagnosis with the following variables:

v1 <- c(222,250,255,250.23)

v2 <- c(300,369,400,450)

If v1 is not between 250 and 250.99 and v2 is not between 390 and 490, the new variable should be 0.

If v1 is between 250 and 250.99 and v2 is not between 390 and 490, the new variable should be 1.

If v1 is not between 250 and 250.99 and v2 is between 390 and 490, the new variable should be 2.

If v1 is between 250 and 250.99 and v2 is between 390 and 490, the new variable should be 3.

I have tried ifelse and written a lot of code, but it doesn't give the desired result. Part of my code is below:

diabetic_maindf$comorbidity_1_2 <- 
if_else((diagnosis$v1 == diagnosis$v2 ),0,

if_else((diagnosis$v1 == 1 | diagnosis$v2 == 0),1, 

if_else((diagnosis$v1 == 0 & diagnosis$v2 == 1),1,

I want a data frame with a third variable as follows:

v1 <- c(222,250,255,250.23)

v2 <- c(300,369,400,450)

new_var <- c(0,1,2,3)

P.S. I'm new here and don't know how to format this as a table, sorry.

Upvotes: 1

Views: 232

Answers (2)

thelatemail

Reputation: 93813

Doing this step by step shouldn't be required when interaction and similar functions exist:

interaction(v1 >= 250 & v1 <= 250.99, v2 >= 390 & v2 <= 490)
#[1] FALSE.FALSE TRUE.FALSE  FALSE.TRUE  TRUE.TRUE  
#Levels: FALSE.FALSE TRUE.FALSE FALSE.TRUE TRUE.TRUE

c(0,1,2,3)[interaction(v1 >= 250 & v1 <= 250.99, v2 >= 390 & v2 <= 490)]
#[1] 0 1 2 3

The advantage of this approach is that it scales: n conditions need only n comparison expressions, rather than the 2^n explicit combinations a nested ifelse would require.
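One caveat worth noting: interaction converts each logical vector to a factor, so if a condition happens to be all TRUE or all FALSE in the data, that factor has only one level and the positional lookup no longer lines up. A minimal sketch of an alternative that keeps the same n-condition idea but does not depend on which values are present is to weight each condition by a power of two. The names in_v1, in_v2 and comorbidity below are illustrative, not from the original answer.

```r
v1 <- c(222, 250, 255, 250.23)
v2 <- c(300, 369, 400, 450)

# One logical vector per condition
in_v1 <- v1 >= 250 & v1 <= 250.99
in_v2 <- v2 >= 390 & v2 <= 490

# TRUE/FALSE are treated as 1/0 in arithmetic, so:
# neither -> 0, v1 only -> 1, v2 only -> 2, both -> 3
comorbidity <- in_v1 + 2L * in_v2
comorbidity
#[1] 0 1 2 3
```

A third condition would simply add 4L * in_v3, and so on.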

Upvotes: 4

mysteRious

Reputation: 4294

You can do this using a combination of floor and between with case_when:

v1 <- c(222,250,255,250.23)
v2 <- c(300,369,400,450)
diagnosis <- data.frame(v1=v1, v2=v2)

library(dplyr)
mutate(diagnosis, v3 = case_when(
  floor(v1) == 250 & between(v2,390,490) ~ 3,
  floor(v1) != 250 & between(v2,390,490) ~ 2,
  floor(v1) == 250 & !between(v2,390,490) ~ 1,
  floor(v1) != 250 & !between(v2,390,490) ~ 0
))

This produces:

      v1  v2 v3
1 222.00 300  0
2 250.00 369  1
3 255.00 400  2
4 250.23 450  3

If you want to save the result to a new data frame, add -> df after the closing )) of the mutate call, or put df <- in front of it. The benefit of this approach is that the code is easy to read.
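Since the question asks for the new column to be added to the data frame itself, here is a minimal sketch of storing the result back into diagnosis with ordinary left assignment. The column name comorbidity is taken from the question; everything else follows the code above.

```r
library(dplyr)

diagnosis <- data.frame(
  v1 = c(222, 250, 255, 250.23),
  v2 = c(300, 369, 400, 450)
)

# Overwrite diagnosis with a copy that has the extra column
diagnosis <- mutate(diagnosis, comorbidity = case_when(
  floor(v1) == 250 & between(v2, 390, 490) ~ 3,
  floor(v1) != 250 & between(v2, 390, 490) ~ 2,
  floor(v1) == 250 & !between(v2, 390, 490) ~ 1,
  floor(v1) != 250 & !between(v2, 390, 490) ~ 0
))

diagnosis$comorbidity
```

Note that rows where v1 or v2 is NA match none of the four conditions, so case_when returns NA for them.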

Upvotes: 2
