sabc04

Reputation: 191

How to create a new variable in R based on values from two other columns?

I have a dataset that looks like this:

Current Dataset

Each patient has four rows: one for each ear at each of two time points. I want to create a new variable that takes its value from chemo dose 1 in the first row and from chemo dose 2 in the second row. My desired output is something like this:

Desired Output

How can I create a variable like this in R?

Upvotes: 0

Views: 249

Answers (3)

Quinten

Reputation: 41285

data.table option using fifelse (thanks @langtang for creating the data):

library(data.table)
setDT(df)[, new := fifelse(Time_Point == "C1", Chemo_Dose1, Chemo_Dose2)]
df

Output:

         Ear Study_ID Chemo_Dose1 Chemo_Dose2 Time_Point  new
1:  Left Ear  CF41853        1200         300         C1 1200
2:  Left Ear  CF41853        1200         300       Post  300
3: Right Ear  CF41854        1200         300         C1 1200
4: Right Ear  CF41854        1200         300       Post  300

Upvotes: 1

Yacine Hajji

Reputation: 1449

Would this solve your problem?

### 1- data simulation
df <- data.frame(dose1=rep(1200, 4), dose2=rep(300, 4), time=c("C1", "Post", "C1", "Post"))

### 2- compute the new variable based on the time point
df$newVariable <- ifelse(df$time=="C1", df$dose1, df$dose2)
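As a minimal sketch, the same base R approach can be applied to the column names used in the other answers (Time_Point, Chemo_Dose1, Chemo_Dose2). The data frame below reproduces the input posted in this thread; New_Dose is a hypothetical column name chosen only for illustration.

```r
# Example data, reproduced from the input shared in this thread
df <- data.frame(
  Ear = c("Left Ear", "Left Ear", "Right Ear", "Right Ear"),
  Study_ID = c("CF41853", "CF41853", "CF41854", "CF41854"),
  Chemo_Dose1 = c(1200, 1200, 1200, 1200),
  Chemo_Dose2 = c(300, 300, 300, 300),
  Time_Point = c("C1", "Post", "C1", "Post")
)

# New_Dose is a hypothetical name: use Chemo_Dose1 when Time_Point is "C1",
# and Chemo_Dose2 for every other time point
df$New_Dose <- ifelse(df$Time_Point == "C1", df$Chemo_Dose1, df$Chemo_Dose2)
df$New_Dose
```

Note that any time point other than "C1" falls through to Chemo_Dose2 here, so this only fits data with exactly two time points.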

Upvotes: 1

langtang

Reputation: 24722

You can simply use mutate() with if_else():

library(dplyr)

df %>% mutate(NEW_VARIABLE = if_else(Time_Point == "C1", Chemo_Dose1, Chemo_Dose2))

Output:

        Ear Study_ID Chemo_Dose1 Chemo_Dose2 Time_Point NEW_VARIABLE
1  Left Ear  CF41853        1200         300         C1         1200
2  Left Ear  CF41853        1200         300       Post          300
3 Right Ear  CF41854        1200         300         C1         1200
4 Right Ear  CF41854        1200         300       Post          300

Input:

df <- structure(list(Ear = c("Left Ear", "Left Ear", "Right Ear", "Right Ear"
), Study_ID = c("CF41853", "CF41853", "CF41854", "CF41854"), 
    Chemo_Dose1 = c(1200, 1200, 1200, 1200), Chemo_Dose2 = c(300, 
    300, 300, 300), Time_Point = c("C1", "Post", "C1", "Post"
    )), class = "data.frame", row.names = c(NA, -4L))

Upvotes: 3
