Create new column in R dataframe based on results from 3 other columns

Question

I have a dataframe containing an id column and scan results. A 1 denotes that a result was not seen on a scan, a 2 that a result was seen, and an empty cell that the scan was not completed.

I wish to create one column at the end of the dataframe which checks all 3 columns and returns "2" if the result was ever seen in any of the 3 scans, "1" if the result was not seen on any scan, and no value if the patient never had a scan completed in any of the three modalities.

Basically, "2" is the dominant value: if it appears anywhere in a dataframe row, I want it shown in the new column.
If "2" is not present but "1" is, then "1" needs to appear in the new column.
If there is no result in any column, then no result or NA should appear.

I have tried doing this in Excel and R. I would prefer to use R as I am learning this at the moment and want to continue learning new uses.

I have tried using

library(tidyverse)
USS_reports %>%
   mutate((filter(USSfluid=2 | CTfluid=2 | MRIfluid=2))

id  USSFluid    CTfluid MRIfluid
1       1             1        1
2       1                      1    
3       1             1        1
4       1             1 
5       1             1 
6       1             1 
7       1       
8                     1     
9       1       
10                    1       2 
11      1             2

MartijnVanAttekum · Accepted Answer

As you want to give the highest value precedence, you could just use apply to take the max value per row (MARGIN = 1) of the dataframe, excluding the first id column ([,-1]):

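USS_reports %>%
  # row-wise max over every column except id; gives -Inf when all values in a row are NA
  mutate(summary = apply(USS_reports[, -1], MARGIN = 1,
                         FUN = function(row) max(row, na.rm = TRUE))) %>%
  # turn those -Inf values into NA
  mutate(summary = ifelse(summary == -Inf, NA, summary))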

Note that the second mutate is needed because max returns -Inf (with a warning) when every column in a row is NA; it replaces those -Inf values with NA. For this to work, your df needs to be numeric, though. If it is not, you would first have to convert it:

USS_reports[] <- lapply(USS_reports, as.numeric)
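As a minimal sketch of what that conversion does, assume the scan columns were read in as character strings with "" for missing scans. The small data frame below is hypothetical and not the asker's actual data. as.numeric turns the empty strings into NA, which is what max(..., na.rm = TRUE) needs:

```r
# Hypothetical character version of two scan columns; "" marks a scan not completed
df <- data.frame(id = 1:3,
                 USSFluid = c("1", "", "1"),
                 CTfluid = c("1", "1", "2"),
                 stringsAsFactors = FALSE)

# Convert every column to numeric while keeping the data.frame structure
df[] <- lapply(df, as.numeric)

str(df)  # all columns are now numeric; the empty string has become NA
```

If the columns are factors rather than character vectors, as.numeric returns the internal level codes, so in that case convert with as.numeric(as.character(x)) instead.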

(By the way, if you want to test for equality in your code above, you have to use == instead of =.)
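For comparison, here is a self-contained base R sketch that swaps apply for pmax, a different technique from the one used in the answer above. It assumes the blank cells of the question's table are stored as NA, and row 12 is a hypothetical extra patient with no completed scan, added only to show the all-NA case. With na.rm = TRUE, pmax returns NA when every value in a row is NA, so no -Inf clean-up step is needed:

```r
# Sample data transcribed from the question (blank cells entered as NA).
# Row 12 is hypothetical: a patient with no scan completed in any modality.
USS_reports <- data.frame(
  id       = 1:12,
  USSFluid = c(1, 1, 1, 1, 1, 1, 1, NA, 1, NA, 1, NA),
  CTfluid  = c(1, NA, 1, 1, 1, 1, NA, 1, NA, 1, 2, NA),
  MRIfluid = c(1, 1, 1, NA, NA, NA, NA, NA, NA, 2, NA, NA)
)

# pmax() takes the element-wise maximum across the three vectors, i.e. one value per row.
# na.rm = TRUE ignores NAs and yields NA only when all three values are NA.
USS_reports$summary <- pmax(USS_reports$USSFluid,
                            USS_reports$CTfluid,
                            USS_reports$MRIfluid,
                            na.rm = TRUE)

print(USS_reports)
```

Rows 10 and 11 get a summary of 2, rows 1 to 9 get 1, and the hypothetical row 12 gets NA.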
