Reputation: 1275
I have a data frame (df) in which I want to compare each column with the last column in order to assign new values to each of those columns.
Here is my example data frame (df):
> df
S1 S2 S3 S4 S5 main
Gene1 1 1 1 1 2 1
Gene2 1 2 1 1 1 1
Gene3 1 1 1 1 2 2
Gene4 2 1 1 1 1 1
Gene5 1 2 1 2 1 1
Gene6 1 1 1 1 1 2
Gene7 NA NA 2 1 1 1
Gene8 1 2 1 1 1 2
Gene9 2 1 1 2 1 1
I want to compare each of columns 1 to 5 with the last column (main) under the following conditions. 'S' below refers to any one of columns 1 to 5.
If S = 2 and main = 2, then value is True Positive (TP)
If S = 2 and main = 1, then value is False Positive (FP)
If S = 1 and main = 2, then value is False Negative (FN)
If S = 1 and main = 1, then value is True Negative (TN)
NAs should remain as NAs.
My new data frame (df_updated) should therefore look like this:
> df_updated
S1 S2 S3 S4 S5
Gene1 TN TN TN TN FP
Gene2 TN FP TN TN TN
Gene3 FN FN FN FN TP
Gene4 FP TN TN TN TN
Gene5 TN FP TN FP TN
Gene6 FN FN FN FN FN
Gene7 NA NA FP TN TN
Gene8 FN TP FN FN FN
Gene9 FP TN TN FP TN
I am aware of the match functions, but I am not sure how to loop over the columns and apply the specific rules above to each of them.
Any help is appreciated. Thank you.
Upvotes: 2
Views: 172
Reputation: 388982
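Using base R, you can also create a function with nested ifelse and apply it to every column to get the values.
# Classify one column x against the reference column main.
# An NA in x makes every comparison NA, so ifelse() returns NA for that row.
get_value <- function(x, main) {
  ifelse(main == 2 & x == 2, "TP",
    ifelse(main == 1 & x == 2, "FP",
      ifelse(main == 2 & x == 1, "FN",
        ifelse(main == 1 & x == 1, "TN", NA))))
}
# Drop the last column (main), then relabel each remaining column in place.
df1 <- df[-ncol(df)]
df1[] <- lapply(df1, get_value, df$main)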
df1
# S1 S2 S3 S4 S5
#Gene1 TN TN TN TN FP
#Gene2 TN FP TN TN TN
#Gene3 FN FN FN FN TP
#Gene4 FP TN TN TN TN
#Gene5 TN FP TN FP TN
#Gene6 FN FN FN FN FN
#Gene7 <NA> <NA> FP TN TN
#Gene8 FN TP FN FN FN
#Gene9 FP TN TN FP TN
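As a possible alternative to the nested ifelse, the four rules from the question can be stored in a small 2 x 2 lookup matrix and indexed by the (S, main) pair. This is only a sketch: the names lookup and res are made up for illustration, and it assumes every value is 1, 2, or NA (any other value would cause an out-of-bounds error). The data frame is rebuilt here from the values in the question so that the block runs on its own.

```r
# Example data copied from the question.
df <- data.frame(
  S1   = c(1, 1, 1, 2, 1, 1, NA, 1, 2),
  S2   = c(1, 2, 1, 1, 2, 1, NA, 2, 1),
  S3   = c(1, 1, 1, 1, 1, 1, 2, 1, 1),
  S4   = c(1, 1, 1, 1, 2, 1, 1, 1, 2),
  S5   = c(2, 1, 2, 1, 1, 1, 1, 1, 1),
  main = c(1, 1, 2, 1, 1, 2, 1, 2, 1),
  row.names = paste0("Gene", 1:9)
)

# Hypothetical lookup table: rows are the S value, columns are the main value.
# matrix() fills column by column, so column main = 1 holds TN, FP
# and column main = 2 holds FN, TP.
lookup <- matrix(c("TN", "FP", "FN", "TP"), nrow = 2,
                 dimnames = list(S = c("1", "2"), main = c("1", "2")))

# Index the matrix with a two-column (row, column) index matrix.
# A row of the index matrix that contains NA produces NA in the result.
res <- df[-ncol(df)]
res[] <- lapply(res, function(x) lookup[cbind(x, df$main)])
res
```

With this layout, changing a label means editing one cell of the matrix rather than one branch of the ifelse chain.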
Upvotes: 2
Reputation: 6234
You could use dplyr's case_when:
library(dplyr)
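# main is the last column, so it is still numeric when S1..S5 are compared with it.
# Rows where .x is NA match no condition, so case_when() returns NA for them.
mutate_all(df, ~case_when(
  .x < main ~ "FN",
  .x > main ~ "FP",
  near(.x, 1) & near(.x, main) ~ "TN",
  near(.x, 2) & near(.x, main) ~ "TP"
)) %>%
  select(-main)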
#> S1 S2 S3 S4 S5
#> 1 TN TN TN TN FP
#> 2 TN FP TN TN TN
#> 3 FN FN FN FN TP
#> 4 FP TN TN TN TN
#> 5 TN FP TN FP TN
#> 6 FN FN FN FN FN
#> 7 <NA> <NA> FP TN TN
#> 8 FN TP FN FN FN
#> 9 FP TN TN FP TN
Upvotes: 3