Reputation: 467
I'm trying to build a two-part conditional: if both the cancer type (Indication) and the gene (Genes) in a row of my data frame "mockup" also appear in another data frame called "cpg", a new column in mockup should contain "yes"; otherwise it should contain "no". To illustrate:
The mockup table has:
Indication | Genes |
---|---|
Acute Myeloid Leukemia | TP53 |
Acute Myeloid Leukemia | GNAQ |
And the cpg dataframe has:
Cancer Type | Gene |
---|---|
Acute Myeloid Leukemia | TP53 |
Acute Myeloid Leukemia | ATM |
I would like to produce a mockup table that looks like this (based on the cpg data):
Indication | Genes | Hotspot? |
---|---|---|
Acute Myeloid Leukemia | TP53 | Yes |
Acute Myeloid Leukemia | GNAQ | No |
So far I've tried (and failed) to write a for loop with a conditional that builds a vector, in the hope of then appending this vector as a new column:
hotspot <- c()
for (i in 1:nrow(mockup)){
if ((mockup$Genes[i] == cpg$Gene && mockup$Indication[i] == cpg$`Cancer Type`)){
hotspot[i] <- print("yes")
} else {
hotspot[i] <- print("no")
}
}
unique(hotspot)
As always, any help would really be appreciated!
Upvotes: 0
Views: 37
Reputation: 26238
Is this what you need? Because R is vectorised, explicit for loops can usually be avoided.
mockup <- read.table(text = 'Indication Genes
"Acute Myeloid Leukemia" TP53
"Acute Myeloid Leukemia" GNAQ', header = T)
cpg <- read.table(text = "Cancer_Type Gene
'Acute Myeloid Leukemia' TP53
'Acute Myeloid Leukemia' ATM", header = T)
# all() gives TRUE/FALSE; adding 1 turns that into index 2 ('Yes') or 1 ('No')
mockup$hotspot <- apply(mockup, 1, function(x) c('No', 'Yes')[all(x %in% as.matrix(cpg)) + 1])
mockup
Indication Genes hotspot
1 Acute Myeloid Leukemia TP53 Yes
2 Acute Myeloid Leukemia GNAQ No
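One caveat: `x %in% as.matrix(cpg)` tests each value against every cell of cpg, so a cancer type and a gene that sit in different rows of cpg would still count as a match. If the two must match as a pair within the same row, one possible sketch is to compare combined keys. The separator `"\r"` is an assumption (any string that never occurs in the data would do), and the sample data simply mirrors the tables in the question.

```r
# Sample data mirroring the tables in the question
mockup <- data.frame(
  Indication = c("Acute Myeloid Leukemia", "Acute Myeloid Leukemia"),
  Genes = c("TP53", "GNAQ")
)
cpg <- data.frame(
  Cancer_Type = c("Acute Myeloid Leukemia", "Acute Myeloid Leukemia"),
  Gene = c("TP53", "ATM")
)

# Build one key per row so cancer type and gene are compared as a pair.
# Assumption: the separator never appears in either column.
key_sep <- "\r"
mockup_key <- paste(mockup$Indication, mockup$Genes, sep = key_sep)
cpg_key <- paste(cpg$Cancer_Type, cpg$Gene, sep = key_sep)

# Vectorised lookup: no loop needed
mockup$hotspot <- ifelse(mockup_key %in% cpg_key, "Yes", "No")
mockup
```

This stays fully vectorised and only reports "Yes" when the same row of cpg holds both values.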
A pipe-friendly dplyr version:
library(dplyr)
mockup %>% rowwise() %>%
mutate(hotspot = c('No', 'Yes')[+(all(cur_data() %in% as.matrix(cpg))+1)])
#> # A tibble: 2 x 3
#> # Rowwise:
#> Indication Genes hotspot
#> <chr> <chr> <chr>
#> 1 Acute Myeloid Leukemia TP53 Yes
#> 2 Acute Myeloid Leukemia GNAQ No
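If the cancer type and gene need to match within the same row of cpg, a join may be a more direct way to express it in dplyr. The following is only a sketch under that assumption; the sample data mirrors the question, and the intermediate names `cpg_flags` and `result` are made up for illustration.

```r
library(dplyr)

# Sample data mirroring the tables in the question
mockup <- data.frame(
  Indication = c("Acute Myeloid Leukemia", "Acute Myeloid Leukemia"),
  Genes = c("TP53", "GNAQ")
)
cpg <- data.frame(
  Cancer_Type = c("Acute Myeloid Leukemia", "Acute Myeloid Leukemia"),
  Gene = c("TP53", "ATM")
)

# One row per known (cancer type, gene) pair, flagged as a hotspot;
# distinct() guards against duplicated pairs multiplying rows in the join
cpg_flags <- cpg %>%
  distinct(Cancer_Type, Gene) %>%
  mutate(hotspot = "Yes")

# Rows of mockup without a matching pair get NA from the join, then "No"
result <- mockup %>%
  left_join(cpg_flags, by = c("Indication" = "Cancer_Type", "Genes" = "Gene")) %>%
  mutate(hotspot = coalesce(hotspot, "No"))

result
```

Unlike the rowwise version, this avoids evaluating the condition once per row, which may matter on larger tables.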
Upvotes: 2