Reputation: 1059
I have a dataframe with gene names and miRNA interaction information. The dataframe looks like below:
df:
Gene miRNA
ACP1 hsa-let-7a-5p
AGO4 hsa-let-7a-5p
AMMECR1 hsa-let-7a-5p
ATM hsa-miR-100-5p
BMPR2 hsa-miR-100-5p
AGO1 hsa-miR-107
AGO2 hsa-miR-107
AGO3 hsa-miR-107
Using this gene-miRNA interaction information, I want to create a matrix with genes as rows and miRNAs as columns. A cell should be 1 if the gene and miRNA interact and 0 otherwise. The matrix should look like below:
        hsa-let-7a-5p hsa-miR-100-5p hsa-miR-107
ACP1                1              0           0
AGO4                1              0           0
AMMECR1             1              0           0
ATM                 0              1           0
BMPR2               0              1           0
AGO1                0              0           1
AGO2                0              0           1
AGO3                0              0           1
I tried using xtabs for this, but couldn't get it to work correctly:
xtabs(c(1L, 0L)[miRNA] ~ ., data=df)
Result looks like below:
Gene
   ACP1    AGO1    AGO2    AGO3    AGO4 AMMECR1     ATM   BMPR2
      1       0       0       0       1       1       0       0
Any help is appreciated. Thank you.
Upvotes: 2
Views: 47
Reputation: 388817
We can create a dummy column with mutate and use pivot_wider to cast the data into wide format.
library(dplyr)
library(tidyr) # version ‘1.0.0’
df %>%
  mutate(n = 1) %>%
  pivot_wider(names_from = miRNA, values_from = n, values_fill = list(n = 0))
#OR
#spread(miRNA, n, fill = 0) in old tidyr
# Gene `hsa-let-7a-5p` `hsa-miR-100-5p` `hsa-miR-107`
# <fct> <dbl> <dbl> <dbl>
#1 ACP1 1 0 0
#2 AGO4 1 0 0
#3 AMMECR1 1 0 0
#4 ATM 0 1 0
#5 BMPR2 0 1 0
#6 AGO1 0 0 1
#7 AGO2 0 0 1
#8 AGO3 0 0 1
If there is more than one row for the same Gene and miRNA pair, use distinct first to drop the duplicates.
df %>%
  distinct() %>%
  mutate(n = 1) %>%
  pivot_wider(names_from = miRNA, values_from = n, values_fill = list(n = 0))
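Since the question asks for a matrix rather than a tibble, a base R alternative may also be worth considering. The following is only a sketch, assuming df has exactly the two columns Gene and miRNA shown in the question (the data frame is rebuilt here so the example is self-contained). It cross-tabulates the two columns with table, then caps the counts at 1 so that duplicated pairs still give 1. Converting both columns to factors with levels in order of first appearance is an optional step to reproduce the row and column order from the question, since table would otherwise sort them alphabetically.

```r
# Example data copied from the question
df <- data.frame(
  Gene  = c("ACP1", "AGO4", "AMMECR1", "ATM", "BMPR2", "AGO1", "AGO2", "AGO3"),
  miRNA = c("hsa-let-7a-5p", "hsa-let-7a-5p", "hsa-let-7a-5p",
            "hsa-miR-100-5p", "hsa-miR-100-5p",
            "hsa-miR-107", "hsa-miR-107", "hsa-miR-107"),
  stringsAsFactors = FALSE
)

# Keep rows/columns in order of first appearance instead of alphabetical order
df$Gene  <- factor(df$Gene,  levels = unique(df$Gene))
df$miRNA <- factor(df$miRNA, levels = unique(df$miRNA))

# Count how often each Gene/miRNA pair occurs
counts <- table(df$Gene, df$miRNA)

# Drop the "table" class and turn counts into 0/1 indicators,
# so repeated pairs still yield 1
mat <- (unclass(counts) > 0) * 1L

mat
```

Here mat is an ordinary integer matrix with gene names as row names and miRNA names as column names, so it can be indexed as mat["ATM", "hsa-miR-100-5p"].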
Upvotes: 2