Reputation: 117
I have a dataset that contains gene names, and I want to extract those genes from two other datasets, `gd` and `cd`. `common_genes` is a vector holding the gene names I want to search for.
I would like help getting the same set of columns in both `cd` and `gd`, based on the common genes, because my analysis requires comparing the two datasets.
# Gene names that are present in the `gd` dataset
common_genes <- intersect(gene_names, colnames(gd))
# Extract those genes (out of the 300) from `gd`
A <- gd[, common_genes]
# Gene names that are present in the `cd` dataset
common_genes2 <- intersect(gene_names, colnames(cd))
B <- cd[, common_genes2]
The output I get is 150 of the 300 genes for A and 200 of the 300 genes for B.
My desired output is the example below:
A
RPL26 MS4A1 ELK1 SNIP1
200 300 400 534
B
RPL26 MS4A1 ELK1 SNIP1
100 81 91 112
Upvotes: 1
Views: 55
Reputation: 12558
Since your example isn't reproducible, I created my own:
df1 <- data.frame(
a = 1:3,
b = 2:4,
c = 3:5)
df2 <- data.frame(
b = 4:6,
c = 5:7,
d = 6:8)
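# intersect(), which you already use in your question, gives the columns shared by both data frames
common_cols <- intersect(colnames(df1), colnames(df2))
# drop = FALSE keeps the result a data frame even if only one column is shared
df1 <- df1[, common_cols, drop = FALSE]
df2 <- df2[, common_cols, drop = FALSE]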
df1 afterwards:
  b c
1 2 3
2 3 4
3 4 5
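Applied to your case, the same idea might look like the sketch below. The toy `gene_names`, `gd` and `cd` here are made up for illustration (I don't have your data), so treat this as an assumption about their shape: genes as column names in both data frames. The key change is intersecting all three sets at once, so `A` and `B` end up with identical columns in the same order.

```r
# Hypothetical toy data standing in for your `gene_names`, `gd` and `cd`
gene_names <- c("RPL26", "MS4A1", "ELK1", "SNIP1", "TP53")
gd <- data.frame(RPL26 = 200, MS4A1 = 300, ELK1 = 400, SNIP1 = 534, GAPDH = 1)
cd <- data.frame(SNIP1 = 112, ELK1 = 91, MS4A1 = 81, RPL26 = 100, ACTB = 2)

# Genes of interest that are present in BOTH datasets
common_genes <- Reduce(intersect, list(gene_names, colnames(gd), colnames(cd)))

# Same columns, in the same order, for both; drop = FALSE keeps a data frame
A <- gd[, common_genes, drop = FALSE]
B <- cd[, common_genes, drop = FALSE]

# Genes of interest that are missing from at least one dataset
missing_genes <- setdiff(gene_names, common_genes)
```

If `A` and `B` still have fewer columns than you expect, `missing_genes` shows which names were not found in one of the datasets, which is usually a spelling or naming mismatch worth checking.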
Upvotes: 1