Merge two dataframes based on common column names

Question

I have 2 data frames:

df1 (all genes and their expression values -- each column name is a gene)

df2 (list of genes to analyse -- each gene is a column name, without any extra data)

I want to merge them by column name, obtaining a third data frame that is df1 restricted to the genes present in both data frames (the common column names).

I don't know if I explained this well, but let me know if I can provide more info.

Example of data frames:

df1 <- data.frame(matrix(ncol = 4, nrow = 0))
x1 <- c("name", "school", "job", "gender")
colnames(df1) <- x1

df2 <- data.frame(matrix(ncol = 3, nrow = 0))
x2 <- c("name", "age", "gender")
colnames(df2) <- x2

Here, what I want is df1 reduced to the columns present in both df1 and df2, which would be "name" and "gender". In my actual work I have many genes, so I cannot do it gene by gene.

Thank you!

akrun · Accepted Answer

We can use intersect on the column names of 'df1' and 'df2' to select the matching columns of 'df1':

df1new <- df1[intersect(names(df1), names(df2))]

Or with dplyr

library(dplyr)
df1new <- df1 %>%
            select(intersect(names(.), names(df2)))
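
To see the effect on a data frame that actually has rows, here is a minimal base R sketch. The gene names (geneA to geneD) and the expression values are made up purely for illustration; only the intersect() subsetting comes from the answer above. The setdiff() check at the end is an optional extra, assuming you also want to know which requested genes are absent from df1.

```r
# Hypothetical toy data: df1 holds expression values, one column per gene
df1 <- data.frame(geneA = c(1.2, 3.4),
                  geneB = c(0.5, 0.7),
                  geneC = c(9.1, 8.2))

# df2 only carries the genes of interest as column names, with no rows
df2 <- data.frame(matrix(ncol = 3, nrow = 0))
colnames(df2) <- c("geneA", "geneC", "geneD")

# Column names present in both data frames, in the order they appear in df1
common <- intersect(names(df1), names(df2))

# Keep only those columns of df1; all rows are preserved
df1new <- df1[common]

# Optional: genes requested in df2 that do not exist in df1
missing_genes <- setdiff(names(df2), names(df1))
```

Here df1new keeps the geneA and geneC columns with both rows intact, and missing_genes holds "geneD".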
