alwaysaskingquestions
alwaysaskingquestions

Reputation: 1657

R How to find and reorder columns in one table according to column order in another table (tables contain diff number of columns)

I have read the following threads that seem to be similar (but still different) questions with mine:

  1. Order data frame rows according to a target vector that specifies the desired order
  2. How to reorder data.table columns (without copying)

my questions are different because in thread 1, the two tables have same length, and i tried using method indicated in thread 2, but seems like they also have to be same length.

for illustration purpose, i'll create two tables as following:

table1 = data.frame(rbind(c(rep(c(TRUE,FALSE), 3)), c(rep(TRUE, 4), rep(FALSE, 2))))
dim(table1)
setnames(table1, letters[1:6])
table1
  a     b    c     d     e     f
1 TRUE FALSE TRUE FALSE  TRUE FALSE
2 TRUE  TRUE TRUE  TRUE FALSE FALSE

table2 = data.frame(rbind(c(rep(c(TRUE,FALSE), 2)), c(rep(TRUE, 3), rep(FALSE, 1))))
dim(table2)
setnames(table2, letters[7:4])
table2
  g     f    e     d
1 TRUE FALSE TRUE FALSE
2 TRUE  TRUE TRUE FALSE

the desired return would be:

  f     e      d
1 FALSE TRUE  FALSE
2 FALSE FALSE TRUE

the return table will fit the following criteria:

  1. contains ONLY the letters exist in BOTH tables
  2. order of the return table is consistent with the order in table 2

Basically, I want to clean my table1 to contain only column names that also exist in table2, and i want the table1 columns to be reordered in order same as the columns in table 2.

Let me know if I need to clarify anything else! Thank you!

Upvotes: 0

Views: 2536

Answers (2)

alwaysaskingquestions
alwaysaskingquestions

Reputation: 1657

Thanks to Sowmya for sharing ideas with me. Per request, below is my version of the answer based on what I've learned from Sowmya's answer and rawr's comments.

table1 = data.frame(rbind(c(rep(c(TRUE,FALSE), 3)), c(rep(TRUE, 4), rep(FALSE, 2))))
dim(table1)
setnames(table1, letters[1:6])

table2 = data.frame(rbind(c(rep(c(TRUE,FALSE), 2)), c(rep(TRUE, 3), rep(FALSE, 1))))
dim(table2)
setnames(table2, letters[7:4])

# below are my answer
col2keep = intersect(colnames(table2), colnames(table1))
table1_reduce = table1[, (match(col2keep, colnames(table1)))]

and table1_reduce will contain my desired output.

Upvotes: 2

Sowmya S. Manian
Sowmya S. Manian

Reputation: 3843

Install and load dplyr package for select function. Other functions used match, order, is.na, which and operator %in%.

  install.packages("plyr")
  install.packages("dplyr")
  library(plyr)
  library(dplyr)

  table1 = data.frame(rbind(c(rep(c(TRUE,FALSE), 3)), c(rep(TRUE, 4), rep(FALSE, 2))))
  dim(table1)
  names(table1) <- letters[1:6]

  table2 = data.frame(rbind(c(rep(c(TRUE,FALSE), 2)), c(rep(TRUE, 3), rep(FALSE, 1))))
  dim(table2)
  names(table2) <- letters[7:4]

  target <- as.vector(names(table2)) 
  t <- select(table1, which(names(table1) %in% target))
  gg <- match (target,names(t)) 
  gg <- gg[!is.na(gg)]
  Result <- t[,order(gg)]
  Result

Upvotes: 0

Related Questions