Claudio

Reputation: 87

Merging two data.frames by two columns each

I have a huge data.frame that I want to reorder. My idea was to split it in half (the first half contains different information than the second half) and build a third data frame that combines the two. The result should always take two columns from the first data frame, followed by the corresponding two columns from the second data frame, and I need help doing that.

new1 <- all_cont_video_algo[, 1:826]
new2 <- all_cont_video_algo[, 827:ncol(all_cont_video_algo)]
df3 <- data.frame()

The new data frame should look like the following:

new3[new1[1],new1[2],new2[1],new2[2],new1[3],new1[4],new2[3],new2[4],new1[5],new1[6],new2[5],new2[6], etc.].

In pseudocode: cbind two columns from data frame new1, then cbind two columns from data frame new2, and so on.
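The intended order can be illustrated on a toy data frame. This is only a sketch: `toy`, `half`, `pair`, `ord` and `toy3` are hypothetical names, the data is made up, and it assumes both halves have the same number of columns.

```r
# Toy stand-in for the real data (hypothetical): 8 columns, two halves of 4
toy <- as.data.frame(matrix(1:16, nrow = 2,
  dimnames = list(NULL, c(paste0("a", 1:4), paste0("b", 1:4)))))

half <- ncol(toy) / 2

# Pair number of each column within its half: 0 0 1 1 ...
pair <- (seq_len(half) - 1) %/% 2

# Sort all columns by pair number first, then by half (first half before second)
ord <- order(c(pair, pair), rep(1:2, each = half))

toy3 <- toy[, ord]
names(toy3)
# "a1" "a2" "b1" "b2" "a3" "a4" "b3" "b4"
```

Because only a vector of column positions is built, nothing is copied until the final subset, which may matter for a very wide data frame.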

I tried the following now (thanks to Akrun):

new1 <- all_cont_video_algo[, 1:826]
new2 <- all_cont_video_algo[, 827:length(all_cont_video_algo)]

new1 <- as.data.frame(new1, stringsAsFactors = FALSE)
new2 <- as.data.frame(new2, stringsAsFactors = FALSE)

df3 <- data.frame()
f1 <- function(Ncol, n) {
  as.integer(gl(Ncol, n, Ncol))
}
lst1 <- split.default(new1, f1(ncol(new1), 2))
lst2 <- split.default(new2, f1(ncol(new2), 2))

lst3 <- Map(function(x, y) df3[unlist(cbind(x, y))], lst1, lst2)

However, this gives me an "undefined columns selected" error.

Upvotes: 1

Views: 59

Answers (2)

Theo

Reputation: 575

See whether the code below helps:

library(tidyverse)

# Two sample data frames of equal number of columns and rows
df1 = mtcars %>% select(-1)
df2 = diamonds %>% slice(1:32) 

# get the column names
dn1 = names(df1)
dn2 = names(df2)

# create new ordered list
neworder = map(seq(1,length(dn1),2), # sequence with interval 2
               ~c(dn1[.x:(.x+1)], dn2[.x:(.x+1)])) %>% # a vector of two columns each
  unlist %>% # flatten the list
  na.omit # remove NAs arising from odd number of columns

# Get the data frame ordered
# all_of() marks neworder as an external character vector of column names
df3 = bind_cols(df1, df2) %>% 
  select(all_of(neworder))

Upvotes: 0

akrun

Reputation: 887048

It is not clear without a reproducible example. Based on the description, we can split the columns of each dataset into a list of datasets, use Map to cbind the corresponding list elements, unlist the result, and use that to order the third dataset.

1) Create a function to return a grouping column for splitting the dataset

f1 <- function(Ncol, n) {
  as.integer(gl(Ncol, n, Ncol))
}

2) split the datasets into a list

lst1 <- split.default(df1, f1(ncol(df1), 2))
lst2 <- split.default(df2, f1(ncol(df2), 2))

3) Map through the corresponding list elements, cbind and unlist and use that to subset the columns of 'df3'

lst3 <- Map(function(x, y) df3[unlist(cbind(x, y))], lst1, lst2)

data

df1 <- as.data.frame(matrix(letters[1:10], 2, 5), stringsAsFactors = FALSE)
df2 <- as.data.frame(matrix(1:10, 2, 5))
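If the goal is simply to build the interleaved data frame, one possible variant of step 3 is to bind the pieces directly instead of subsetting an existing 'df3'. Note that unlist(cbind(x, y)) returns the cell values rather than column names, so indexing an empty data.frame() with it fails with "undefined columns selected". The sketch below is an assumption about the intended result, not a confirmed fix; the column names a1..a5 and b1..b5 and the name 'pairs' are hypothetical and only added so the order is visible.

```r
df1 <- as.data.frame(matrix(letters[1:10], 2, 5), stringsAsFactors = FALSE)
df2 <- as.data.frame(matrix(1:10, 2, 5))

# Hypothetical names so the two sources can be told apart (both default to V1..V5)
names(df1) <- paste0("a", 1:5)
names(df2) <- paste0("b", 1:5)

# Grouping vector: 1 1 2 2 3 for five columns in chunks of two
f1 <- function(Ncol, n) {
  as.integer(gl(Ncol, n, Ncol))
}

lst1 <- split.default(df1, f1(ncol(df1), 2))
lst2 <- split.default(df2, f1(ncol(df2), 2))

# Bind each chunk of df1 with the matching chunk of df2 ...
pairs <- Map(cbind, lst1, lst2)

# ... then bind the chunks side by side; unname() avoids "1.", "2." name prefixes
df3 <- do.call(cbind, unname(pairs))
names(df3)
# "a1" "a2" "b1" "b2" "a3" "a4" "b3" "b4" "a5" "b5"
```

With an odd number of columns, as here, the last chunk holds a single column from each dataset, so the trailing pair is a5, b5.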

Upvotes: 0
