mowglis_diaper

Reputation: 509

Generate all possible combinations of rows in R?

Let's say I have two dataframes, students and teachers.

students <- data.frame(name = c("John", "Mary", "Sue", "Mark", "Gordy", "Joey", "Marge", "Sheev", "Lisa"),
                   height = c(111, 93, 99, 107, 100, 123, 104, 80, 95),
                   smart = c("no", "no", "yes", "no", "yes", "yes", "no", "yes", "no"))
teachers <- data.frame(name = c("Ben", "Craig", "Mindy"),
                   height = c(130, 101, 105),
                   smart = c("yes", "yes", "yes"))

I want to generate all possible combinations of students and teachers while keeping the accompanying info; that is, I want every pairing of a row from "students" with a row from "teachers". This is easy to do with a loop and cbind, but on a massive dataframe it takes forever. Help an R newbie out: what's the fastest way to do this?

Edit: If this isn't clear, I want the output to have the following format:

rbind(
  cbind(students[1, ], teachers[1, ]),
  cbind(students[1, ], teachers[2, ]),
  ...,
  cbind(students[n, ], teachers[m, ]))  # n = nrow(students), m = nrow(teachers)

Upvotes: 5

Views: 1509

Answers (3)

Onyambu

Reputation: 79338

You can combine all the data as below:

do.call(cbind.data.frame, Map(expand.grid, teacher = teachers, students = students))

   name.teacher name.students height.teacher height.students smart.teacher smart.students
1           Ben          John            130             111           yes             no
2         Craig          John            101             111           yes             no
3         Mindy          John            105             111           yes             no
4           Ben          Mary            130              93           yes             no
5         Craig          Mary            101              93           yes             no
6         Mindy          Mary            105              93           yes             no
:            :            :                :               :            :              :
:            :            :                :               :            :              :
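To see why this works, it may help to split the one-liner into its two steps. The sketch below assumes the students and teachers data frames from the question; the names pieces and combined are only illustrative. Note that Map() pairs the columns by position, so this approach relies on both data frames having the same columns in the same order.

```r
students <- data.frame(name = c("John", "Mary", "Sue", "Mark", "Gordy", "Joey", "Marge", "Sheev", "Lisa"),
                   height = c(111, 93, 99, 107, 100, 123, 104, 80, 95),
                   smart = c("no", "no", "yes", "no", "yes", "yes", "no", "yes", "no"))
teachers <- data.frame(name = c("Ben", "Craig", "Mindy"),
                   height = c(130, 101, 105),
                   smart = c("yes", "yes", "yes"))

# Step 1: Map() walks both data frames column by column, so the first call is
# expand.grid(teacher = teachers$name, students = students$name), the second
# uses the height columns, and the third uses the smart columns.
pieces <- Map(expand.grid, teacher = teachers, students = students)

# Step 2: bind the per-column grids side by side. The list names
# (name, height, smart) become the prefixes of the final column names.
combined <- do.call(cbind.data.frame, pieces)

dim(combined)    # one row per teacher/student pair
names(combined)
```

Because every expand.grid() call enumerates the pairs in the same order, the rows of the separate grids line up when they are bound together.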

Upvotes: 4

S Rivero

Reputation: 708

You can use this function, which relies on merge() returning the Cartesian product of its inputs when by = NULL:

expand.grid.df <- function(...) Reduce(function(...) merge(..., by=NULL), list(...))

expand.grid.df(students,teachers)
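A quick check of what the call returns, as a minimal sketch on the question's data. Since both data frames share the column names name, height and smart, merge() disambiguates them with its default suffixes .x and .y; the second call passes the suffixes argument directly to merge() to get more readable names. The variable names pairs and pairs_named are only illustrative.

```r
students <- data.frame(name = c("John", "Mary", "Sue", "Mark", "Gordy", "Joey", "Marge", "Sheev", "Lisa"),
                   height = c(111, 93, 99, 107, 100, 123, 104, 80, 95),
                   smart = c("no", "no", "yes", "no", "yes", "yes", "no", "yes", "no"))
teachers <- data.frame(name = c("Ben", "Craig", "Mindy"),
                   height = c(130, 101, 105),
                   smart = c("yes", "yes", "yes"))

expand.grid.df <- function(...) Reduce(function(...) merge(..., by = NULL), list(...))

# Every student row paired with every teacher row.
pairs <- expand.grid.df(students, teachers)
dim(pairs)
names(pairs)   # shared column names get the default .x / .y suffixes

# Same cross join for two data frames, with explicit suffixes.
pairs_named <- merge(students, teachers, by = NULL,
                     suffixes = c(".student", ".teacher"))
names(pairs_named)
```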

Upvotes: 0

Frank

Reputation: 66819

"and keep the accompanying info"

I would recommend not doing this. There is no need to have everything in a single object.

To just combine the teachers and students, there's

res = expand.grid(teacher_name = teachers$name, student_name = students$name)

To merge in the other data (which I would recommend not doing until necessary):

res[, paste("teacher", c("height", "smart"), sep="_")] <- 
  teachers[match(res$teacher_name, teachers$name), c("height","smart")]

res[, paste("student", c("height", "smart"), sep="_")] <- 
  students[match(res$student_name, students$name), c("height","smart")]

This gives

head(res)

  teacher_name student_name teacher_height teacher_smart student_height student_smart
1          Ben         John            130           yes            111            no
2        Craig         John            101           yes            111            no
3        Mindy         John            105           yes            111            no
4          Ben         Mary            130           yes             93            no
5        Craig         Mary            101           yes             93            no
6        Mindy         Mary            105           yes             93            no
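One caveat: the match() lookups return the first match only, so they assume the names are unique within each data frame. If that cannot be guaranteed, a variant that crosses row indices instead of names may be safer. The following is a sketch of that idea on the question's data, not part of the approach above; idx, t_part, s_part and res_idx are illustrative names.

```r
students <- data.frame(name = c("John", "Mary", "Sue", "Mark", "Gordy", "Joey", "Marge", "Sheev", "Lisa"),
                   height = c(111, 93, 99, 107, 100, 123, 104, 80, 95),
                   smart = c("no", "no", "yes", "no", "yes", "yes", "no", "yes", "no"))
teachers <- data.frame(name = c("Ben", "Craig", "Mindy"),
                   height = c(130, 101, 105),
                   smart = c("yes", "yes", "yes"))

# Row indices of every teacher/student pairing. The first argument of
# expand.grid() varies fastest, so teachers cycle within each student.
idx <- expand.grid(t = seq_len(nrow(teachers)), s = seq_len(nrow(students)))

# Pull the full rows by index and prefix the column names so the two
# sets of columns stay distinguishable.
t_part <- setNames(teachers[idx$t, ], paste("teacher", names(teachers), sep = "_"))
s_part <- setNames(students[idx$s, ], paste("student", names(students), sep = "_"))

res_idx <- cbind(t_part, s_part)
rownames(res_idx) <- NULL   # drop the repeated-index row names
head(res_idx)
```

Keeping idx around also means the combined table can be rebuilt, or only partly materialised, whenever it is actually needed.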

Upvotes: 2
