deca
deca

Reputation: 740

Expand grid of all possible combinations within groups

I am facing the following problem. I have a list of M&A transactions, each transactions includes data on (1) acquirer, (2) vendor, (3) target. The data is structured in a way that the relationship can be n:n:n and looks similar to the following:

dealid acquirer target vendor
1      FirmA    FirmB  FirmC
1      FirmD           FirmE
2      .....................

So the problem is that the rows within a deal have no meaning per se, so e.g., FirmD is also a co-acquirer for FirmB.

I now need to create all possible acquirer-target-vendor combinations within each dealid. I have managed to expand grids with the expand.grid function or simply via merge. However, I do not know how to expand a grid of all possible combinations within groups.

Upvotes: 4

Views: 3678

Answers (4)

h3rm4n
h3rm4n

Reputation: 4187

With split as @Sotos mentioned in the comments:

l1 <- split(df1, df1$dealid)
l2 <- lapply(l1, function(x) unique(with(x, expand.grid(acquirer, na.omit(target), vendor))))
df2 <- cbind.data.frame(dealid = rep(names(l2), sapply(l2, nrow)), do.call(rbind, l2))

This results in:

> df2
    dealid  Var1  Var2  Var3
1.1      1 FirmA FirmB FirmC
1.2      1 FirmD FirmB FirmC
1.3      1 FirmA FirmB FirmE
1.4      1 FirmD FirmB FirmE
2.1      2 FirmA FirmF FirmC
2.2      2 FirmD FirmF FirmC
2.3      2 FirmG FirmF FirmC
2.4      2 FirmA FirmF FirmE
2.5      2 FirmD FirmF FirmE
2.6      2 FirmG FirmF FirmE

Upvotes: 0

Parfait
Parfait

Reputation: 107567

Consider base R's by, the function that slices a dataframe by factor grouping (dealid) allowing extended iterative operations such as expand.grid to return a list of dataframes. Below uses same data sample as @PLapointe and @akrun:

dfList <- by(df, df$dealid, function(i){
  tmp <- cbind(dealid=max(i$dealid),
               expand.grid(acquirer=i$acquirer, target=i$target, vendor=i$vendor))
  tmp[!is.na(tmp$target),]
})

newdf <- unique(do.call(rbind, dfList))
row.names(newdf) <- NULL

newdf
#     dealid acquirer target vendor
# 1        1    FirmA  FirmB  FirmC
# 2        1    FirmD  FirmB  FirmC
# 3        1    FirmA  FirmB  FirmE
# 4        1    FirmD  FirmB  FirmE
# 5        2    FirmA  FirmF  FirmC
# 6        2    FirmD  FirmF  FirmC
# 7        2    FirmG  FirmF  FirmC
# 8        2    FirmA  FirmF  FirmE
# 9        2    FirmD  FirmF  FirmE
# 10       2    FirmG  FirmF  FirmE

Upvotes: 1

akrun
akrun

Reputation: 886948

We can use data.table

library(data.table)
setDT(df1)[, CJ(acquirer = acquirer, target = target, vendor = vendor,
         unique = TRUE), dealid][!is.na(target)]
#    dealid acquirer target vendor
#1:      1    FirmA  FirmB  FirmC
#2:      1    FirmA  FirmB  FirmE
#3:      1    FirmD  FirmB  FirmC
#4:      1    FirmD  FirmB  FirmE
#5:      2    FirmA  FirmF  FirmC
#6:      2    FirmA  FirmF  FirmE
#7:      2    FirmD  FirmF  FirmC
#8:      2    FirmD  FirmF  FirmE
#9:      2    FirmG  FirmF  FirmC
#10:     2    FirmG  FirmF  FirmE

data

 df1 <- structure(list(dealid = c(1L, 1L, 2L, 2L, 2L), acquirer = c("FirmA", 
"FirmD", "FirmA", "FirmD", "FirmG"), target = c("FirmB", NA, 
NA, NA, "FirmF"), vendor = c("FirmC", "FirmE", "FirmC", "FirmE", 
"FirmE")), .Names = c("dealid", "acquirer", "target", "vendor"
), class = "data.frame", row.names = c(NA, -5L))

Upvotes: 3

Pierre Lapointe
Pierre Lapointe

Reputation: 16277

You could do that with dplyr and expand from tidyr.

df <- read.table(text="dealid acquirer target vendor
1      FirmA    FirmB  FirmC
1      FirmD    NA     FirmE
2      FirmA    NA     FirmC
2      FirmD    NA     FirmE
2      FirmG    FirmF  FirmE",header=TRUE,stringsAsFactors=FALSE)

library(dplyr);library(tidyr)
df%>%
  group_by(dealid)%>%
  expand(acquirer, target, vendor)

   dealid acquirer target vendor
    <int>    <chr>  <chr>  <chr>
 1      1    FirmA  FirmB  FirmC
 2      1    FirmA  FirmB  FirmE
 3      1    FirmD  FirmB  FirmC
 4      1    FirmD  FirmB  FirmE
 5      2    FirmA  FirmF  FirmC
 6      2    FirmA  FirmF  FirmE
 7      2    FirmD  FirmF  FirmC
 8      2    FirmD  FirmF  FirmE
 9      2    FirmG  FirmF  FirmC
10      2    FirmG  FirmF  FirmE

Upvotes: 7

Related Questions