Reputation: 77
I have a data set which looks like below:
Comp1 Product Comp2
A P1 B
A P2 B
A P3 B
C P4 D
C P2 D
X P1 Y
X P2 Y
X P3 Y
So basically, Comp1
and Comp2
are the companies and Product
is the product name that these companies have in common. I want the output to appear something like this:
Product Bundle Count
P1,P2,P3 2
P2,P4 1
I am new to R and would appreciate any help in this case.
Upvotes: 3
Views: 96
Reputation: 992
If you prefer using base r as I do, this is an idea:
dtb <- table(paste(dd[[1]],dd[[3]]),dd[[2]])
out <- sapply(1:nrow(dtb),function(x) paste(colnames(dtb)[dtb[x,] ==
1],collapse = ","))
table(out)
out
P1,P2,P3 P2,P4
2 1
Upvotes: 3
Reputation: 83215
A solution with data.table
:
library(data.table)
setDT(d)[order(Product), Prod.Bundle := toString(Product), by = Comp1
][, .(Count = uniqueN(Comp2)), by = Prod.Bundle]
Or another as provided by @Frank in the comments:
setDT(d)[order(Product), toString(Product), by = Comp1
][, .(Count = .N), by = .(Prod.Bundle = V1)]
which gives:
Prod.Bundle Count 1: P1, P2, P3 2 2: P2, P4 1
Used data:
d <- read.table(text="Comp1 Product Comp2
A P1 B
A P2 B
A P3 B
C P4 D
C P2 D
X P1 Y
X P2 Y
X P3 Y", header=TRUE, stringsAsFactors=FALSE)
Upvotes: 3
Reputation: 206187
Using dplyr
, you can summarize the data then count it. For example
library(dplyr)
dd %>% arrange(Comp1, Product) %>%
group_by(Comp1) %>%
summarize(bundle=paste(unique(Product), collapse=",")) %>%
count(bundle)
# bundle n
# <chr> <int>
# 1 P1,P2,P3 2
# 2 P2,P4 1
with the test data
dd <- read.table(text="Comp1 Product Comp2
A P1 B
A P2 B
A P3 B
C P4 D
C P2 D
X P1 Y
X P2 Y
X P3 Y", header=TRUE)
Upvotes: 4