Reputation: 23
I have about 200,000 observations ("sources"), each of which can follow any of 8 different "targets".
A cell is 1 if the source follows that target and 0 otherwise, as in the simplified example below:
source | target1 | target2 | target3 |
---|---|---|---|
source1 | 1 | 0 | 1 |
source2 | 0 | 1 | 1 |
source3 | 1 | 1 | 1 |
Now I want to know which sources follow more than one target and, consequently, how many sources follow each pair of targets. In other words, each cell should count the sources for which both conditions hold, i.e. the source follows both the row target and the column target.
The desired output would look like this:
(blank) | target1 | target2 | target3 |
---|---|---|---|
target1 | 2 | 1 | 2 |
target2 | 1 | 2 | 2 |
target3 | 2 | 2 | 3 |
Upvotes: 0
Views: 43
Reputation: 21937
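library(dplyr)
# Toy data from the question: one row per source, one 0/1 column per target
dat <- tibble::tribble(
  ~source, ~target1, ~target2, ~target3,
  "source1", 1, 0, 1,
  "source2", 0, 1, 1,
  "source3", 1, 1, 1)
# Drop the id column and convert the 0/1 columns to a numeric matrix
mat <- dat %>% select(-source) %>% as.matrix()
# t(mat) %*% mat: cell [i, j] counts the sources that follow both target i and target j
crossprod(mat)
#>         target1 target2 target3
#> target1       2       1       2
#> target2       1       2       2
#> target3       2       2       3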
Created on 2022-11-27 by the reprex package (v2.0.1)
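To see why this works, and to cover the "which sources follow more than one target" part of the question, here is a minimal base R sketch on the same toy data. The names `co`, `n_targets` and `multi` are just illustrative, not part of the original answer. Because every entry is 0 or 1, the product of two columns is 1 only where both are 1, so summing those products over the rows counts the sources that follow both targets.

```r
# Same toy data as above, built in base R (no dplyr needed for this step)
mat <- matrix(
  c(1, 0, 1,
    0, 1, 1,
    1, 1, 1),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("source1", "source2", "source3"),
                  c("target1", "target2", "target3"))
)

# crossprod(mat) computes t(mat) %*% mat without forming the transpose explicitly
co <- crossprod(mat)
all.equal(co, t(mat) %*% mat)

# The diagonal holds the number of followers per target (same as colSums(mat))
diag(co)

# Number of targets each source follows
n_targets <- rowSums(mat)

# Sources that follow more than one target
multi <- names(n_targets)[n_targets > 1]
multi
```

With 200,000 rows and 8 columns the result is still only an 8 x 8 matrix, so this should stay cheap, assuming the data fit in memory as a numeric matrix.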
Upvotes: 0