AnilGoyal
AnilGoyal

Reputation: 26218

How to cross tabulate the summary values across same field

This may have solutions/answers available here, but I am unable to find.

Let us assume a simple data like this

x <- data.frame(id = rep(1:3, each = 2),
                v1 = c('A', 'B', 'A', 'B', 'A', 'C'))

> x
  id v1
1  1  A
2  1  B
3  2  A
4  2  B
5  3  A
6  3  C

Now I want an output of relation of V1 column with itself, but across group on id something like this

  v1        A     B     C
1 A         0     2     1
2 B         2     0     0
3 C         1     0     0

So, I proceeded like this..

library(tidyverse)

#merged the V1 column by itself with all = TRUE
x <- merge(x, x, by = "id", all = T)

# removed same group rows
x <- x[x$v1.x != x$v1.y, ]

# final code
x %>% select(-id) %>%
  group_by(v1.x, v1.y) %>%
  summarise(val = n()) %>% 
  pivot_wider(names_from = v1.y, values_from = val, values_fill = 0L, names.sort = T)

# A tibble: 3 x 4
# Groups:   v1.x [3]
  v1.x      A     B     C
  <chr> <int> <int> <int>
1 A         0     2     1
2 B         2     0     0
3 C         1     0     0

My question is that any better/direct method to obtain the cross-table?

Upvotes: 1

Views: 56

Answers (1)

Ben
Ben

Reputation: 30474

How about creating a contingency table with xtabs (which can work with large data sets as well). Then, you can use crossprod on the table and set the diagonal to zero for the final result.

ct <- xtabs(~ id + v1, data = x)
cp <- crossprod(ct, ct)
diag(cp) <- 0
cp

Instead of xtabs you can create a cross-table with simply table as well. As noted by @A5C1D2H2I1M1N2O1R2T1, you can simplify to a nice one-liner equivalent:

"diag<-"(crossprod(table(x)), 0) 

Output

   v1
v1  A B C
  A 0 2 1
  B 2 0 0
  C 1 0 0

Upvotes: 2

Related Questions