Reputation: 1039
I have a data frame like the following:
dat <- data.frame(A=c("D", "A", "D", "B"), B=c("B", "B", "D", "R"), C=c("A", "D", "C", ""), D=c("D", "C", "A", "A"))
My idea is to build a matrix from this information that counts how many times each column variable refers to each of the columns, ignoring values that are not column names (e.g. "R"). So I want to fill the following matrix:
n <- ncol(dat)
names_d <- colnames(dat)
mat <- matrix(0, nrow=n, ncol=n)
rownames(mat) <- names_d
colnames(mat) <- names_d
So in the end, I would have something like this:
  A B C D
A 1 1 0 2
B 0 2 0 1
C 1 0 1 1
D 2 0 1 1
Which would be the most efficient way of doing this in R?
Upvotes: 1
Views: 200
Reputation: 887118
Another option is stack() with table():
table(subset(stack(dat), nzchar(values) & values != 'R'))
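As a hedged variation on the same idea: the call above returns the referenced values in rows and the source columns in columns, which is the transpose of the layout shown in the question, and it names "R" explicitly. The sketch below assumes character columns (the default in R >= 4.0, forced here with stringsAsFactors = FALSE) and keeps only values that are column names. The names `long` and `res` are made up for illustration.

```r
dat <- data.frame(A = c("D", "A", "D", "B"),
                  B = c("B", "B", "D", "R"),
                  C = c("A", "D", "C", ""),
                  D = c("D", "C", "A", "A"),
                  stringsAsFactors = FALSE)

# Long format: one row per cell, with columns `values` and `ind`
long <- stack(dat)

# Keep only cells that refer to an existing column; drops "" and "R" alike
long <- long[long$values %in% names(dat), ]

# Rows = source column, columns = referenced column name.
# Fixing the factor levels keeps a column even if a name is never referenced.
res <- table(long$ind, factor(long$values, levels = names(dat)))
res
```

Filtering with %in% names(dat) avoids listing every unwanted value by hand, at the cost of one extra subsetting step.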
Upvotes: 1
Reputation: 101373
You can try the code below
> t(sapply(dat, function(x) table(factor(x, levels = names(dat)))))
  A B C D
A 1 1 0 2
B 0 2 0 1
C 1 0 1 1
D 2 0 1 1
or
> t(xtabs(~., subset(stack(dat), values %in% names(dat))))
   values
ind A B C D
  A 1 1 0 2
  B 0 2 0 1
  C 1 0 1 1
  D 2 0 1 1
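To make the factor-levels trick in the first snippet more explicit, here is a minimal sketch that fills the pre-allocated `mat` from the question one row at a time. It is only an illustration of the same counting step, not a faster alternative; the loop variable `col` is made up for the example.

```r
dat <- data.frame(A = c("D", "A", "D", "B"),
                  B = c("B", "B", "D", "R"),
                  C = c("A", "D", "C", ""),
                  D = c("D", "C", "A", "A"),
                  stringsAsFactors = FALSE)

names_d <- colnames(dat)
mat <- matrix(0, nrow = length(names_d), ncol = length(names_d),
              dimnames = list(names_d, names_d))

for (col in names_d) {
  # factor() with fixed levels turns anything that is not a column name
  # ("" or "R" here) into NA, and table() drops NA by default
  mat[col, ] <- table(factor(dat[[col]], levels = names_d))
}
mat
```

Because the levels are fixed to names(dat), every row has one count per column name, including zeros, so the rows line up without any extra reshaping.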
Upvotes: 3