Reputation: 720
I have a data.frame
sources <- c("a", "b", "c", "a", "a", "b")
targets <- c("f", "v", "w", "a", "b", "c")
values <- c(12, 34, 56, 3, 76, 35)
df <- data.frame(sources, targets, values)
sources targets values
1 a f 12
2 b v 34
3 c w 56
4 a a 3
5 a b 76
6 b c 35
How to reshape it to get a matrix with sources as row names, targets as colnames and corresponding values in intersecting cells? In this case, I should have a 3 by 6 matrix. something like this:
f v w a b c
a 12 0 0 3 76 0
b 0 34 0 0 0 35
c 0 0 56 0 0 0
Upvotes: 2
Views: 358
Reputation: 76950
If the resulting matrix is expected to be sparse (and large), then you may want to directly convert to a sparse matrix data structure. This is helped by converting the source
and target
entries to factors.
library(magrittr)
library(Matrix)
df['sources'] <- factor(df$sources)
df['targets'] <- factor(df$targets)
df %$%
sparseMatrix(i=as.integer(sources),
j=as.integer(targets),
x=values,
dimnames=list(sources=levels(sources),
targets=levels(targets)))
which results in
3 x 6 sparse Matrix of class "dgCMatrix"
targets
sources a b c f v w
a 3 76 . 12 . .
b . . 35 . 34 .
c . . . . . 56
Row and column orders can be controlled by specifying levels
in the factor conversion. Please note the use of magrittr
's exposition pipe (%$%
).
Upvotes: 0
Reputation: 35594
Tou could use xtabs()
:
xtabs(values ~ sources + targets, df)
# a b c f v w
# a 3 76 0 12 0 0
# b 0 0 35 0 34 0
# c 0 0 0 0 0 56
If you want the output to be a data.frame
, just pass it into as.data.frame.matrix()
.
Upvotes: 3
Reputation: 73385
We can use matrix indexing.
sources <- c("a", "b", "c", "a", "a", "b")
targets <- c("f", "v", "w", "a", "b", "c")
values <- as.numeric(c(12, 34, 56, 3, 76, 35))
rn <- unique(sources)
cn <- unique(targets)
mat <- matrix(0, length(rn), length(cn), dimnames = list(rn, cn))
mat[cbind(sources, targets)] <- values
mat
# f v w a b c
#a 12 0 0 3 76 0
#b 0 34 0 0 0 35
#c 0 0 56 0 0 0
Another approach similar to xtabs
, is
tapply(values, list(sources, targets), FUN = sum, default = 0)
# a b c f v w
#a 3 76 0 12 0 0
#b 0 0 35 0 34 0
#c 0 0 0 0 0 56
I think xtabs
and tapply
are concise. But I don't like the factor()
coercion that happens under the hood.
Upvotes: 2
Reputation: 15143
You may try
library(tidyverse)
df <- data.frame(sources, targets, values)
df %>%
pivot_wider(names_from = targets, values_from = values) %>%
replace(is.na(.), 0) %>%
column_to_rownames(var = "sources")
f v w a b c
a 12 0 0 3 76 0
b 0 34 0 0 0 35
c 0 0 56 0 0 0
Upvotes: 1