Yulia Kentieva
Yulia Kentieva

Reputation: 720

How to "unmelt" a data.frame to matrix?

I have a data.frame

sources <- c("a", "b", "c", "a", "a", "b")
targets <- c("f", "v", "w", "a", "b", "c")
values <- c(12, 34, 56, 3, 76, 35)
df <- data.frame(sources, targets, values)

  sources targets values
1       a       f     12
2       b       v     34
3       c       w     56
4       a       a      3
5       a       b     76
6       b       c     35

How to reshape it to get a matrix with sources as row names, targets as colnames and corresponding values in intersecting cells? In this case, I should have a 3 by 6 matrix. something like this:

    f  v  w  a  b  c
a  12  0  0  3  76 0
b  0  34  0  0  0 35 
c  0  0   56 0  0  0

Upvotes: 2

Views: 358

Answers (4)

merv
merv

Reputation: 76950

Sparse Matrices

If the resulting matrix is expected to be sparse (and large), then you may want to directly convert to a sparse matrix data structure. This is helped by converting the source and target entries to factors.

library(magrittr)
library(Matrix)

df['sources'] <- factor(df$sources)
df['targets'] <- factor(df$targets)

df %$% 
  sparseMatrix(i=as.integer(sources),
               j=as.integer(targets),
               x=values,
               dimnames=list(sources=levels(sources),
                             targets=levels(targets)))

which results in

3 x 6 sparse Matrix of class "dgCMatrix"
       targets
sources a  b  c  f  v  w
      a 3 76  . 12  .  .
      b .  . 35  . 34  .
      c .  .  .  .  . 56

Row and column orders can be controlled by specifying levels in the factor conversion. Please note the use of magrittr's exposition pipe (%$%).

Upvotes: 0

Darren Tsai
Darren Tsai

Reputation: 35594

Tou could use xtabs():

xtabs(values ~ sources + targets, df)

#   a  b  c  f  v  w
# a 3 76  0 12  0  0
# b 0  0 35  0 34  0
# c 0  0  0  0  0 56

If you want the output to be a data.frame, just pass it into as.data.frame.matrix().

Upvotes: 3

Zheyuan Li
Zheyuan Li

Reputation: 73385

We can use matrix indexing.

sources <- c("a", "b", "c", "a", "a", "b")
targets <- c("f", "v", "w", "a", "b", "c")
values <- as.numeric(c(12, 34, 56, 3, 76, 35))

rn <- unique(sources)
cn <- unique(targets)
mat <- matrix(0, length(rn), length(cn), dimnames = list(rn, cn))
mat[cbind(sources, targets)] <- values
mat
#   f  v  w a  b  c
#a 12  0  0 3 76  0
#b  0 34  0 0  0 35
#c  0  0 56 0  0  0

Another approach similar to xtabs, is

tapply(values, list(sources, targets), FUN = sum, default = 0)
#  a  b  c  f  v  w
#a 3 76  0 12  0  0
#b 0  0 35  0 34  0
#c 0  0  0  0  0 56

I think xtabs and tapply are concise. But I don't like the factor() coercion that happens under the hood.

Upvotes: 2

Kra.P
Kra.P

Reputation: 15143

You may try

library(tidyverse)

df <- data.frame(sources, targets, values)

df %>%
  pivot_wider(names_from = targets, values_from = values) %>%
  replace(is.na(.), 0) %>%
  column_to_rownames(var = "sources")


   f  v  w a  b  c
a 12  0  0 3 76  0
b  0 34  0 0  0 35
c  0  0 56 0  0  0

Upvotes: 1

Related Questions