user4130588

Reputation: 51

R: reshaping dataframe into matrix of 1's and 0's

I'm trying to convert a data frame of this format:

  V1 V2
  1  a
  2  a
  3  b
  4  c
  5  c

into a matrix of this format:

  V1 a  b  c
  1  1  0  0
  2  1  0  0
  3  0  1  0
  4  0  0  1
  5  0  0  1

What is the best way to do this in R? I've tried to use reshape2, but couldn't figure out a way to do this.

Upvotes: 5

Views: 129

Answers (6)

Tyler Rinker

Reputation: 109864

Here's an approach using mtabulate from qdapTools:

library(qdapTools)
data.frame(dat[, 1, drop = FALSE], mtabulate(setNames(dat[[2]], dat[[1]])))

##   V1 a b c
## 1  1 1 0 0
## 2  2 1 0 0
## 3  3 0 1 0
## 4  4 0 0 1
## 5  5 0 0 1

Upvotes: 0

Neal Fultz

Reputation: 9687

Here's a code-golf answer:

model.matrix(~.-1,df)
##   V1 V2a V2b V2c
## 1  1   1   0   0
## 2  2   1   0   0
## 3  3   0   1   0
## 4  4   0   0   1
## 5  5   0   0   1

Upvotes: 2

Veerendra Gadekar

Reputation: 4472

Another alternative:

library(tidyr)
out = cbind(dat[1], 
      apply(spread(dat, V2, V2)[-1], 2, 
            function(x) ifelse(is.na(x), 0, 1)))

#  V1 a b c
#1  1 1 0 0
#2  2 1 0 0
#3  3 0 1 0
#4  4 0 0 1
#5  5 0 0 1

A simpler version, as @SamFirke suggested:

library(dplyr)
library(tidyr)
dat %>% mutate(x = 1) %>% spread(V2, x, fill = 0)

#  V1 a b c
#1  1 1 0 0
#2  2 1 0 0
#3  3 0 1 0
#4  4 0 0 1
#5  5 0 0 1

Upvotes: 2

Frank

Reputation: 66819

I'm not familiar with the special functions for this, but I might do...

uv <- unique(DF$V2)
m  <- matrix(0L,nrow(DF),length(uv),dimnames=list(DF$V1,uv))
m[ cbind(1:nrow(m), match(DF$V2,uv)) ] <- 1L

This gives a matrix of zeros and ones, unlike the other answers so far. (A small difference, of course.)

  a b c
1 1 0 0
2 1 0 0
3 0 1 0
4 0 0 1
5 0 0 1
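
For anyone who wants to run this end to end, here is a self-contained sketch of the same matrix-indexing idea. The DF definition is an assumption reconstructed from the sample in the question (integer V1, character V2), not something the asker posted.

```r
# Sample data reconstructed from the question (column types are assumed)
DF <- data.frame(V1 = 1:5, V2 = c("a", "a", "b", "c", "c"),
                 stringsAsFactors = FALSE)

uv <- unique(DF$V2)                        # distinct values: "a" "b" "c"
m  <- matrix(0L, nrow(DF), length(uv),
             dimnames = list(DF$V1, uv))   # all-zero integer matrix

# Each row of the two-column index matrix is a (row, column) pair;
# those cells are set to 1 in a single assignment.
m[cbind(seq_len(nrow(m)), match(DF$V2, uv))] <- 1L

is.matrix(m)   # TRUE
```

seq_len(nrow(m)) is used instead of 1:nrow(m) only so that an empty data frame does not produce the index c(1, 0).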

Upvotes: 2

SabDeM

Reputation: 7190

Maybe this is a shortcut, but isn't it the same as this?

library(reshape2)
dcast(dat, V1 ~ V2, length)
Using V2 as value column: use value.var to override.
  V1 a b c
1  1 1 0 0
2  2 1 0 0
3  3 0 1 0
4  4 0 0 1
5  5 0 0 1

Upvotes: 4

A5C1D2H2I1M1N2O1R2T1

Reputation: 193517

table should be sufficient for this:

with(mydf, cbind(V1, table(1:nrow(mydf), V2)))
##   V1 a b c
## 1  1 1 0 0
## 2  2 1 0 0
## 3  3 0 1 0
## 4  4 0 0 1
## 5  5 0 0 1

Alternatively, you can look at model.matrix:

cbind(mydf["V1"], model.matrix(~V2 + 0, mydf))
##   V1 V2a V2b V2c
## 1  1   1   0   0
## 2  2   1   0   0
## 3  3   0   1   0
## 4  4   0   0   1
## 5  5   0   0   1
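
If the V2 prefix on the column names is unwanted, one possible follow-up is to strip it with sub() before binding. This is only a sketch: the mydf definition below is assumed from the sample in the question, and mm and out are illustrative names.

```r
# Sample data reconstructed from the question (assumed, not from the asker)
mydf <- data.frame(V1 = 1:5, V2 = c("a", "a", "b", "c", "c"))

mm <- model.matrix(~ V2 + 0, mydf)            # columns V2a, V2b, V2c
colnames(mm) <- sub("^V2", "", colnames(mm))  # drop the variable-name prefix
out <- cbind(mydf["V1"], mm)                  # data frame with V1, a, b, c

names(out)
```

Because mydf["V1"] is a data frame, cbind() returns a data frame here; using mydf$V1 instead would return a plain numeric matrix.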

Upvotes: 4
