Alienfluid

Reputation: 326

Converting string column to dummy variables with duplicate keys

I am trying to convert this data frame:

> df.orig <- data.frame(id = c('foo', 'bar', 'foo'), action = c('abc','def','ghi'))
> df.orig
   id action
1 foo    abc
2 bar    def
3 foo    ghi

Into:

> df.new <- data.frame(id = c('foo', 'bar'), action_abc = c(1,0), action_def = c(0,1), action_ghi = c(1,0))
> df.new
   id action_abc action_def action_ghi
1 foo          1          0          1
2 bar          0          1          0

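sparse.model.matrix and dcast do not seem to handle duplicate keys (here 'foo') well. For example, sparse.model.matrix keeps one row per input row instead of collapsing to one row per id: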

> sparse.model.matrix(id ~ action - 1, df.orig)
3 x 3 sparse Matrix of class "dgCMatrix"
  actionabc actiondef actionghi
1         1         .         .
2         .         1         .
3         .         .         1

Upvotes: 1

Views: 44

Answers (1)

BENY

Reputation: 323306

You can use table, which cross-tabulates id against action and so collapses duplicate ids into a single row:

  df <- data.frame(id = c('foo', 'bar', 'foo'), action = c('abc', 'def', 'ghi'), stringsAsFactors = FALSE)

  table(df$id,df$action)

      abc def ghi
  bar   0   1   0
  foo   1   0   1
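
If you need an actual data frame shaped like df.new in the question (an id column plus action_-prefixed indicator columns), the table can be converted along these lines. This is only a sketch using base R: the intermediate names tab and wide are made up for illustration, and the final clamping step assumes you want 0/1 indicators rather than counts when an (id, action) pair occurs more than once.

```r
# Sample data from the question
df <- data.frame(id = c("foo", "bar", "foo"),
                 action = c("abc", "def", "ghi"),
                 stringsAsFactors = FALSE)

# Cross-tabulate id against action; each cell counts occurrences
tab <- table(df$id, df$action)

# Convert the 2-D table to a wide data frame (ids become row names)
wide <- as.data.frame.matrix(tab)

# Prefix the column names to match the layout asked for in the question
names(wide) <- paste0("action_", names(wide))

# Move the ids from row names into a regular column
df.new <- data.frame(id = rownames(wide), wide,
                     row.names = NULL, stringsAsFactors = FALSE)

# Assumption: clamp counts to 0/1 in case an (id, action) pair repeats
df.new[-1] <- lapply(df.new[-1], function(x) as.integer(x > 0))

df.new
```

Note that table sorts its levels, so the rows come out ordered by id (bar before foo) rather than in order of first appearance.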

Upvotes: 2
