matsuo_basho

Reputation: 3020

Transpose rows to columns with multiple categories dplyr

I would like to use tidyr's spread function to convert a data frame with multiple ids in the rows and several category columns into a data frame with a single row, containing one indicator column for every combination of id and category. If dplyr and tidyr are not the most appropriate tools for this, I'm open to other spread-like functions.

In the script below, I can only specify one column as the value. I would like both cat1 and cat2 to be value columns. I would also like the column names to be "sentid1_cat1", "sentid1_cat2", etc.

library(dplyr)
library(tidyr)

test.df <- data.frame(sentid = 1:3,
                      cat1 = c(1,0,0),
                      cat2 = c(0,1,0))

test.df %>%
    spread(key = sentid, value = cat1, sep = '_')

EDIT

Desired output:

output.df <- data.frame(sentid1_cat1 = 1,
                        sentid1_cat2 = 0,
                        sentid2_cat1 = 0,
                        sentid2_cat2 = 1,
                        sentid3_cat1 = 0,
                        sentid3_cat2 = 0)

Upvotes: 1

Views: 2042

Answers (1)

acylam

Reputation: 18681

A solution with dplyr + tidyr:

library(dplyr)
library(tidyr)

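test.df %>%
  # stack cat1 and cat2 into long format: one row per (sentid, category)
  gather(variable, value, -sentid) %>%
  # combine id and category into a single key, e.g. "1_cat1"
  unite(variable, sentid, variable) %>%
  # prefix the key so it reads "sentid1_cat1"
  mutate(variable = paste0("sentid", variable)) %>%
  # spread the keys back out into columns, giving a single row
  spread(variable, value)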

Result:

  sentid1_cat1 sentid1_cat2 sentid2_cat1 sentid2_cat2 sentid3_cat1 sentid3_cat2
1            1            0            0            1            0            0
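
With a newer tidyr, the same reshaping can likely be done in a single call to pivot_wider, which supersedes spread. The following is a minimal sketch rather than part of the solution above. It assumes tidyr 1.2.0 or later (needed for the names_vary argument), and the name wide.df is only illustrative. names_glue builds the "sentid1_cat1" style names, and names_vary = "slowest" keeps the columns for each sentid next to each other.

```r
library(tidyr)

test.df <- data.frame(sentid = 1:3,
                      cat1 = c(1, 0, 0),
                      cat2 = c(0, 1, 0))

# No id columns remain once sentid, cat1 and cat2 are used,
# so the result collapses to a single row.
wide.df <- pivot_wider(test.df,
                       names_from = sentid,
                       values_from = c(cat1, cat2),
                       # assumed naming template: "sentid<id>_<category>"
                       names_glue = "sentid{sentid}_{.value}",
                       # group columns by sentid rather than by category
                       names_vary = "slowest")

print(wide.df)
```

Note that pivot_wider returns a tibble; wrap the result in as.data.frame() if a plain data frame is needed.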

Upvotes: 3
