Transpose rows to columns with multiple categories dplyr

Question

I would like to use tidyr's spread function to convert a data frame with multiple IDs in the rows and several category columns into a data frame with a single row, containing one indicator column for every combination of ID and category. If dplyr and tidyr are not the most appropriate tools for this, I'm open to other spread-like functions.

In the script below, I'm only able to specify one column as the value argument. I would like both cat1 and cat2 to be used as value columns. I would also like the column names to be "sentid1_cat1", "sentid1_cat2", etc.

library(dplyr)
library(tidyr)

test.df <- data.frame(sentid = 1:3,
                      cat1 = c(1,0,0),
                      cat2 = c(0,1,0))

# Only one value column (cat1) can be passed here
test.df %>%
    spread(key = sentid, value = cat1, sep = '_')

EDIT

Desired output:

output.df <- data.frame(sentid1_cat1 = 1,
                        sentid1_cat2 = 0,
                        sentid2_cat1 = 0,
                        sentid2_cat2 = 1,
                        sentid3_cat1 = 0,
                        sentid3_cat2 = 0)

acylam · Accepted Answer

A solution with dplyr + tidyr:

library(dplyr)
library(tidyr)

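test.df %>%
  # Reshape to long format: one row per sentid/category pair
  gather(variable, value, -sentid) %>%
  # Combine sentid and category name, e.g. "1_cat1"
  unite(variable, sentid, variable) %>%
  # Prefix with "sentid" to get names like "sentid1_cat1"
  mutate(variable = paste0("sentid", variable)) %>%
  # Spread back to a single wide row
  spread(variable, value)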

Result:

  sentid1_cat1 sentid1_cat2 sentid2_cat1 sentid2_cat2 sentid3_cat1 sentid3_cat2
1            1            0            0            1            0            0
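
Since tidyr 1.0.0, gather() and spread() are superseded by pivot_longer() and pivot_wider(). The same reshaping can probably be written with those functions as well. The following is only a sketch, assuming tidyr >= 1.0.0 is installed; wide.df and the intermediate column names "variable" and "value" are my own choices for illustration, not part of the question.

```r
library(dplyr)
library(tidyr)

test.df <- data.frame(sentid = 1:3,
                      cat1 = c(1, 0, 0),
                      cat2 = c(0, 1, 0))

wide.df <- test.df %>%
  # Long format: one row per sentid/category pair
  pivot_longer(-sentid, names_to = "variable", values_to = "value") %>%
  # Build the final column names, e.g. "sentid1_cat1"
  mutate(variable = paste0("sentid", sentid, "_", variable)) %>%
  # Drop sentid so that everything collapses into a single row
  select(-sentid) %>%
  # Wide format: one column per sentid/category pair
  pivot_wider(names_from = variable, values_from = value)

wide.df
```

Note that pivot_wider() returns a tibble rather than a plain data frame; wrap the result in as.data.frame() if that matters for your use case.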
