Reputation: 3020
I would like to use tidyr's spread function to convert a data frame with multiple ids in the rows and several columns into a data frame with one row, with an indicator column for every combination of id and category. If dplyr and tidyr are not the most appropriate tools for this, I'm open to other spread-like functions.
In the script below, I can only specify one column as the value. I would like to use both cat1 and cat2 as value columns. I would also like the column names to be "sentid1_cat1", "sentid1_cat2", etc.
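library(dplyr)
library(tidyr)
test.df <- data.frame(sentid = 1:3,
                      cat1 = c(1, 0, 0),
                      cat2 = c(0, 1, 0))
# spread() accepts only one value column, so cat2 is not spread here
test.df %>%
  spread(key = sentid, value = cat1, sep = '_')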
EDIT
Desired output:
output.df <- data.frame(sentid1_cat1 = 1,
                        sentid1_cat2 = 0,
                        sentid2_cat1 = 0,
                        sentid2_cat2 = 1,
                        sentid3_cat1 = 0,
                        sentid3_cat2 = 0)
Upvotes: 1
Views: 2042
Reputation: 18681
A solution with dplyr + tidyr:
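library(dplyr)
library(tidyr)
test.df %>%
  # long format: one row per (sentid, category) pair
  gather(variable, value, -sentid) %>%
  # combine id and category into a single key, e.g. "1_cat1"
  unite(variable, sentid, variable) %>%
  # prefix the key so it reads "sentid1_cat1"
  mutate(variable = paste0("sentid", variable)) %>%
  # back to wide format: one column per key, a single row
  spread(variable, value)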
Result:
  sentid1_cat1 sentid1_cat2 sentid2_cat1 sentid2_cat2 sentid3_cat1 sentid3_cat2
1            1            0            0            1            0            0
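For newer versions of tidyr (1.0.0 and later), gather and spread are superseded by pivot_longer and pivot_wider. The following is a minimal sketch of the same reshaping with those functions, assuming such a version is installed. The name wide.df is only illustrative and does not come from the question.

```r
library(dplyr)
library(tidyr)

test.df <- data.frame(sentid = 1:3,
                      cat1 = c(1, 0, 0),
                      cat2 = c(0, 1, 0))

wide.df <- test.df %>%
  # long format: one row per (sentid, category) pair
  pivot_longer(cols = -sentid, names_to = "variable", values_to = "value") %>%
  # wide format: names are built as <prefix><sentid>_<variable>
  pivot_wider(names_from = c(sentid, variable),
              values_from = value,
              names_prefix = "sentid")

wide.df
```

Because names_from takes several columns and joins them with "_" by default, the separate unite and mutate steps are not needed here. The result is a one-row tibble rather than a plain data frame; wrap it in as.data.frame() if that matters.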
Upvotes: 3