Max
Max

Reputation: 725

Unstacking Data Frame By Columns in R

I would like to unstack a data frame in R according to two columns, i.e. starting with

id   segment  count  freq
1    a        x1a    f1a
1    b        x1b    f1b
1    c        x1c    f1c
2    a        x2a    f2a
2    b        x2b    f2b
2    c        x2c    f2c

I want to get:

id   count_a  count_b count_c freq_a freq_b freq_c
1    x1a      x1b     x1c     f1a    f1b    f1c
2    x2a      x2b     x2c     f2a    f2b    f2c

Basically, this amounts to unstacking the data frame by the first two columns id and segment. However, I don't see how to get this to work using the unstack() function in R. I can do this using a very naive approach (nested for loop, concatenating column names etc and then binding), but there must be a more straightforward and efficient way.

Upvotes: 4

Views: 1688

Answers (1)

akrun
akrun

Reputation: 887118

We can use pivot_wider

library(dplyr)
library(tidyr)
df1 %>%       
   pivot_wider(names_from = c(segment), values_from = c(count, freq))
# A tibble: 2 x 7
#     id count_a count_b count_c freq_a freq_b freq_c
#  <int> <chr>   <chr>   <chr>   <chr>  <chr>  <chr> 
#1     1 x1a     x1b     x1c     f1a    f1b    f1c   
#2     2 x2a     x2b     x2c     f2a    f2b    f2c   

Or with dcast

library(data.table)
dcast(setDT(df1), id ~ segment, value.var = c('count', 'freq'))
#   id count_a count_b count_c freq_a freq_b freq_c
#1:  1     x1a     x1b     x1c    f1a    f1b    f1c
#2:  2     x2a     x2b     x2c    f2a    f2b    f2c

Update

If there are duplicates, then create a sequence column

df1 %>%
   mutate(rn = rowid(segment)) %>%
    pivot_wider(names_from = c(segment), values_from = c(count, freq)) %>%
   select(-rn)

Or with data.table

dcast(setDT(df1), id + rowid(segment) ~ segment, 
       alue.var = c('count', 'freq'))[, segment := NULL][]

data

df1 <- structure(list(id = c(1L, 1L, 1L, 2L, 2L, 2L), segment = c("a", 
"b", "c", "a", "b", "c"), count = c("x1a", "x1b", "x1c", "x2a", 
"x2b", "x2c"), freq = c("f1a", "f1b", "f1c", "f2a", "f2b", "f2c"
)), class = "data.frame", row.names = c(NA, -6L))

Upvotes: 2

Related Questions