Reputation: 725
I would like to unstack a data frame in R according to two columns, i.e. starting with
id segment count freq
1 a x1a f1a
1 b x1b f1b
1 c x1c f1c
2 a x2a f2a
2 b x2b f2b
2 c x2c f2c
I want to get:
id count_a count_b count_c freq_a freq_b freq_c
1 x1a x1b x1c f1a f1b f1c
2 x2a x2b x2c f2a f2b f2c
Basically, this amounts to unstacking the data frame by the first two columns id and segment. However, I don't see how to get this to work using the unstack() function in R. I can do this using a very naive approach (nested for loop, concatenating column names etc and then binding), but there must be a more straightforward and efficient way.
Upvotes: 4
Views: 1688
Reputation: 887118
We can use pivot_wider
library(dplyr)
library(tidyr)
df1 %>%
pivot_wider(names_from = c(segment), values_from = c(count, freq))
# A tibble: 2 x 7
# id count_a count_b count_c freq_a freq_b freq_c
# <int> <chr> <chr> <chr> <chr> <chr> <chr>
#1 1 x1a x1b x1c f1a f1b f1c
#2 2 x2a x2b x2c f2a f2b f2c
Or with dcast
library(data.table)
dcast(setDT(df1), id ~ segment, value.var = c('count', 'freq'))
# id count_a count_b count_c freq_a freq_b freq_c
#1: 1 x1a x1b x1c f1a f1b f1c
#2: 2 x2a x2b x2c f2a f2b f2c
If there are duplicates, then create a sequence column
df1 %>%
mutate(rn = rowid(segment)) %>%
pivot_wider(names_from = c(segment), values_from = c(count, freq)) %>%
select(-rn)
Or with data.table
dcast(setDT(df1), id + rowid(segment) ~ segment,
alue.var = c('count', 'freq'))[, segment := NULL][]
df1 <- structure(list(id = c(1L, 1L, 1L, 2L, 2L, 2L), segment = c("a",
"b", "c", "a", "b", "c"), count = c("x1a", "x1b", "x1c", "x2a",
"x2b", "x2c"), freq = c("f1a", "f1b", "f1c", "f2a", "f2b", "f2c"
)), class = "data.frame", row.names = c(NA, -6L))
Upvotes: 2