Reputation: 221
I've got some long format data that 1) needs to be reshaped to wide and then 2) needs the columns resorted according pattern of their names. The example data is below:
#Orignial data
set.seed(100)
long_df <- tibble(id = rep(1:5, each = 3),
group = rep(c('g1','g2','g3'), times = 5),
mean = runif(15, min = 1, max = 10),
sd = runif(15, min = .025, max = 1))
long_df
# A tibble: 15 x 4
id group mean sd
<int> <chr> <dbl> <dbl>
1 1 g1 3.77 0.677
2 1 g2 3.32 0.224
3 1 g3 5.97 0.374
4 2 g1 1.51 0.375
5 2 g2 5.22 0.698
6 2 g3 5.35 0.547
7 3 g1 8.31 0.718
8 3 g2 4.33 0.550
9 3 g3 5.92 0.755
10 4 g1 2.53 0.435
11 4 g2 6.62 0.192
12 4 g3 8.94 0.776
13 5 g1 3.52 0.885
14 5 g2 4.59 0.560
15 5 g3 7.86 0.296
#Reshaped to wide
wide_df <- long_df %>%
pivot_wider(id_cols = id, names_from = 'group', values_from = c('mean','sd'))
wide_df
# A tibble: 5 x 7
id mean_g1 mean_g2 mean_g3 sd_g1 sd_g2 sd_g3
<int> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 1 3.77 3.32 5.97 0.677 0.224 0.374
2 2 1.51 5.22 5.35 0.375 0.698 0.547
3 3 8.31 4.33 5.92 0.718 0.550 0.755
4 4 2.53 6.62 8.94 0.435 0.192 0.776
5 5 3.52 4.59 7.86 0.885 0.560 0.296
#Wide with proper column order
final_df <- wide_df %>%
select(id, mean_g1, sd_g1, mean_g2, sd_g2, mean_g3, sd_g3)
final_df
# A tibble: 5 x 7
id mean_g1 sd_g1 mean_g2 sd_g2 mean_g3 sd_g3
<int> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 1 3.77 0.677 3.32 0.224 5.97 0.374
2 2 1.51 0.375 5.22 0.698 5.35 0.547
3 3 8.31 0.718 4.33 0.550 5.92 0.755
4 4 2.53 0.435 6.62 0.192 8.94 0.776
5 5 3.52 0.885 4.59 0.560 7.86 0.296
Does someone know a way to either a) pivot the long data with the output being in the proper order or b) use a regex or sort variable to reorder the columns into the desired order? A tidyverse style solution would be preferred, but other approaches are welcome too.
Upvotes: 6
Views: 1857
Reputation: 93871
You can run ends_with
successively on a character vector of the numeric suffixes:
long_df %>%
pivot_wider(names_from=group, values_from=c(mean, sd)) %>%
select(id, ends_with(as.character(1:3)))
id mean_g1 sd_g1 mean_g2 sd_g2 mean_g3 sd_g3 <int> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> 1 1 3.77 0.677 3.32 0.224 5.97 0.374 2 2 1.51 0.375 5.22 0.698 5.35 0.547 3 3 8.31 0.718 4.33 0.550 5.92 0.755 4 4 2.53 0.435 6.62 0.192 8.94 0.776 5 5 3.52 0.885 4.59 0.560 7.86 0.296
This might need the development version of dplyr
to work (devtools::install_github("tidyverse/dplyr")
). I'm running the development version and haven't checked this code with the current CRAN release.
You can also pivot_longer
to the "longest" version of the data frame and then pivot_wider
to get the desired column order without resorting:
long_df %>%
pivot_longer(cols=c(mean, sd)) %>%
pivot_wider(names_from=c(name, group), values_from=value)
id mean_g1 sd_g1 mean_g2 sd_g2 mean_g3 sd_g3 <int> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> 1 1 3.77 0.677 3.32 0.224 5.97 0.374 2 2 1.51 0.375 5.22 0.698 5.35 0.547 3 3 8.31 0.718 4.33 0.550 5.92 0.755 4 4 2.53 0.435 6.62 0.192 8.94 0.776 5 5 3.52 0.885 4.59 0.560 7.86 0.296
Upvotes: 4