Mark Miller
Mark Miller

Reputation: 13103

Reshape multiple columns with group in R

I have a data set, my.data, with multiple columns of numeric data (x.1, x.3 and x.5) and two groups (CC).

my.data <- read.table(text = '
    AA  BB  CC   x.1   x.3   x.5
   aaa   3   1     5    10    15
   aaa   3   2    20    25    30
   aaa   4   1    50   100   150
   aaa   4   2   200   250   300
   aaa   5   1   500  1000  1500
   aaa   5   2  2000  2500  3000
', header = TRUE, stringsAsFactors = FALSE)

I want to reshape x.1, x.3 and x.5 by CC. Either of the following two data sets is ideal. I realize this is probably a duplicate, but I have looked at probably two dozen reshape questions so far without finding a similar question.

desired.result.1 <- read.table(text = '
      CC   x.1   x.3   x.5
       1     5    50   500
       1    10   100  1000
       1    15   150  1500
       2    20   200  2000
       2    25   250  2500
       2    30   300  3000 
', header = TRUE, stringsAsFactors = FALSE)

desired.result.2 <- read.table(text = '
      CC   x.1   x.3   x.5
       1     5    50   500
       2    20   200  2000
       1    10   100  1000
       2    25   250  2500
       1    15   150  1500
       2    30   300  3000 
', header = TRUE, stringsAsFactors = FALSE)

I actually have come up with two different solutions so far (shown below). But both solutions seem overly complex. I keep wondering whether there might be a one-liner in base R that is much better.

Here is the first solution. It combines sapply and reshape.

new.data <- do.call(cbind,sapply(unique(my.data$BB),
                 function(x) {reshape(my.data[my.data$BB == x,3:6],
                                              idvar = "CC",
                                              varying = list(2:4),
                                              v.names = "x",
                                              direction = "long")[3]}))
new.data <- data.frame(CC = rep(c(1,2), (nrow(new.data)/2)), new.data)
colnames(new.data) <- c('CC', paste0('x.', seq(1,5, by=2)))

all.equal(new.data, desired.result.2)
#[1] TRUE

Here is the second solution. It does not use sapply. However, it still requires a lot of post-processing:

new.data2 <- t(reshape(my.data,
                       idvar = c('AA','BB'),
                       timevar='CC',
                       direction = 'wide'))[3:(nrow(new.data)+2),]
new.data2 <- data.frame(CC = rep(c(1,2), each=(nrow(my.data)/2)), new.data2)
colnames(new.data2) <- c('CC', paste0('x.', seq(1,5, by=2)))
new.data2 <- apply(new.data2, 2, as.numeric)
new.data2 <- data.frame(new.data2)

all.equal(new.data2, desired.result.1)
#[1] TRUE

Upvotes: 0

Views: 103

Answers (2)

Ronak Shah
Ronak Shah

Reputation: 388807

Here's a tidyverse option to transpose the data internally within each group.

library(dplyr)
library(tidyr)

my.data %>%
  pivot_longer(cols = starts_with('X')) %>%
  arrange(CC, name) %>%
  group_by(CC, name) %>%
  mutate(row = row_number()) %>%
  group_by(CC) %>%
  mutate(value = value[order(row)]) %>%
  ungroup %>%
  select(-row) %>%
  pivot_wider() %>%
  select(-AA, -BB)

#     CC   x.1   x.3   x.5
#  <int> <int> <int> <int>
#1     1     5    50   500
#2     1    10   100  1000
#3     1    15   150  1500
#4     2    20   200  2000
#5     2    25   250  2500
#6     2    30   300  3000

Upvotes: 2

ThomasIsCoding
ThomasIsCoding

Reputation: 101044

Try the data.table option below

setDT(my.data)[
  ,
  setNames(transpose(.SD), names(.SD)),
  CC,
  .SDcols = patterns("x\\.\\d+")
]

which gives

   CC x.1 x.3  x.5
1:  1   5  50  500
2:  1  10 100 1000
3:  1  15 150 1500
4:  2  20 200 2000
5:  2  25 250 2500
6:  2  30 300 3000

Upvotes: 2

Related Questions