Reputation: 13103
I have a data set, my.data
, with multiple columns of numeric data (x.1
, x.3
and x.5
) and two groups (CC
).
my.data <- read.table(text = '
AA BB CC x.1 x.3 x.5
aaa 3 1 5 10 15
aaa 3 2 20 25 30
aaa 4 1 50 100 150
aaa 4 2 200 250 300
aaa 5 1 500 1000 1500
aaa 5 2 2000 2500 3000
', header = TRUE, stringsAsFactors = FALSE)
I want to reshape x.1
, x.3
and x.5
by CC
. Either of the following two data sets is ideal. I realize this is probably a duplicate, but I have looked at probably two dozen reshape
questions so far without finding a similar question.
desired.result.1 <- read.table(text = '
CC x.1 x.3 x.5
1 5 50 500
1 10 100 1000
1 15 150 1500
2 20 200 2000
2 25 250 2500
2 30 300 3000
', header = TRUE, stringsAsFactors = FALSE)
desired.result.2 <- read.table(text = '
CC x.1 x.3 x.5
1 5 50 500
2 20 200 2000
1 10 100 1000
2 25 250 2500
1 15 150 1500
2 30 300 3000
', header = TRUE, stringsAsFactors = FALSE)
I actually have come up with two different solutions so far (shown below). But both solutions seem overly complex. I keep wondering whether there might be a one-liner in base R
that is much better.
Here is the first solution. It combines sapply
and reshape
.
new.data <- do.call(cbind,sapply(unique(my.data$BB),
function(x) {reshape(my.data[my.data$BB == x,3:6],
idvar = "CC",
varying = list(2:4),
v.names = "x",
direction = "long")[3]}))
new.data <- data.frame(CC = rep(c(1,2), (nrow(new.data)/2)), new.data)
colnames(new.data) <- c('CC', paste0('x.', seq(1,5, by=2)))
all.equal(new.data, desired.result.2)
#[1] TRUE
Here is the second solution. It does not use sapply
. However, it still requires a lot of post-processing:
new.data2 <- t(reshape(my.data,
idvar = c('AA','BB'),
timevar='CC',
direction = 'wide'))[3:(nrow(new.data)+2),]
new.data2 <- data.frame(CC = rep(c(1,2), each=(nrow(my.data)/2)), new.data2)
colnames(new.data2) <- c('CC', paste0('x.', seq(1,5, by=2)))
new.data2 <- apply(new.data2, 2, as.numeric)
new.data2 <- data.frame(new.data2)
all.equal(new.data2, desired.result.1)
#[1] TRUE
Upvotes: 0
Views: 103
Reputation: 388807
Here's a tidyverse
option to transpose the data internally within each group.
library(dplyr)
library(tidyr)
my.data %>%
pivot_longer(cols = starts_with('X')) %>%
arrange(CC, name) %>%
group_by(CC, name) %>%
mutate(row = row_number()) %>%
group_by(CC) %>%
mutate(value = value[order(row)]) %>%
ungroup %>%
select(-row) %>%
pivot_wider() %>%
select(-AA, -BB)
# CC x.1 x.3 x.5
# <int> <int> <int> <int>
#1 1 5 50 500
#2 1 10 100 1000
#3 1 15 150 1500
#4 2 20 200 2000
#5 2 25 250 2500
#6 2 30 300 3000
Upvotes: 2
Reputation: 101044
Try the data.table
option below
setDT(my.data)[
,
setNames(transpose(.SD), names(.SD)),
CC,
.SDcols = patterns("x\\.\\d+")
]
which gives
CC x.1 x.3 x.5
1: 1 5 50 500
2: 1 10 100 1000
3: 1 15 150 1500
4: 2 20 200 2000
5: 2 25 250 2500
6: 2 30 300 3000
Upvotes: 2