Reputation: 11
I'm dealing with a complicated wide table looks like this:
date hour type A B C D E
20100202 0 x 1 2 3 4 5
20100202 0 y 6 7 8 9 10
20100202 1 x 11 12 13 14 15
20100202 1 y 16 17 18 19 20
The format I'm looking for is:
date hour object x y
20100202 0 A 1 6
20100202 0 B 2 7
20100202 0 C 3 8
20100202 0 D 4 9
20100202 0 E 5 10
20100202 1 A 11 16
20100202 1 B 12 17
20100202 1 C 13 18
20100202 1 D 14 19
20100202 1 E 15 20
I've been looking the similar questions that have been proposed before, but I didn't find much useful solution (although I'm pretty sure that's because I'm pretty new to this language)
Any suggestions or help would be appreciated!
Upvotes: 1
Views: 104
Reputation: 388907
A tidyverse
way would using pivot_longer
and pivot_wider
:
library(tidyr)
df %>%
pivot_longer(cols = A:E) %>%
pivot_wider(names_from = type, values_from = value)
# date hour name x y
# <int> <int> <chr> <int> <int>
# 1 20100202 0 A 1 6
# 2 20100202 0 B 2 7
# 3 20100202 0 C 3 8
# 4 20100202 0 D 4 9
# 5 20100202 0 E 5 10
# 6 20100202 1 A 11 16
# 7 20100202 1 B 12 17
# 8 20100202 1 C 13 18
# 9 20100202 1 D 14 19
#10 20100202 1 E 15 20
data
df <- structure(list(date = c(20100202L, 20100202L, 20100202L, 20100202L
), hour = c(0L, 0L, 1L, 1L), type = c("x", "y", "x", "y"), A = c(1L,
6L, 11L, 16L), B = c(2L, 7L, 12L, 17L), C = c(3L, 8L, 13L, 18L
), D = c(4L, 9L, 14L, 19L), E = c(5L, 10L, 15L, 20L)),
class = "data.frame", row.names = c(NA, -4L))
Upvotes: 1
Reputation: 34703
In data.table
, this is done with melt
and dcast
:
library(data.table)
DT = fread("date hour type A B C D E
20100202 0 x 1 2 3 4 5
20100202 0 y 6 7 8 9 10
20100202 1 x 11 12 13 14 15
20100202 1 y 16 17 18 19 20")
Two steps:
DT[ , melt(.SD, id.vars = c('date', 'hour', 'type'),
measure.vars = LETTERS[1:5], variable.name='object')
][ , dcast(.SD, date + hour + object ~ type, value.var = 'value')]
# date hour object x y
# 1: 20100202 0 A 1 6
# 2: 20100202 0 B 2 7
# 3: 20100202 0 C 3 8
# 4: 20100202 0 D 4 9
# 5: 20100202 0 E 5 10
# 6: 20100202 1 A 11 16
# 7: 20100202 1 B 12 17
# 8: 20100202 1 C 13 18
# 9: 20100202 1 D 14 19
# 10: 20100202 1 E 15 20
melt
reshapes your data long. id.vars
says the variables to "stretch" and measure.vars
says which variables will be "transposed" into a single column.
.SD
, see here or the corresponding vignette. I use it here to facilitate chaining. See below for an equivalent version without .SD
.variable.name
is used to get object
in the output as you specified. By default it would be named variable
instead.dcast
is used to reshape wide again, in the different arrangement requested. the LHS of ~
gives the variables that will remain columns, the RHS says which column's values will become columns (here the values of type
, x
and y
, become columns. The formula gives the shape of the output table; value.var
says which variable will "fill out" the shape implied by the formula.Upvotes: 4