Reputation: 109844
I love the reshape2 package because it made life so doggone easy. Typically Hadley has made improvements in his previous packages that enable streamlined, faster running code. I figured I'd give tidyr a whirl and from what I read I thought gather was very similar to melt from reshape2. But after reading the documentation I can't get gather to do the same task that melt does.
Data View
Here's a view of the data (actual data in dput form at end of post):
  teacher yr1.baseline     pd yr1.lesson1 yr1.lesson2 yr2.lesson1 yr2.lesson2 yr2.lesson3
1       3      1/13/09 2/5/09      3/6/09     4/27/09     10/7/09    11/18/09      3/4/10
2       7      1/15/09 2/5/09      3/3/09      5/5/09    10/16/09    11/18/09      3/4/10
3       8      1/27/09 2/5/09      3/3/09     4/27/09     10/7/09    11/18/09      3/5/10
Code
Here's the code in melt fashion, followed by my attempt at gather. How can I make gather do the same thing as melt?
library(reshape2); library(dplyr); library(tidyr)

# melt version (gives the desired output)
dat %>%
    melt(id=c("teacher", "pd"), value.name="date")

# gather attempt (does not give the desired output)
dat %>%
    gather(key=c(teacher, pd), value=date, -c(teacher, pd))
Desired Output
   teacher     pd     variable     date
1        3 2/5/09 yr1.baseline  1/13/09
2        7 2/5/09 yr1.baseline  1/15/09
3        8 2/5/09 yr1.baseline  1/27/09
4        3 2/5/09  yr1.lesson1   3/6/09
5        7 2/5/09  yr1.lesson1   3/3/09
6        8 2/5/09  yr1.lesson1   3/3/09
7        3 2/5/09  yr1.lesson2  4/27/09
8        7 2/5/09  yr1.lesson2   5/5/09
9        8 2/5/09  yr1.lesson2  4/27/09
10       3 2/5/09  yr2.lesson1  10/7/09
11       7 2/5/09  yr2.lesson1 10/16/09
12       8 2/5/09  yr2.lesson1  10/7/09
13       3 2/5/09  yr2.lesson2 11/18/09
14       7 2/5/09  yr2.lesson2 11/18/09
15       8 2/5/09  yr2.lesson2 11/18/09
16       3 2/5/09  yr2.lesson3   3/4/10
17       7 2/5/09  yr2.lesson3   3/4/10
18       8 2/5/09  yr2.lesson3   3/5/10
Data
dat <- data.frame(
teacher = factor(c("3", "7", "8")),
yr1.baseline = factor(c("1/13/09", "1/15/09", "1/27/09")),
pd = factor(c("2/5/09", "2/5/09", "2/5/09")),
yr1.lesson1 = factor(c("3/6/09", "3/3/09", "3/3/09")),
yr1.lesson2 = factor(c("4/27/09", "5/5/09", "4/27/09")),
yr2.lesson1 = factor(c("10/7/09", "10/16/09", "10/7/09")),
yr2.lesson2 = factor(c("11/18/09", "11/18/09", "11/18/09")),
yr2.lesson3 = factor(c("3/4/10", "3/4/10", "3/5/10"))
)
Upvotes: 73
Views: 33400
Reputation: 8601
As of tidyr 1.0.0, this task is accomplished with the more flexible pivot_longer().
The equivalent syntax would be
library(tidyr)
dat %>% pivot_longer(cols = -c(teacher, pd), names_to = "variable", values_to = "date")
which says, correspondingly, "pivot everything longer except teacher and pd, calling the new variable column 'variable' and the new value column 'date'."
Note that pivot_longer() returns the long data grouped by the rows of the original data frame (all pivoted values for the first row, then all for the second, and so on), unlike gather, which returned it ordered by the new variable column. To rearrange the resulting tibble, use dplyr::arrange().
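As a rough illustration of that ordering difference, here is a minimal sketch. It assumes a trimmed-down, character-column stand-in for the question's data (dat_small is a made-up name, not from the post), and it relies on the pivoted column names happening to sort alphabetically in their original order.

```r
library(tidyr)
library(dplyr)

# Hypothetical, trimmed-down stand-in for the question's `dat`
# (character columns, only three pivoted columns).
dat_small <- data.frame(
  teacher      = c("3", "7", "8"),
  yr1.baseline = c("1/13/09", "1/15/09", "1/27/09"),
  pd           = c("2/5/09", "2/5/09", "2/5/09"),
  yr1.lesson1  = c("3/6/09", "3/3/09", "3/3/09"),
  yr1.lesson2  = c("4/27/09", "5/5/09", "4/27/09"),
  stringsAsFactors = FALSE
)

long <- dat_small %>%
  pivot_longer(cols = -c(teacher, pd), names_to = "variable", values_to = "date")

# pivot_longer() keeps each original row's values together,
# so the teacher column runs 3, 3, 3, 7, 7, 7, 8, 8, 8.
long$teacher

# Sorting by the new name column groups the rows column by column,
# which is the order gather() and melt() produce.
long_sorted <- long %>% arrange(variable)
long_sorted
```

If the column names did not sort in their original order, arrange(variable) alone would not reproduce the gather order; one option would be to turn variable into a factor with the desired level order before sorting.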
Upvotes: 13
Reputation: 78590
Your gather line should look like:
dat %>% gather(variable, date, -teacher, -pd)
This says "Gather all variables except teacher and pd, calling the new key column 'variable' and the new value column 'date'."
As an explanation, note the following from the help(gather) page:
...: Specification of columns to gather. Use bare variable names.
Select all variables between x and z with ‘x:z’, exclude y
with ‘-y’. For more options, see the select documentation.
Since this is an ellipsis, the specification of columns to gather is given as separate (bare name) arguments. We wish to gather all columns except teacher and pd, so we prefix each of them with - to exclude it.
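To make the column specification concrete, here is a small sketch comparing the exclusion form with the x:z range form mentioned in the help text. It assumes a trimmed-down, character-column stand-in for the question's data (dat_small is a made-up name), so treat it as an illustration rather than the exact data from the post.

```r
library(tidyr)

# Hypothetical, trimmed-down stand-in for the question's `dat`.
dat_small <- data.frame(
  teacher      = c("3", "7", "8"),
  yr1.baseline = c("1/13/09", "1/15/09", "1/27/09"),
  pd           = c("2/5/09", "2/5/09", "2/5/09"),
  yr1.lesson1  = c("3/6/09", "3/3/09", "3/3/09"),
  yr1.lesson2  = c("4/27/09", "5/5/09", "4/27/09"),
  stringsAsFactors = FALSE
)

# Exclude the id columns; everything else is gathered.
by_exclusion <- gather(dat_small, variable, date, -teacher, -pd)

# Or name the columns to gather, using a range for the adjacent ones.
# pd sits between yr1.baseline and yr1.lesson1, so the range starts after it.
by_inclusion <- gather(dat_small, variable, date,
                       yr1.baseline, yr1.lesson1:yr1.lesson2)

# Both specifications should select the same three columns.
all.equal(by_exclusion, by_inclusion)
```

The exclusion form is usually shorter when there are only a few id columns, which is why it suits the question's data.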
Upvotes: 91