Reputation: 2534
I'm trying to reshape data from wide to long format in R. In wide form, my columns are dates, and I want to put these dates into rows. My data are
structure(list(A = c(1L, 4L, 3L), B = structure(c(1L, 3L, 2L), .Label = c("Site1",
"Site2", "Site3"), class = "factor"), C = c(1L, 4L, 3L), D = c(7L,
5L, 1L), X4.5.2013 = c(4L, 6L, 8L), X7.5.2013 = c(2L, 5L, 7L),
X10.5.2013 = c(5L, 4L, 8L)), .Names = c("A", "B", "C", "D",
"X4.5.2013", "X7.5.2013", "X10.5.2013"), class = "data.frame", row.names = c(NA,
-3L))
which gives this dataframe
A B C D X4.5.2013 X7.5.2013 X10.5.2013
1 1 Site1 1 7 4 2 5
2 4 Site3 4 5 6 5 4
3 3 Site2 3 1 8 7 8
Note that here, the columns are dates, so that for example X4.5.2013 corresponds to the date 4/5/2013. I want to reshape these data to look like
structure(list(A = c(1L, 1L, 1L, 4L, 4L, 4L, 3L, 3L, 3L), B = structure(c(1L,
1L, 1L, 3L, 3L, 3L, 2L, 2L, 2L), .Label = c("Site1", "Site2",
"Site3"), class = "factor"), C = c(1L, 1L, 1L, 4L, 4L, 4L, 3L,
3L, 3L), D = c(7L, 7L, 7L, 5L, 5L, 5L, 1L, 1L, 1L), Date = structure(c(2L,
3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L), .Label = c("10/5/2013", "4/5/2013",
"7/5/2013"), class = "factor"), E = c(4L, 2L, 5L, 6L, 5L, 4L,
8L, 7L, 8L)), .Names = c("A", "B", "C", "D", "Date", "E"), class = "data.frame", row.names = c(NA,
-9L))
or
A B C D Date E
1 1 Site1 1 7 4/5/2013 4
2 1 Site1 1 7 7/5/2013 2
3 1 Site1 1 7 10/5/2013 5
4 4 Site3 4 5 4/5/2013 6
5 4 Site3 4 5 7/5/2013 5
6 4 Site3 4 5 10/5/2013 4
7 3 Site2 3 1 4/5/2013 8
8 3 Site2 3 1 7/5/2013 7
9 3 Site2 3 1 10/5/2013 8
I've been reading online and it seems I want to use the "reshape" command. I've tried this
reshape(df1, varying = list(c("X4.5.2013", "X7.5.2013", "X10.5.2013")),
idvar = "D", direction = "long")
which results in
A B C D time X4.5.2013
7.1 1 Site1 1 7 1 4
5.1 4 Site3 4 5 1 6
1.1 3 Site2 3 1 1 8
7.2 1 Site1 1 7 2 2
5.2 4 Site3 4 5 2 5
1.2 3 Site2 3 1 2 7
7.3 1 Site1 1 7 3 5
5.3 4 Site3 4 5 3 4
1.3 3 Site2 3 1 3 8
This seems to work somewhat, however it gives me a "time" column when I instead want these times to be the dates from the column headings in the original dataframe. Does anyone know how I can use the column headings for this "time" column -- but instead calling this column "Date" and converting the column headings into date formats, as in my second dataframe posted above?
Thanks!
EDIT: After playing with it a bit more, I found that
reshape(df1, varying=list(c("X4.5.2013", "X7.5.2013", "X10.5.2013")), times=names(df1[5:7]), timevar = "Date", idvar="D", direction = "long")
results in
A B C D Date X4.5.2013
7.X4.5.2013 1 Site1 1 7 X4.5.2013 4
5.X4.5.2013 4 Site3 4 5 X4.5.2013 6
1.X4.5.2013 3 Site2 3 1 X4.5.2013 8
7.X7.5.2013 1 Site1 1 7 X7.5.2013 2
5.X7.5.2013 4 Site3 4 5 X7.5.2013 5
1.X7.5.2013 3 Site2 3 1 X7.5.2013 7
7.X10.5.2013 1 Site1 1 7 X10.5.2013 5
5.X10.5.2013 4 Site3 4 5 X10.5.2013 4
1.X10.5.2013 3 Site2 3 1 X10.5.2013 8
This is very close to what I want, but is still not perfect. Here are the issues I still have:
1) How can I change the name of the last column from "X4.5.2013" to "E"? Is there a way in the reshape command, or do I have to do this with a separate command after reshaping? 2) Why are the indexes in this last dataframe of the format "7.X4.5.2013" (for example)?
Upvotes: 2
Views: 1889
Reputation: 263311
Use the time
variable to select from the vector of columnnames where df2
is the result of the first reshape command that you offered.
df2$time <- as.Date(
sub("X","", c("X4.5.2013", "X7.5.2013", "X10.5.2013"))[
df2$time],
format="%m.%d.%Y")
df2
A B C D time X4.5.2013
7.1 1 Site1 1 7 2013-04-05 4
5.1 4 Site3 4 5 2013-04-05 6
1.1 3 Site2 3 1 2013-04-05 8
7.2 1 Site1 1 7 2013-07-05 2
5.2 4 Site3 4 5 2013-07-05 5
1.2 3 Site2 3 1 2013-07-05 7
7.3 1 Site1 1 7 2013-10-05 5
5.3 4 Site3 4 5 2013-10-05 4
1.3 3 Site2 3 1 2013-10-05 8
If you prefer to work on the resutls of the second reshape operation (which I named df3
) then do this:
df3$Date <- as.Date( sub("X","", df3$Date) , format="%m.%d.%Y")
Upvotes: 2