Thomas
Thomas

Reputation: 2534

Reshape date columns in R

I'm trying to reshape data from wide to long format in R. In wide form, my columns are dates, and I want to put these dates into rows. My data are

structure(list(A = c(1L, 4L, 3L), B = structure(c(1L, 3L, 2L), .Label = c("Site1", 
"Site2", "Site3"), class = "factor"), C = c(1L, 4L, 3L), D = c(7L, 
5L, 1L), X4.5.2013 = c(4L, 6L, 8L), X7.5.2013 = c(2L, 5L, 7L), 
X10.5.2013 = c(5L, 4L, 8L)), .Names = c("A", "B", "C", "D", 
"X4.5.2013", "X7.5.2013", "X10.5.2013"), class = "data.frame", row.names = c(NA, 
-3L))

which gives this dataframe

  A     B C D X4.5.2013 X7.5.2013 X10.5.2013
1 1 Site1 1 7         4         2          5
2 4 Site3 4 5         6         5          4
3 3 Site2 3 1         8         7          8

Note that here, the columns are dates, so that for example X4.5.2013 corresponds to the date 4/5/2013. I want to reshape these data to look like

structure(list(A = c(1L, 1L, 1L, 4L, 4L, 4L, 3L, 3L, 3L), B = structure(c(1L, 
1L, 1L, 3L, 3L, 3L, 2L, 2L, 2L), .Label = c("Site1", "Site2", 
"Site3"), class = "factor"), C = c(1L, 1L, 1L, 4L, 4L, 4L, 3L, 
3L, 3L), D = c(7L, 7L, 7L, 5L, 5L, 5L, 1L, 1L, 1L), Date = structure(c(2L, 
3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L), .Label = c("10/5/2013", "4/5/2013", 
"7/5/2013"), class = "factor"), E = c(4L, 2L, 5L, 6L, 5L, 4L, 
8L, 7L, 8L)), .Names = c("A", "B", "C", "D", "Date", "E"), class = "data.frame", row.names = c(NA, 
 -9L))

or

  A     B C D      Date E
1 1 Site1 1 7  4/5/2013 4
2 1 Site1 1 7  7/5/2013 2
3 1 Site1 1 7 10/5/2013 5
4 4 Site3 4 5  4/5/2013 6
5 4 Site3 4 5  7/5/2013 5
6 4 Site3 4 5 10/5/2013 4
7 3 Site2 3 1  4/5/2013 8
8 3 Site2 3 1  7/5/2013 7
9 3 Site2 3 1 10/5/2013 8

I've been reading online and it seems I want to use the "reshape" command. I've tried this

reshape(df1, varying = list(c("X4.5.2013", "X7.5.2013", "X10.5.2013")),
             idvar = "D", direction = "long")

which results in

    A     B C D time X4.5.2013
7.1 1 Site1 1 7    1         4
5.1 4 Site3 4 5    1         6
1.1 3 Site2 3 1    1         8
7.2 1 Site1 1 7    2         2
5.2 4 Site3 4 5    2         5
1.2 3 Site2 3 1    2         7
7.3 1 Site1 1 7    3         5
5.3 4 Site3 4 5    3         4
1.3 3 Site2 3 1    3         8

This seems to work somewhat, however it gives me a "time" column when I instead want these times to be the dates from the column headings in the original dataframe. Does anyone know how I can use the column headings for this "time" column -- but instead calling this column "Date" and converting the column headings into date formats, as in my second dataframe posted above?

Thanks!

EDIT: After playing with it a bit more, I found that

reshape(df1, varying=list(c("X4.5.2013", "X7.5.2013", "X10.5.2013")), times=names(df1[5:7]), timevar = "Date", idvar="D", direction = "long")

results in

             A     B C D       Date X4.5.2013
7.X4.5.2013  1 Site1 1 7  X4.5.2013         4
5.X4.5.2013  4 Site3 4 5  X4.5.2013         6
1.X4.5.2013  3 Site2 3 1  X4.5.2013         8
7.X7.5.2013  1 Site1 1 7  X7.5.2013         2
5.X7.5.2013  4 Site3 4 5  X7.5.2013         5
1.X7.5.2013  3 Site2 3 1  X7.5.2013         7
7.X10.5.2013 1 Site1 1 7 X10.5.2013         5
5.X10.5.2013 4 Site3 4 5 X10.5.2013         4
1.X10.5.2013 3 Site2 3 1 X10.5.2013         8

This is very close to what I want, but is still not perfect. Here are the issues I still have:

1) How can I change the name of the last column from "X4.5.2013" to "E"? Is there a way in the reshape command, or do I have to do this with a separate command after reshaping? 2) Why are the indexes in this last dataframe of the format "7.X4.5.2013" (for example)?

Upvotes: 2

Views: 1889

Answers (1)

IRTFM
IRTFM

Reputation: 263311

Use the time variable to select from the vector of columnnames where df2 is the result of the first reshape command that you offered.

df2$time <- as.Date( 
              sub("X","", c("X4.5.2013", "X7.5.2013", "X10.5.2013"))[
                                                             df2$time], 
                    format="%m.%d.%Y")
df2

    A     B C D       time X4.5.2013
7.1 1 Site1 1 7 2013-04-05         4
5.1 4 Site3 4 5 2013-04-05         6
1.1 3 Site2 3 1 2013-04-05         8
7.2 1 Site1 1 7 2013-07-05         2
5.2 4 Site3 4 5 2013-07-05         5
1.2 3 Site2 3 1 2013-07-05         7
7.3 1 Site1 1 7 2013-10-05         5
5.3 4 Site3 4 5 2013-10-05         4
1.3 3 Site2 3 1 2013-10-05         8

If you prefer to work on the resutls of the second reshape operation (which I named df3) then do this:

df3$Date <- as.Date( sub("X","", df3$Date) , format="%m.%d.%Y")

Upvotes: 2

Related Questions