How to reshape data into wide format?

Question

I have a data set like this:

structure(list(var1 = c("APE", "APE", "APE", "APE", "APE", "APE", "GIT",
"APE", "APE", "APE", "APE", "APE", "APE", "APE", "GIT"), var2 = c("AVVAL",
"AULASU", "APALA", "AEA", "ATUPVA", "ASATAP", "ADLO", "AKOKU", "AVVAL",
"AULASU", "APALA", "AEA", "ATUPVA", "ASATAP", "ADLO"), var3 = c(NA,
NA, 1L, 101L, 17122009L, 1L, NA, 684L, NA, NA, 1L, 10L, 17122L,
1L, NA)), .Names = c("var1", "var2", "var3"), row.names = c(NA,
15L), class = "data.frame")

How can I reshape this data into wide format? I tried this

reshape(h, idvar="var2", v.names="var3", timevar="var1", direction="wide")

but it is not giving me a correct results. The correct result is:

  var1 ADLO AEA AKOKU APALA ASATAP   ATUPVA AULASU AVVAL
1  APE  NaN 101   NA      1      1 17122009     NA    NA
2  APE  NaN  10   684     1      1    17122     NA    NA
3  GIT   NA NaN   NaN   NaN    NaN      NaN    NaN   NaN
4  GIT   NA NaN   NaN   NaN    NaN      NaN    NaN   NaN

Andrie · Accepted Answer

Edited

The only way I can get to your expected results is to add a new column to the data.frame. It seems to me that there is some information implicit about your data that isn't contained in the data. In other words, there must be some kind of grouping variable that identifies certain records as belonging together.

Since I can't double-guess what this information is, in my answer I am going to assume that each occurrence of GIT marks the end of a record:

x <- grep("GIT", h$var1)
h$rec <- rep(seq_along(x), times=c(x[1], diff(x)))

library(reshape2)
mh <- melt(h, measure.vars="var3")
cast(mh, rec+var1~var2, id.var="rec", measure.var="value", fun.aggregate=mean)

  rec var1 ADLO AEA AKOKU APALA ASATAP   ATUPVA AULASU AVVAL
1   1  APE  NaN  10   NaN     1      1 17122009     NA    NA
2   1  GIT   NA NaN   NaN   NaN    NaN      NaN    NaN   NaN
3   2  APE  NaN  10   684     1      1 17122009     NA    NA
4   2  GIT   NA NaN   NaN   NaN    NaN      NaN    NaN   Na

Original answer

I find the package reshape2 much easier to comprehend than the built-in reshape function. This package provides two functions:

melt to make a wide data.frame tall
cast to make a tall data.frame wide

In your case you need cast:

library(reshape2)
cast(h, var1~var2, value="var3", fun.aggregate=mean)

  var1 ADLO AEA AKOKU APALA ASATAP   ATUPVA AULASU AVVAL
1  APE  NaN  10   684     1      1 17122009     NA    NA
2  GIT   NA NaN   NaN   NaN    NaN      NaN    NaN   NaN

How to reshape data into wide format?

Answers (2)

Edited

Original answer

Related Questions