Reputation: 16981
I have a data set like this:
structure(list(var1 = c("APE", "APE", "APE", "APE", "APE", "APE", "GIT",
"APE", "APE", "APE", "APE", "APE", "APE", "APE", "GIT"), var2 = c("AVVAL",
"AULASU", "APALA", "AEA", "ATUPVA", "ASATAP", "ADLO", "AKOKU", "AVVAL",
"AULASU", "APALA", "AEA", "ATUPVA", "ASATAP", "ADLO"), var3 = c(NA,
NA, 1L, 101L, 17122009L, 1L, NA, 684L, NA, NA, 1L, 10L, 17122L,
1L, NA)), .Names = c("var1", "var2", "var3"), row.names = c(NA,
15L), class = "data.frame")
How can I reshape this data into wide format? I tried this
reshape(h, idvar="var2", v.names="var3", timevar="var1", direction="wide")
but it is not giving me a correct results. The correct result is:
var1 ADLO AEA AKOKU APALA ASATAP ATUPVA AULASU AVVAL
1 APE NaN 101 NA 1 1 17122009 NA NA
2 APE NaN 10 684 1 1 17122 NA NA
3 GIT NA NaN NaN NaN NaN NaN NaN NaN
4 GIT NA NaN NaN NaN NaN NaN NaN NaN
Upvotes: 0
Views: 573
Reputation: 4511
I add a new variable due to the meaning of GIT
:
dat$id <- cumsum(dat$var1=='GIT')
First I do the aggregation:
datMean <- aggregate(var3 ~ var2 * id, data=dat, FUN=mean)
> datMean
var2 id var3
1 AEA 0 101
2 APALA 0 1
3 ASATAP 0 1
4 ATUPVA 0 17122009
5 AEA 1 10
6 AKOKU 1 684
7 APALA 1 1
8 ASATAP 1 1
9 ATUPVA 1 17122
and then the conversion from long to wide:
datWide <- reshape(datMean, direction='wide', idvar='id', timevar='var2')
> datWide
id var3.AEA var3.APALA var3.ASATAP var3.ATUPVA var3.AKOKU
1 0 101 1 1 17122009 NA
5 1 10 1 1 17122 684
Upvotes: 0
Reputation: 179388
The only way I can get to your expected results is to add a new column to the data.frame. It seems to me that there is some information implicit about your data that isn't contained in the data. In other words, there must be some kind of grouping variable that identifies certain records as belonging together.
Since I can't double-guess what this information is, in my answer I am going to assume that each occurrence of GIT
marks the end of a record:
x <- grep("GIT", h$var1)
h$rec <- rep(seq_along(x), times=c(x[1], diff(x)))
library(reshape2)
mh <- melt(h, measure.vars="var3")
cast(mh, rec+var1~var2, id.var="rec", measure.var="value", fun.aggregate=mean)
rec var1 ADLO AEA AKOKU APALA ASATAP ATUPVA AULASU AVVAL
1 1 APE NaN 10 NaN 1 1 17122009 NA NA
2 1 GIT NA NaN NaN NaN NaN NaN NaN NaN
3 2 APE NaN 10 684 1 1 17122009 NA NA
4 2 GIT NA NaN NaN NaN NaN NaN NaN Na
I find the package reshape2
much easier to comprehend than the built-in reshape
function. This package provides two functions:
melt
to make a wide data.frame
tallcast
to make a tall data.frame
wideIn your case you need cast
:
library(reshape2)
cast(h, var1~var2, value="var3", fun.aggregate=mean)
var1 ADLO AEA AKOKU APALA ASATAP ATUPVA AULASU AVVAL
1 APE NaN 10 684 1 1 17122009 NA NA
2 GIT NA NaN NaN NaN NaN NaN NaN NaN
Upvotes: 3