Reputation: 2000
I'm sure this is a simple question, but I'm not sure what to search for or how to solve it.
My data looks something like this:
Date UserID Var1 Var2
2013-01 1 2 3
2013-02 1 2 1
etc.
Where the dates that are recorded are different for each user.
I want to find statistics about one of the variables (let's say Var2), and in order to do that I want to reshape the data, so that the output will be something like:
ID Time1 Time2 Time3 ...
1 3 1 NA
etc.
where Time1 is the first date that a user appears in the data, Time2 is the next, etc.
Upvotes: 0
Views: 96
Reputation: 6267
(edited to handle the re-alignment of dates on a relative scale)
What about using reshape
?
This is to input the data:
> zz <- "Date UserID Var1 Var2
+ 2013-01 1 2 3
+ 2013-02 1 2 1
+ 2013-02 2 2 1"
> Data <- read.table(text=zz, header = TRUE)
This is to reshape:
Data2 = reshape(Data[c(1,2,4)],direction='wide',timevar='Date',idvar='UserID')
Resulting in:
> Data2
UserID Var2.2013-01 Var2.2013-02
1 1 3 1
3 2 NA 1
(note that c(1,2,4)
refers to the colum numbers corresponding to Date
, UserID
and Var2
)
Finally, to obtain the desired output, you can simply move the NA
to the end of each line:
t(sapply(1:nrow(Data2), function(x) c(Data2[x,!is.na(Data2[x,])],rep(NA,sum(is.na(Data2[x,]))))))
Resulting in:
UserID Var2.2013-01 Var2.2013-02
[1,] 1 3 1
[2,] 2 1 NA
Note that the column names here are not current anymore. Also, you may possibly have columns containing only NA on the right of your matrix... I will let you handle these small details.
Upvotes: 1