Reputation: 113
Here is a sample dataframe I have
mydataf.tickersymbol mydataf.yr_qtr mydataf.act_mean
1 ABC 20084 BB
2 ABC 20091 BB
3 ABC 20092 BB
4 ABC 20093 BB
5 DEF 20084 BB
6 DEF 20091 BB
7 DEF 20092 BB
8 DEF 20093 BB
9 DEF 20094 BB
10 GEF 20092 BB
11 GEF 20093 BB
12 GEF 20094 M
Trying to get to output of by joining on yr_qtr
for each ticker and caring for the missing data valued. I can go through and loop this and create the logic for all the checks but seems like through data frames I should be able to do this using r dataframe capabilities?
20084 20091 20092 20093 20094
ABC AA BB BB BB
DEF BB BB BB BB BM
GEF BB BB M
Upvotes: 2
Views: 846
Reputation: 21641
I would simply do:
library(tidyr)
df %>% spread(mydataf.yr_qtr, mydataf.act_mean)
Or using the reshape2
package:
library(reshape2)
dcast(df, mydataf.tickersymbol ~ mydataf.yr_qtr,
value.var = "mydataf.act_mean")
Which gives:
mydataf.tickersymbol 20084 20091 20092 20093 20094
1 ABC BB BB BB BB <NA>
2 DEF BB BB BB BB BB
3 GEF <NA> <NA> BB BB M
Or if we need the "mydataf.tickersymbol" as row names, use acast
which will give a matrix output and then convert back to data.frame with as.data.frame
library(reshape)#reshape2_1.4
as.data.frame(acast(df, mydataf.tickersymbol~mydataf.yr_qtr,
value.var='mydataf.act_mean', fill=''))
# 20084 20091 20092 20093 20094
# ABC BB BB BB BB
# DEF BB BB BB BB BB
# GEF BB BB M
Upvotes: 5
Reputation: 35324
It looks like you're trying to reshape from long format to wide, using 'mydataf.tickersymbol'
as the key (aka idvar) and 'mydataf.yr_qtr'
as the time variable (can be thought of as a discriminator column that distinguishes wide-format columns). However, the data in your question is slightly inconsistent, e.g. 20094/DEF should be BB rather than BM, and 20084/ABC should be BB rather than AA, no? In any case, I think this is what you're looking for:
df <- data.frame(mydataf.tickersymbol=c('ABC','ABC','ABC','ABC','DEF','DEF','DEF','DEF','DEF','GEF','GEF','GEF'), mydataf.yr_qtr=c(20084,20091,20092,20093,20084,20091,20092,20093,20094,20092,20093,20094), mydataf.act_mean=c('BB','BB','BB','BB','BB','BB','BB','BB','BB','BB','BB','M') );
wide <- reshape(df,dir='w',idvar='mydataf.tickersymbol',timevar='mydataf.yr_qtr');
names(wide) <- sub('^mydataf\\.act_mean\\.','',names(wide));
wide;
## mydataf.tickersymbol 20084 20091 20092 20093 20094
## 1 ABC BB BB BB BB <NA>
## 5 DEF BB BB BB BB BB
## 10 GEF <NA> <NA> BB BB M
Upvotes: 3