Reputation: 1409
I'm using the cast
function to create a wide formatted data frame. I'd like to be able to control the ordering of the columns that result from using cast
. Is this possible?
In the example below, they are order: cogs_xdep, sales, sga
I would like to order them: sales, cogs_xdep, sga
The whole process actually starts with a wide formatted frame that I use melt
to change into a long format before the cast.
An example is below
>rawdata_all
coy_name gvkey_iid factor X20130930 X20130831 X20130731 X20130630 X20130531 X20130430 X20130331 X20130228 X20130131 X20121231
1 Coy 1 111111 sales 1 2 3 4 5 6 7 8 9 10
2 Coy 2 22222 sales 2 12 22 32 42 52 62 72 82 92
3 Coy 3 333333 sales 3 103 203 303 403 503 603 703 803 903
4 Coy 1 111111 cogs_xdep 4 5 6 7 8 9 10 11 12 13
5 Coy 2 22222 cogs_xdep 5 15 25 35 45 55 65 75 85 95
6 Coy 3 333333 cogs_xdep 6 106 206 306 406 506 606 706 806 906
7 Coy 1 111111 sga 7 8 9 10 11 12 13 14 15 16
8 Coy 2 22222 sga 8 18 28 38 48 58 68 78 88 98
9 Coy 3 333333 sga 9 109 209 309 409 509 609 709 809 909
...
Melt to put it in long format
> non_data_cols <- 3 # There are 3 non-value columns
> master_long <- melt(rawdata_all, id=1:non_data_cols,measured=(non_data_cols+1):length(rawdata_all))
> master_long
coy_name gvkey_iid factor variable value
1 Coy 1 111111 sales X20130930 1
2 Coy 2 22222 sales X20130930 2
3 Coy 3 333333 sales X20130930 3
4 Coy 1 111111 cogs_xdep X20130930 4
5 Coy 2 22222 cogs_xdep X20130930 5
6 Coy 3 333333 cogs_xdep X20130930 6
7 Coy 1 111111 sga X20130930 7
8 Coy 2 22222 sga X20130930 8
...
Finally cast on 'factor' to create wider data frame. (I also renamed 'variable' to 'date' and removed the 'X' from the start of the date values).
> master <- cast(master_long, ...~factor)
coy_name gvkey_iid date cogs_xdep sales sga
1 Coy 1 111111 20130930 4 1 7
2 Coy 1 111111 20130831 5 2 8
3 Coy 1 111111 20130731 6 3 9
4 Coy 1 111111 20130630 7 4 10
5 Coy 1 111111 20130531 8 5 11
6 Coy 1 111111 20130430 9 6 12
7 Coy 1 111111 20130331 10 7 13
8 Coy 1 111111 20130228 11 8 14
9 Coy 1 111111 20130131 12 9 15
10 Coy 1 111111 20121231 13 10 16
...
I would ideally like to have the final 3 columns in the following order: sales, cogs_xdep, sga. cast
appears to have arranged them alphabetically as you can see they are ordered in the desired way in both the original data frame and the long formatted data frame.
Any suggestions would be greatly appreciated. While it is easier enough to rearrange the columns with only 3, it's more cumbersome in the real situation of 30+ columns.
Thanks,
Upvotes: 3
Views: 4921
Reputation: 946
I found here: dcast - order factors how you can tell dcast do not change the order of factors alphabetically.
dff<-structure(list(ID_no = c(2, 2, 2, 2, 2, 29, 29, 29, 29, 29),
Variable = structure(c(1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L),
.Label = c("q1", "q2" ), class = "factor"),
Levels = structure(c(1L, 2L, 3L, 4L, 5L, 1L, 2L, 3L, 4L, 5L),
.Label = c("vf1", "df2", "fd3", "re4", "fr5" ), class = "factor"),
Frequency = c(2, 15, 41, 28, 20, 6, 0, 5, 27, 82),
Percentage = c(1.89, 14.15, 38.68, 26.42, 18.87, 5, 0, 4.17, 22.5, 68.33)),
.Names = c("ID_no", "Variable", "Levels", "Frequency", "Percentage"),
row.names = c(NA, -10L), class = c("data.table", "data.frame"))
ID_no Variable Levels Frequency Percentage
1: 2 q1 vf1 2 1.89
2: 2 q1 df2 15 14.15
3: 2 q1 fd3 41 38.68
4: 2 q1 re4 28 26.42
5: 2 q1 fr5 20 18.87
6: 29 q2 vf1 6 5.00
7: 29 q2 df2 0 0.00
8: 29 q2 fd3 5 4.17
9: 29 q2 re4 27 22.50
10: 29 q2 fr5 82 68.33
dcast(setDT(dff), Variable~factor(Levels, levels=unique(Levels)), value.var=c("Percentage"))
Some random factor levels, with their number to indicate their order.
Variable vf1 df2 fd3 re4 fr5
1: q1 1.89 14.15 38.68 26.42 18.87
2: q2 5.00 0.00 4.17 22.50 68.33
Hope this help.
Upvotes: 0
Reputation: 132746
I don't see a reason, why column order should matter. However, you can always change the order like this:
master[,c(names(master)[1:3], as.character(unique(master_long$factor)))]
coy_name gvkey_iid variable sales cogs_xdep sga
1 Coy_1 111111 X20130930 1 4 7
2 Coy_1 111111 X20130831 2 5 8
3 Coy_1 111111 X20130731 3 6 9
4 Coy_1 111111 X20130630 4 7 10
5 Coy_1 111111 X20130531 5 8 11
6 Coy_1 111111 X20130430 6 9 12
7 Coy_1 111111 X20130331 7 10 13
8 Coy_1 111111 X20130228 8 11 14
9 Coy_1 111111 X20130131 9 12 15
10 Coy_1 111111 X20121231 10 13 16
11 Coy_2 22222 X20130930 2 5 8
12 Coy_2 22222 X20130831 12 15 18
13 Coy_2 22222 X20130731 22 25 28
14 Coy_2 22222 X20130630 32 35 38
15 Coy_2 22222 X20130531 42 45 48
16 Coy_2 22222 X20130430 52 55 58
17 Coy_2 22222 X20130331 62 65 68
18 Coy_2 22222 X20130228 72 75 78
19 Coy_2 22222 X20130131 82 85 88
20 Coy_2 22222 X20121231 92 95 98
21 Coy_3 333333 X20130930 3 6 9
22 Coy_3 333333 X20130831 103 106 109
23 Coy_3 333333 X20130731 203 206 209
24 Coy_3 333333 X20130630 303 306 309
25 Coy_3 333333 X20130531 403 406 409
26 Coy_3 333333 X20130430 503 506 509
27 Coy_3 333333 X20130331 603 606 609
28 Coy_3 333333 X20130228 703 706 709
29 Coy_3 333333 X20130131 803 806 809
30 Coy_3 333333 X20121231 903 906 909
Note that package reshape2 has mostly superseded reshape.
Upvotes: 2