Leprechault

Reputation: 1823

Problem rearranging a data frame using melt/cast

I'd like to use the cast function (reshape package) to return a melted data frame to its original state, but it doesn't work. In my example:

# First simulate some data
set.seed(123)
bands <- 5
data <- data.frame(matrix(runif(10 * bands), ncol = bands))
colnames(data) <- paste0(1:bands)
data$nitrogen <- rpois(10, 10)
data$Class <- rep("test", 10)

# Reshape with the melt function
library(reshape)
data2 <- melt(data, id = c("nitrogen", "Class"))

# Return to the original data state again
data3 <- cast(data2, Class + nitrogen ~ variable)
data3
  Class nitrogen 1 2 3 4 5
1  test        4 1 1 1 1 1
2  test        5 1 1 1 1 1
3  test        6 2 2 2 2 2
4  test        8 1 1 1 1 1
5  test       11 1 1 1 1 1
6  test       12 4 4 4 4 4

I expected:

           1          2         3          4         5 nitrogen Class
1  0.2875775 0.95683335 0.8895393 0.96302423 0.1428000        4  test
2  0.7883051 0.45333416 0.6928034 0.90229905 0.4145463        6  test
3  0.4089769 0.67757064 0.6405068 0.69070528 0.4137243        6  test
...
10 0.4566147 0.95450365 0.1471136 0.23162579 0.8578277       11  test

My cast approach doesn't restore the original data object. Can anyone help, please? Thanks.

Upvotes: 0

Views: 38

Answers (1)

Mike O'Brien

Reputation: 156

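The combinations of Class and nitrogen are not unique: there are four rows where Class == "test" and nitrogen == 12, and two rows where Class == "test" and nitrogen == 6. With no aggregating function supplied, cast falls back to counting rows, which is why your output shows counts such as 2 and 4 instead of the original values. From the help documentation: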

If the combination of variables you supply does not uniquely identify one row in the original data set, you will need to supply an aggregating function, fun.aggregate. This function should take a vector of numbers and return a summary statistic(s).

So, cast will aggregate the repeated combinations. You won't be able to get back to the original data unless you put in some sort of dummy variable that makes the combinations unique.
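
To see which id combinations are repeated before casting, a small base-R check along these lines may help. It is only a sketch: the nitrogen values are copied from the output shown in this thread rather than regenerated, and the names counts and dup are made up for illustration.

```r
# nitrogen values as they appear in the thread's data (rows 1 to 10);
# hard-coded here so the check does not depend on the random seed
data <- data.frame(
  nitrogen = c(4, 6, 6, 8, 5, 12, 12, 12, 12, 11),
  Class    = rep("test", 10)
)

# Count how many rows share each Class/nitrogen combination
counts <- as.data.frame(table(data[c("Class", "nitrogen")]))

# Keep only the combinations that occur more than once
counts[counts$Freq > 1, ]

# TRUE for every row that repeats an earlier Class/nitrogen combination
dup <- duplicated(data[c("Class", "nitrogen")])
sum(dup)
```

If sum(dup) is greater than zero, the id variables alone cannot identify a row, and cast has to aggregate.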

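# Add a row identifier so that every id combination is unique
data$dummy <- 1:10
# Re-melt so that dummy is carried along as an id variable
data2 <- melt(data, id = c("nitrogen", "Class", "dummy"))

data3 <- cast(data2, Class + nitrogen + dummy ~ variable)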
data3
   Class nitrogen dummy         1          2         3          4         5
1   test        4     1 0.2875775 0.95683335 0.8895393 0.96302423 0.1428000
2   test        5     5 0.9404673 0.10292468 0.6557058 0.02461368 0.1524447
3   test        6     2 0.7883051 0.45333416 0.6928034 0.90229905 0.4145463
4   test        6     3 0.4089769 0.67757064 0.6405068 0.69070528 0.4137243
5   test        8     4 0.8830174 0.57263340 0.9942698 0.79546742 0.3688455
6   test       11    10 0.4566147 0.95450365 0.1471136 0.23162579 0.8578277
7   test       12     6 0.0455565 0.89982497 0.7085305 0.47779597 0.1388061
8   test       12     7 0.5281055 0.24608773 0.5440660 0.75845954 0.2330341
9   test       12     8 0.8924190 0.04205953 0.5941420 0.21640794 0.4659625
10  test       12     9 0.5514350 0.32792072 0.2891597 0.31818101 0.2659726

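Reorder the columns and drop the dummy if you want the same layout as the original. Note that the rows come back sorted by Class, nitrogen and dummy rather than in their original order.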

data3[,c(4:8, 2, 1)]
           1          2         3          4         5 nitrogen Class
1  0.2875775 0.95683335 0.8895393 0.96302423 0.1428000        4  test
2  0.9404673 0.10292468 0.6557058 0.02461368 0.1524447        5  test
3  0.7883051 0.45333416 0.6928034 0.90229905 0.4145463        6  test
4  0.4089769 0.67757064 0.6405068 0.69070528 0.4137243        6  test
5  0.8830174 0.57263340 0.9942698 0.79546742 0.3688455        8  test
6  0.4566147 0.95450365 0.1471136 0.23162579 0.8578277       11  test
7  0.0455565 0.89982497 0.7085305 0.47779597 0.1388061       12  test
8  0.5281055 0.24608773 0.5440660 0.75845954 0.2330341       12  test
9  0.8924190 0.04205953 0.5941420 0.21640794 0.4659625       12  test
10 0.5514350 0.32792072 0.2891597 0.31818101 0.2659726       12  test
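
If the original row order matters as well, sorting on the dummy column before dropping it should bring it back. The following is a sketch of the full round trip under a few assumptions: the reshape package is installed, the cast result is converted with as.data.frame so that plain data frame indexing applies, and the names keep and restored are made up for illustration.

```r
library(reshape)

# Rebuild the data from the question, plus the row identifier
set.seed(123)
bands <- 5
data <- data.frame(matrix(runif(10 * bands), ncol = bands))
colnames(data) <- paste0(1:bands)
data$nitrogen <- rpois(10, 10)
data$Class <- rep("test", 10)
data$dummy <- 1:10

# Melt with dummy as an id variable, then cast back to wide format
data2 <- melt(data, id = c("nitrogen", "Class", "dummy"))
data3 <- as.data.frame(cast(data2, Class + nitrogen + dummy ~ variable))

# Sort rows by dummy to recover the original order, select the columns
# by name in the original layout, and leave dummy out
keep <- c(as.character(1:bands), "nitrogen", "Class")
restored <- data3[order(data3$dummy), keep]
rownames(restored) <- NULL
restored
```

Selecting columns by name rather than by position keeps the last step working if the number of bands changes.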

Upvotes: 1
