Gina Zetkin
Gina Zetkin

Reputation: 333

Nested reshape from wide to long

I keep on getting all sort of error messages when trying to reshape an object into long direction. Toy data:

d <- structure(c(0.204, 0.036, 0.015, 0.013, 0.208, 0.037, 0.015, 
0.006, 0.186, 0.044, 0.016, 0.023, 0.251, 0.044, 0.02, 0.01, 
0.268, 0.04, 0.007, 0.007, 0.208, 0.062, 0.027, 0.036, 0.272, 
0.054, 0.006, 0.01, 0.274, 0.05, 0.011, 0.006, 0.28, 0.039, 0.007, 
0.019, 1.93, 0.345, 0.087, 0.094, 2.007, 0.341, 0.064, 0.061, 
1.733, 0.39, 0.131, 0.201, 0.094, 0.01, 0.004, 0, 0.096, 0.014, 
0, 0.001, 0.081, 0.016, 0.002, 0.016, 0.062, 0.007, 0.011, 0.001, 
0.07, 0.003, 0.005, 0.002, 0.043, 0.033, 0, 0.007, 0.081, 0.039, 
0.007, 0, 0.085, 0.033, 0.008, 0, 0.086, 0.023, 0.007, 0.007, 
0.083, 0.015, 0, 0, 0.09, 0.009, 0, 0, 0.049, 0.052, 0, 0.025, 
2.779, 0.203, 0.098, 0.016, 2.801, 0.242, 0.135, 0.01, 2.12, 
0.466, 0.177, 0.121, 1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 0, 1, 
2, 3, 0, 1, 2, 3, 0, 1, 2, 3), .Dim = c(12L, 11L), .Dimnames = list(
    c("0", "1", "2", "3", "0", "1", "2", "3", "0", "1", "2", 
    "3"), c("age_77", "age_78", "age_79", "age_80", "age_81", 
    "age_82", "age_83", "age_84", "age_85", "item", "k")))

Basically I have different ages, for which 3 items have been reported with four response categories each. I would like to obtain a long-shaped object with colnames = age, item, k, proportion, like this:

  structure(c(77, 77, 77, 77, 77, 77, 77, 77, 77, 77, 77, 77, 78,
78, 78, 78, 78, 78, 78, 78, 78, 78, 78, 78, 1, 1, 1, 1, 2, 2, 
2, 2, 3, 3, 3, 3, 1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 0, 1, 2, 
3, 0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3, 
0.204, 0.036, 0.015, 0.013, 0.208, 0.037, 0.015, 0.006, 0.186, 
0.044, 0.016, 0.023, 0.251, 0.044, 0.02, 0.01, 0.268, 0.04, 0.007, 
0.007, 0.208, 0.062, 0.027, 0.036), .Dim = c(24L, 4L), .Dimnames = list(
    c("0", "1", "2", "3", "0", "1", "2", "3", "0", "1", "2", 
    "3", "0", "1", "2", "3", "0", "1", "2", "3", "0", "1", "2", 
    "3"), c("age", "item", "k", "proportion")))

An example I tried:

reshape(as.data.frame(d), varying =1:9, sep = "_", direction = "long",
                        times = "k", idvar = "item")
Error in `row.names<-.data.frame`(`*tmp*`, value = paste(ids, times[i],  : 

duplicate 'row.names' are not allowed

Any clue where's my mistake? Thanks a lot beforehand!

Upvotes: 0

Views: 177

Answers (1)

Uwe
Uwe

Reputation: 42544

The object d as provided by the OP is not a data.frame but a matrix which is causing the error:

str(d)
 num [1:12, 1:11] 0.204 0.036 0.015 0.013 0.208 0.037 0.015 0.006 0.186 0.044 ...
 - attr(*, "dimnames")=List of 2
  ..$ : chr [1:12] "0" "1" "2" "3" ...
  ..$ : chr [1:11] "age_77" "age_78" "age_79" "age_80" ...

In addition, the row numbers are not unique which causes an error as well when coercing d to data.frame.

With data.table, d can be coerced to a data.table object and reshaped from wide to long format using melt(). Finally, age is extracted from the column names and stored as integer values as requested by the OP.

library(data.table)
melt(as.data.table(d), measure.vars = patterns("^age_"),
     variable.name = "age", value.name = "proportion")[
  , age := as.integer(stringr::str_replace(age, "age_", ""))][]
     item k age proportion
  1:    1 0  77      0.204
  2:    1 1  77      0.036
  3:    1 2  77      0.015
  4:    1 3  77      0.013
  5:    2 0  77      0.208
 ---                      
104:    2 3  85      0.010
105:    3 0  85      2.120
106:    3 1  85      0.466
107:    3 2  85      0.177
108:    3 3  85      0.121

Upvotes: 2

Related Questions