Matt Weller

Reputation: 2754

Use of cast in the reshape package.

I am trying to use the "value=" parameter of cast, but it does not do what I expect. The example below reproduces the problem in a simpler form. What is the "value=" parameter actually for?

I have melted a data frame into long format, with all my factors and a single numeric variable that takes the values 0, 1, 2, 3, 4. I then created a second value column (0 or 1) to refine the original value column. Cast works a treat when I aggregate, but only as long as it is the ORIGINAL value column that is being aggregated.

D = data.frame(id = 1:10,
           grp = rep(c("A","B"),5),
           variable = "var",
           value = rnorm(10,0,1),
           value2 = rnorm(10,10,2))

cast(D, grp~., mean)                    #works fine
cast(D, grp~., value = "value2", mean)  #does not work

If this is not possible then I will have to manipulate my data.

Upvotes: 2

Views: 449

Answers (1)

A5C1D2H2I1M1N2O1R2T1

Reputation: 193517

I don't know for sure, but I think it's because of the following code in cast:

if (any(names(data) == value)) 
    names(data)[names(data) == value] <- "value"

Try not using the word "value" in your variable names, for example names(D)[4:5] = c("one", "two") and then use cast(D, grp ~ ., mean, value="one") and cast(D, grp ~ ., mean, value="two") to get the results you're looking for.
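The suspected effect of those two quoted lines can be sketched in base R, without the reshape package. The helper rename_value_col below is hypothetical: it applies only the renaming step, and it is an assumption (not confirmed from the cast source) that cast afterwards looks the column up by the name "value".

```r
# Toy data with the same column names as in the question.
D <- data.frame(id = 1:10,
                grp = rep(c("A", "B"), 5),
                variable = "var",
                value = rep(c(1, 2), 5),
                value2 = rep(c(3, 4), 5))

# Hypothetical helper: applies only the renaming step quoted above.
rename_value_col <- function(data, value) {
  if (any(names(data) == value))
    names(data)[names(data) == value] <- "value"
  data
}

# With the original names, the data frame ends up with two "value" columns.
clash <- rename_value_col(D, "value2")
names(clash)
clash[["value"]][1:2]   # lookup by name returns the first match (the original column)

# After renaming the columns to "one" and "two", there is no clash.
D_renamed <- D
names(D_renamed)[4:5] <- c("one", "two")
ok <- rename_value_col(D_renamed, "two")
names(ok)
ok[["value"]][1:2]      # the column that was actually requested
```

If cast does select the column by name after this step, the duplicate name would explain why the original "value" column always wins.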

Update

Technically, your data are not fully "molten". See the example below for how you should approach this. It basically involves "melting" your data once again and using subset. (I've replaced the random numbers in "value" and "value2" with fixed values that make it easier to see what's going on.)

D = data.frame(id = 1:10,
               grp = rep(c("A","B"),5),
               variable = "var",
               value = rep(c(1, 2), 5),
               value2 = rep(c(3, 4), 5))
D2 = melt(D, id.vars=1:2, measure.vars=4:5)
cast(D2, grp ~ ., mean, subset=variable=="value")
#   grp (all)
# 1   A     1
# 2   B     2
cast(D2, grp ~ ., mean, subset=variable=="value2")
#   grp (all)
# 1   A     3
# 2   B     4
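As a cross-check that does not depend on reshape, the same group means can be computed with base R's tapply. This is only a sanity check on the toy data above, not a replacement for cast.

```r
# Same toy data as in the example above.
D <- data.frame(id = 1:10,
                grp = rep(c("A", "B"), 5),
                variable = "var",
                value = rep(c(1, 2), 5),
                value2 = rep(c(3, 4), 5))

# Mean of each value column within each group.
means_value  <- tapply(D$value,  D$grp, mean)
means_value2 <- tapply(D$value2, D$grp, mean)

means_value    # A = 1, B = 2
means_value2   # A = 3, B = 4
```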

Update 2

It seems that any time there is a variable named value, that is always the one that cast uses, even if you specify another variable for the value= argument. The "strategy" section of the help file for guess_value (which cast makes use of) describes the following two steps:

  1. Is value or (all) column present? If so, use that
  2. Otherwise, guess that last column is the value column

But, in the few tests I've done, I don't see any way to specify a value= argument successfully without renaming the variables or re-melting the data.
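The two-step strategy quoted from the help file can be sketched as a small base R function. guess_value_sketch is a hypothetical name, and this is a reading of the documented strategy, not the package's actual implementation.

```r
# Hypothetical sketch of the documented strategy, not reshape's own code.
guess_value_sketch <- function(df) {
  # Step 1: if a "value" or "(all)" column is present, use it.
  candidates <- c("value", "(all)")
  hit <- candidates[candidates %in% names(df)]
  if (length(hit) > 0) return(hit[1])
  # Step 2: otherwise guess that the last column is the value column.
  names(df)[ncol(df)]
}

D <- data.frame(id = 1:10,
                grp = rep(c("A", "B"), 5),
                variable = "var",
                value = rep(c(1, 2), 5),
                value2 = rep(c(3, 4), 5))

guess_value_sketch(D)        # "value": step 1 applies, so value2 is never considered

names(D)[4:5] <- c("one", "two")
guess_value_sketch(D)        # "two": step 2 falls back to the last column
```

Under this reading, a column literally named "value" always takes priority, which is consistent with the behaviour seen in the question.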

Upvotes: 1
