Brandon Bertelsen

Reputation: 44658

reshape: cast oddity

Either it's late, or I've found a bug, or cast doesn't like colnames with "." in them. This all happens inside a function, but it fails the same way outside of one.

x <- structure(list(df.q6 = structure(c(1L, 1L, 1L, 11L, 11L, 9L, 
4L, 11L, 1L, 1L, 2L, 2L, 11L, 5L, 4L, 9L, 4L, 4L, 1L, 9L, 4L, 
10L, 1L, 11L, 9L), .Label = c("a", "b", "c", "d", "e", "f", "g", 
"h", "i", "j", "k"), class = "factor"), df.s5 = structure(c(4L, 
4L, 1L, 2L, 4L, 4L, 4L, 3L, 4L, 1L, 2L, 1L, 2L, 4L, 1L, 3L, 4L, 
2L, 2L, 4L, 4L, 4L, 2L, 2L, 1L), .Label = c("a", "b", "c", "d", 
"e"), class = "factor")), .Names = c("df.q6", "df.s5"), row.names = c(NA, 
25L), class = "data.frame")

cast(x, df.q6 + df.s5 ~., length)

This doesn't work.

However, if:

colnames(x) <- c("variable", "value")
cast(x, variable + value ~., length)

Works like a charm.

Upvotes: 2

Views: 1570

Answers (3)

Spacedman

Reputation: 94222

Nothing to do with the dots in the colnames (easily shown!).

If your data frame doesn't have a column called 'value', then cast() guesses which column holds the values. In this case it guesses 'df.s5' because it is the last column, which is where the value column ends up when you melt() data. It then renames that column to 'value' before calling reshape1. Now the column 'df.s5' no longer exists, yet it's still there on the left of your formula. Uh oh.
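To make the renaming step concrete, here is a minimal base-R sketch of the behaviour described above. The helper `rename_guessed_value` is hypothetical and only imitates the description; it is not the reshape package's actual code, and the two-row data frame is a made-up stand-in for the question's `x`.

```r
# Hypothetical helper: imitates "guess the last column and rename it to 'value'".
# This is an illustration only, NOT code from the reshape package.
rename_guessed_value <- function(df) {
  if (!"value" %in% names(df)) {
    guessed <- names(df)[ncol(df)]             # the last column is the guess
    names(df)[names(df) == guessed] <- "value"
  }
  df
}

# Made-up stand-in with the same column names as the question's data frame
x <- data.frame(df.q6 = c("a", "b"), df.s5 = c("d", "a"))
renamed <- rename_guessed_value(x)

# Variables the formula asks for, ignoring the "." placeholder
wanted <- setdiff(all.vars(df.q6 + df.s5 ~ .), ".")

# Variables the formula needs but the renamed data frame no longer has
missing <- setdiff(wanted, names(renamed))

print(names(renamed))   # "df.q6" "value"
print(missing)          # "df.s5"
```

Once a column named 'value' already exists, the helper leaves the data frame alone, which is why renaming the columns to "variable" and "value" in the question made the call work.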

You are using the value in the formula, which is an odd thing to do. None of the cast examples do that. What are you trying to do here?

You could add an ad-hoc column as a dummy value:

> cast(cbind(x, 1), df.q6 + df.s5 ~ ., length)

Using 1 as value column. Use the value argument to cast to override this choice

   df.q6 df.s5 (all)
1      a     a     2
2      a     b     2
3      a     d     3
4      b     a     1
5      b     b     1
[etc]

But I suspect there's a better way to get the number of repeated observations (rows) in a data frame - which is your real question!
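One possible base-R route to those counts, as a minimal sketch that needs no reshape at all: cross-tabulate the two columns with table() and flatten the result. The small data frame below is a made-up stand-in for the question's `x`, and the column name `n` is just an illustrative choice.

```r
# Made-up stand-in for the question's data frame (same column names)
x <- data.frame(
  df.q6 = factor(c("a", "a", "a", "b", "k", "k")),
  df.s5 = factor(c("a", "a", "d", "b", "b", "b"))
)

# Cross-tabulate the two columns, then flatten the table to long form;
# responseName sets the name of the count column
counts <- as.data.frame(table(x), responseName = "n")

# table() includes every combination of levels, so drop the ones that never occur
counts <- counts[counts$n > 0, ]

print(counts)
```

Each remaining row is one observed combination of df.q6 and df.s5 together with the number of rows that share it.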

Upvotes: 3

Jay

Reputation: 3017

I use a solution similar to the one Spacedman points out.

#take your data.frame x with its two columns

#add a column
x$value <- 1

#apply your cast verbatim
cast(x, df.q6 + df.s5 ~., length)

   df.q6 df.s5 (all)
1      a     a     2
2      a     b     2
3      a     d     3
4      b     a     1
5      b     b     1
6      d     a     1
7      d     b     1
8      d     d     3
9      e     d     1
10     i     a     1
11     i     c     1
12     i     d     2
13     j     d     1
14     k     b     3
15     k     c     1
16     k     d     1

Hopefully that helps!

Jay

Upvotes: 4

kohske

Reputation: 66862

If you are looking for an easy solution, dcast in the reshape2 package can help you:

library(reshape2)
dcast(x, df.q6 + df.s5 ~., length)

Upvotes: 2
