a barking spider

Reputation: 633

How do I reshape these data in R?

I'm working with a data frame that contains groups of repeated observations indexed by an id, like so:

id   x1   x2   y1   y2
1    a    b    c    2
1    a    b    d    3
1    a    b    e    4
2    ...
2    ...
...

That is, all the variables within each group are identical except y1 and y2 (generally speaking, y2 'modifies' y1). All of the variables listed here are factors. What I'd like to do is turn each of these groups into something like the following:

id   x1   x2   y1'   y2'   y3'
1    a    b    c-2   d-3   e-4
2    ...

where the primed columns (y1', y2', ...) are concatenations of the y1 and y2 values from the same row, with a dash in between. The number of y1 values differs from one id group to the next, but I'd be happy with a very wide data frame that leaves room for the extras. I've tried (rather futilely, I must confess) melting and casting these data with reshape2, but at this point I'm not sure whether I'm going about it the wrong way or whether that package just isn't a fit for what I'm trying to do. Any advice would be appreciated. Thanks!

Upvotes: 2

Views: 181

Answers (3)

VitoshKa

Reputation: 8533

Yes, this is what the reshape package is for. First, prepare your data:

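library(reshape)  # melt() and cast() come from reshape
foo <- transform(foo,
                 y = paste(y1, y2, sep = "-"),
                 # within-id counter; assumes rows are already sorted by id
                 ix = unlist(tapply(id, id, function(gr) seq_along(gr))))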
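
A caveat on the ix column: the tapply version returns the counters in group order, so it lines up with the rows only when foo is already sorted by id. As a small sketch on a made-up id vector (not the data from the question), ave() is one base R way to get the same counter in row order:

```r
# Hypothetical, deliberately unsorted ids (illustration only)
id <- c(2, 1, 2, 1, 1)

# Counters come back grouped by id value (all of id 1, then all of id 2)
ix_tapply <- unlist(tapply(id, id, function(gr) seq_along(gr)))

# Counters come back in the original row order
ix_ave <- ave(seq_along(id), id, FUN = seq_along)

ix_tapply
ix_ave
```

If the data are sorted by id, the two give the same values.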

Then proceed with your transform:

mfoo <- melt(foo, measure.vars = "y")
cast(mfoo, id + x1 + x2 ~ variable + ix)

Should give

  id x1 x2 y_1 y_2  y_3
1  1  a  b c-2 d-3  e-4
2  2  a  b f-2 g-3 <NA>

with the data set

foo <- read.table(textConnection("id  x1  x2  y1  y2
1    a    b    c    2
1    a    b    d    3
1    a    b    e    4
2    a    b    f    2
2    a    b    g    3"),header=TRUE)

[edit: this uses reshape; with reshape2, use dcast instead of cast]
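
For the reshape2 case mentioned in the edit, something along these lines might work. It is only a sketch, assuming reshape2 is installed; it skips melt() and casts the pasted column directly, and the final renaming step is my own addition to get y1, y2, ... column names:

```r
library(reshape2)  # assumption: reshape2 is installed

foo <- read.table(text = "id x1 x2 y1 y2
1 a b c 2
1 a b d 3
1 a b e 4
2 a b f 2
2 a b g 3", header = TRUE, stringsAsFactors = FALSE)

# Paste y1 and y2 together and number the rows within each id
foo$y  <- paste(foo$y1, foo$y2, sep = "-")
foo$ix <- ave(seq_along(foo$id), foo$id, FUN = seq_along)

# One row per id/x1/x2, one column per within-id position
wide <- dcast(foo, id + x1 + x2 ~ ix, value.var = "y")

# dcast names the new columns "1", "2", ...; prefix them with "y"
names(wide)[-(1:3)] <- paste0("y", names(wide)[-(1:3)])
wide
```

Groups with fewer rows are padded with NA in the trailing columns.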

Upvotes: 0

Tyler Rinker

Reputation: 110034

I saw Sacha's answer and thought I'd try extending it to a longer data set. It's not entirely clear to me what you're after, so I'm not sure this gives the result you want, but here is my attempt:

foo <- read.table(textConnection("id  x1  x2  y1  y2
1    a    b    c    2
1    a    b    d    3
1    a    b    e    4
2    a    b    f    2
2    a    b    g    3
2    a    b    h    4"),header=TRUE)


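# Paste y1 and y2 together, build a group key (time.var), then drop y2 (column 5)
new <- transform(foo, time.var=paste(id, x1, x2, sep=""),
    y1=paste(y1, y2, sep="-"))[, -5]

# One row per group; the y1..y3 names assume every group has exactly 3 rows
new <- data.frame(unique(foo[, 1:3]), t(unstack(new[, 4:5])))
names(new)[4:6] <- paste("y", 1:3, sep="")
new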

That said, I think Sacha's answer does the same as mine if you group by id along with x1 and x2 (I'm guessing this may be more generalizable):

library("plyr")

ddply(foo, .(id, x1, x2), with, {
        res <- data.frame(
          id = id[1],
          x1 = x1[1],
          x2 = x2[1])
        for (i in seq_along(y1))
        {
          res[[paste("y",i,sep="")]] <- paste(y1,y2,sep="-")[i]
        }
        return(res)
      }
    )

EDIT: This solution may be more generalizable:

new <- transform(foo, y=paste(y1, y2, sep="-"), stringsAsFactors=FALSE)
aggregate(y~id+x1+x2, new, c)
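
Note that aggregate() keeps the pasted values together in a single y column rather than spreading them over separate y1, y2, ... columns. If the wide layout from the question is what's wanted, base R's reshape() is one possible route. This is a rough sketch rather than a tested general solution; the ix counter column is a helper I'm introducing here:

```r
foo <- read.table(text = "id x1 x2 y1 y2
1 a b c 2
1 a b d 3
1 a b e 4
2 a b f 2
2 a b g 3", header = TRUE, stringsAsFactors = FALSE)

# Paste y1 and y2 together and number the rows within each id
foo$y  <- paste(foo$y1, foo$y2, sep = "-")
foo$ix <- ave(seq_along(foo$id), foo$id, FUN = seq_along)

# Spread y over columns y1, y2, ...; shorter groups are padded with NA
wide <- reshape(foo[, c("id", "x1", "x2", "ix", "y")],
                idvar = c("id", "x1", "x2"),
                timevar = "ix",
                direction = "wide",
                sep = "")
wide
```

With sep = "" the new columns are named by pasting "y" and the ix value together, which gives y1, y2, y3 here.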

Upvotes: 1

Sacha Epskamp

Reputation: 47612

If I understand the question correctly, here is a way to do it with plyr:

foo <- read.table(textConnection("id  x1  x2  y1  y2
1    a    b    c    2
1    a    b    d    3
1    a    b    e    4"),header=TRUE)


library("plyr")

ddply(foo,.(x1,x2),with,{
        res <- data.frame(
          id = id[1],
          x1 = x1[1],
          x2 = x2[1])
        for (i in 1:length(y1))
        {
          res[[paste("y",i,sep="")]] <- paste(y1,y2,sep="-")[i]
        }
        return(res)
      }
    )

This returns:

  id x1 x2  y1  y2  y3
1  1  a  b c-2 d-3 e-4

Upvotes: 1
