ego_

Reputation: 1491

Alter output of ddply

Is it possible to alter the output of ddply? I wondered if it was possible to present the unique results for a subset on ONE row instead of giving each result a new row. E.g.

ID   Season  Year
5074 Summer 2008
5074 Summer 2009
5074 Winter 2008
5074 Winter 2009
5074 Winter 2010

Into...

ID   Season  Year  
5074 Summer  2008,2009  
5074 Winter  2008,2009,2010  

I often use ddply to manually check the results of for-loops etc., and presenting the results like this would shorten the output and make the check go much faster.

Cheers!

Upvotes: 3

Views: 180

Answers (3)

mnel

Reputation: 115392

This is a perfect fit for the new, nicer printing of list columns in data.table version 1.8.2:

library(data.table)
DT <- as.data.table(dd)
DT[,list(Year = list(Year)), by = list(ID, Season)]
##     ID Season           Year
## 1: 5074 Summer      2008,2009
## 2: 5074 Winter 2008,2009,2010

The good thing about results in this format is that only the printing is affected. You can still access the values without any string splitting:

DT[(ID==5074)&(Season == 'Summer'), Year]
## [1] 2008 2009
DT[(ID==5074)&(Season == 'Winter'), Year]
## [1] 2008 2009 2010
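To make the "no string splitting" point concrete, here is a minimal sketch that stores the grouped result and reads the years back as ordinary numbers. It assumes data.table is installed; the names `agg` and `summer` are hypothetical and only used for illustration.

```r
library(data.table)

# Same sample data as in the question
DT <- as.data.table(read.table(text = "ID Season Year
5074 Summer 2008
5074 Summer 2009
5074 Winter 2008
5074 Winter 2009
5074 Winter 2010", header = TRUE))

# Grouped result with Year stored as a list column (hypothetical name `agg`)
agg <- DT[, list(Year = list(Year)), by = list(ID, Season)]

# Each cell of the list column is still an integer vector,
# so it can be pulled out with [[ ]] and used directly
summer <- agg[ID == 5074 & Season == "Summer", Year[[1]]]
print(summer)

# Number of years recorded per group
print(lengths(agg$Year))
```

Because the column holds vectors rather than pasted strings, numeric checks such as `max(summer)` work without converting anything back.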

Upvotes: 2

csgillespie

Reputation: 60462

First, load the data:

dd = read.table(textConnection("ID   Season  Year
5074 Summer 2008
5074 Summer 2009
5074 Winter 2008
5074 Winter 2009
5074 Winter 2010"), header=TRUE)

Then use ddply as normal, splitting by ID and Season:

library(plyr)  # provides ddply
ddply(dd, .(ID, Season), summarise, Year=paste(Year, collapse=","))

We use the collapse argument of paste to return a single character string per group. Since you want to use this as a check, it might be worth sorting Year first, i.e.

paste(sort(Year), collapse=",")
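Putting the pieces together, the sketch below runs the whole thing end to end. It assumes plyr is installed; adding `unique()` is my own assumption based on the question asking for unique results, and the name `res` is hypothetical.

```r
library(plyr)

# Same sample data as in the question
dd <- read.table(text = "ID Season Year
5074 Summer 2008
5074 Summer 2009
5074 Winter 2008
5074 Winter 2009
5074 Winter 2010", header = TRUE)

# One row per ID/Season, with the sorted, de-duplicated years
# collapsed into a single comma-separated string
res <- ddply(dd, .(ID, Season), summarise,
             Year = paste(sort(unique(Year)), collapse = ","))
print(res)
```

Note that Year is now a character column, so this form is best suited to eyeballing output rather than further computation.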

Upvotes: 7

Sven Hohenstein

Reputation: 81683

dat <- read.table(text="ID Season Year
 5074 Summer 2008
 5074 Summer 2009
 5074 Winter 2008
 5074 Winter 2009
 5074 Winter 2010", header = TRUE)

The output can be transformed using aggregate:

aggregate(Year ~ ID + Season, data = dat, paste)
#    ID Season             Year
#1 5074 Summer       2008, 2009
#2 5074 Winter 2008, 2009, 2010
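With plain `paste` and no `collapse`, each Year cell holds a vector of separate strings rather than one string. If a single comma-separated string per row is wanted, a possible variant is sketched below using base R only; the name `res` and the added `sort()` are assumptions for illustration.

```r
# Same sample data as in the question
dat <- read.table(text = "ID Season Year
5074 Summer 2008
5074 Summer 2009
5074 Winter 2008
5074 Winter 2009
5074 Winter 2010", header = TRUE)

# Collapse the sorted years of each ID/Season group into one string
res <- aggregate(Year ~ ID + Season, data = dat,
                 FUN = function(y) paste(sort(y), collapse = ","))
print(res)

# Year is now an ordinary character column, one string per row
print(class(res$Year))
```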

Upvotes: 3
