user1378122

Reputation: 105

simple data.frame reshape

I've just come back to R after a long hiatus and I'm having some real problems remembering how to reshape data. I know that what I want to do is easy, but for some reason I'm being dumb tonight and have confused myself with melt and reshape. If anyone could quickly point me in the right direction, it would be hugely appreciated.

I have a data frame like this:

person    week    year   
personA   6       1
personA   22      1
personA   41      1
personA   42      1
personA   1       2
personA   23      2
personB   8       2
personB   9       2
....
personN   x       y

I want to end up with a count of events by year and by person (so that I can plot a quick line graph for each person over the years):

e.g.

person    year1    year2
personA   4        2
personB   0        2

Many thanks for reading.

Upvotes: 7

Views: 374

Answers (3)

IRTFM

Reputation: 263301

xtabs from base R works very well for this problem:

dat <- read.table(text="person    week    year   
personA   6       1
personA   22      1
personA   41      1
personA   42      1
personA   1       2
personA   23      2
personB   8       2
personB   9       2
", header=TRUE)
xtabs(~person+year, data=dat)
#-----------------
         year
person    1 2
  personA 4 2
  personB 0 2
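If you need an actual data frame with the person / year1 / year2 layout shown in the question rather than a table object, one possible sketch is below. The names `counts` and `wide` are just illustrative, and `as.data.frame.matrix` is only one of several ways to do the conversion.

```r
# Rebuild the example data from the question
dat <- read.table(text = "person week year
personA 6 1
personA 22 1
personA 41 1
personA 42 1
personA 1 2
personA 23 2
personB 8 2
personB 9 2
", header = TRUE)

# Count rows for every person/year combination
counts <- xtabs(~ person + year, data = dat)

# Turn the 2-d table into a wide data frame, one row per person
wide <- as.data.frame.matrix(counts)
names(wide) <- paste0("year", names(wide))    # columns become year1, year2
wide <- cbind(person = rownames(wide), wide)  # move the row names into a column
rownames(wide) <- NULL
wide
```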

You could pass the xtabs output to matplot, since xtabs returns a table/matrix object:

matplot( xtabs(~person+year, data=dat))

The output x-axis on this tiny example might not be what you want but with more years, there might be a more satisfactory default axis labeling. Or you could suppress the default x-axis labels with xaxt="n" and use axis to label as you wish:

matplot(  xtabs(~person+year, data=dat), xaxt="n", type="b")
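Since matplot draws one line per column, a variation that gives one line per person (which seems to be what the question is after) is to transpose the table first and then label the axis by hand. This is only a sketch; the axis labels, legend position and the name `yearly` are my own choices, not anything required by matplot.

```r
# Rebuild the example data from the question
dat <- read.table(text = "person week year
personA 6 1
personA 22 1
personA 41 1
personA 42 1
personA 1 2
personA 23 2
personB 8 2
personB 9 2
", header = TRUE)

counts <- xtabs(~ person + year, data = dat)

# Transpose so that rows are years and columns are persons
yearly <- t(counts)

# One line per person; suppress the default x-axis and draw our own
matplot(yearly, type = "b", lty = 1, pch = 1, xaxt = "n",
        xlab = "year", ylab = "number of events")
axis(1, at = seq_len(nrow(yearly)), labels = rownames(yearly))

# matplot cycles through colours 1, 2, ... by default, so reuse them here
legend("topright", legend = colnames(yearly),
       col = seq_len(ncol(yearly)), lty = 1, pch = 1)
```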

Upvotes: 5

Ernest A

Reputation: 7839

In this case, you can simply use tapply:

> with(dat, tapply(week, list(person=person, year=year), length))
         year
person     1 2
  personA  4 2
  personB NA 2

The result is a matrix. This solution produces NAs if there are empty cells.
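If you would rather have zeros than NAs for the empty cells, a minimal sketch is to overwrite them after the fact (`counts` is just an illustrative name, and `dat` is assumed to hold the example data from the question):

```r
# Rebuild the example data from the question
dat <- read.table(text = "person week year
personA 6 1
personA 22 1
personA 41 1
personA 42 1
personA 1 2
personA 23 2
personB 8 2
personB 9 2
", header = TRUE)

# Count events per person and year; empty combinations come back as NA
counts <- with(dat, tapply(week, list(person = person, year = year), length))

# An empty cell means no events, so replace NA with 0
counts[is.na(counts)] <- 0
counts
```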

Upvotes: 7

Chase

Reputation: 69151

I would probably use the reshape2 package and its dcast function, since it handles both the reshaping and the aggregation in one step:

> library(reshape2)
> dcast(person ~ year, value.var = "year", data = dat)
Aggregation function missing: defaulting to length
   person 1 2
1 personA 4 2
2 personB 0 2

Upvotes: 8
