user3821211

Reputation: 95

"variable lengths differ" error in aggregate()

I have some data that I would like to summarize:

    studentid friend Gfriend
214  30401006      0       0
236  30401006      0       0
208  30401006      1       0
229  30401006      0       0
207  30401006      0       0
278  30401007      1       0
250  30401007      1       0
266  30401007      1       0
254  30401007      1       1
277  30401007      1       1
243  30401007      1       1

The result should look something like this:

studentid friend Gfriend
 30401006      1       0
 30401007      6       3

When I try

agg = aggregate(c(friend) ~ studentid, data = df, FUN = sum)

I get the required result, but only for the friend variable. When I try

agg = aggregate(c(friend, Gfriend) ~ studentid, data = df, FUN = sum)

I get:

Error in model.frame.default(formula = c(friend, Gfriend) ~ studentid, : variable lengths differ (found for 'studentid')

I checked the lengths of the variables with length(var) and they are all the same, and there are no NAs, so I have no idea where this error is coming from.

Why is this happening?

Upvotes: 3

Views: 4334

Answers (2)

Latrunculia

Reputation: 716

EDIT: added na.rm = TRUE to address the comment about excluding NAs.

Check out the "plyr" package.

library(plyr)

# split by "studentid" and sum all numeric columns

ddply(df, .(studentid), numcolwise(sum, na.rm = TRUE))

studentid friend Gfriend
1  30401006      1       0
2  30401007      6       3
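
As for why the original call fails: c(friend, Gfriend) concatenates the two columns into one vector twice as long as studentid, which is what model.frame reports as "variable lengths differ". A minimal base R sketch, assuming the data frame is named df as in the question, uses cbind() on the left-hand side instead so the response stays a two-column matrix with one row per observation:

```r
# Rebuild the example data from the question
df <- data.frame(
  studentid = c(rep(30401006, 5), rep(30401007, 6)),
  friend    = c(0, 0, 1, 0, 0, 1, 1, 1, 1, 1, 1),
  Gfriend   = c(0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1)
)

# c() glues the columns end to end: 22 values versus 11 studentid values
length(c(df$friend, df$Gfriend))
length(df$studentid)

# cbind() keeps 11 rows and two columns, so the lengths agree
agg <- aggregate(cbind(friend, Gfriend) ~ studentid, data = df, FUN = sum)
agg
```

This should give the same two-row summary as the ddply() call, without needing an extra package.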

Upvotes: 0

rmuc8

Reputation: 2989

You could also try "by":

 studentid <- c(30401006, 30401006, 30401006, 30401006, 30401006, 30401007,
                30401007, 30401007, 30401007, 30401007, 30401007)
 friend <- c(0, 0, 1, 0, 0, 1, 1, 1, 1, 1, 1)
 Gfriend <- c(0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1)
 df <- data.frame(studentid, friend, Gfriend)
 df

 # column sums of friend and Gfriend within each studentid
 result <- by(df[c(2:3)], df$studentid, FUN=colSums)

 result
 df$studentid: 30401006
  friend Gfriend 
       1       0 
 df$studentid: 30401007
  friend Gfriend 
       6       3 
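
The by() result is a list-like object rather than a table. If a single table is wanted, one possible sketch (an addition for illustration, not part of the original approach) is to stack the per-group vectors with do.call(rbind, ...):

```r
# Rebuild the example data
df <- data.frame(
  studentid = c(rep(30401006, 5), rep(30401007, 6)),
  friend    = c(0, 0, 1, 0, 0, 1, 1, 1, 1, 1, 1),
  Gfriend   = c(0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1)
)

# Column sums per studentid, as in the by() call above
result <- by(df[c("friend", "Gfriend")], df$studentid, FUN = colSums)

# Stack the per-group named vectors into one matrix, one row per studentid
summary_matrix <- do.call(rbind, result)
summary_matrix
```

The row names of summary_matrix carry the studentid values, so as.data.frame() can be applied afterwards if a data frame is preferred.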

Upvotes: 0
