Rodrigo

Reputation: 5129

How to aggregate in R ignoring some rows for some fields, and not ignoring them for others?

city qA qB qC
0001  1  1  5
0001  3  1  3
0002  2  0 NA
0002  2  0 NA
0002  4  1  1
0002  4  1  3

I'd like to aggregate this table by city, taking the mean of each of the other fields. As you can see, question C is only answered if question B is 1. What I want as a result is:

city qA qB  qC
0001  2  1   4
0002  3 0.5  2

I tried removing the rows with qB==0, but that changes the mean for qA. Any ideas? Thanks in advance!

Upvotes: 0

Views: 962

Answers (2)

asb

Reputation: 4432

It is quite simple actually:

aggregate(xx[-1], by=list(xx$city), FUN=mean, na.rm=TRUE)
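
To try this on the data from the question, here is a minimal sketch. It assumes the table has been read into a data frame called xx (the name used above) with city stored as character. Naming the grouping list element "city" is my own addition; without it the grouping column comes back as Group.1.

```r
# Rebuild the question's table; `xx` is the name this answer assumes.
xx <- data.frame(
  city = c("0001", "0001", "0002", "0002", "0002", "0002"),
  qA   = c(1, 3, 2, 2, 4, 4),
  qB   = c(1, 1, 0, 0, 1, 1),
  qC   = c(5, 3, NA, NA, 1, 3)
)

# The data-frame method keeps rows with NAs; na.rm = TRUE is handed on to
# mean(), which then skips the NAs column by column.
res <- aggregate(xx[-1], by = list(city = xx$city), FUN = mean, na.rm = TRUE)
print(res)
```

For city 0002 this should give qA = 3, qB = 0.5 and qC = 2, i.e. the rows with qB == 0 still count towards qA and qB but not towards qC.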

Upvotes: 1

Hong Ooi

Reputation: 57686

Use aggregate's formula interface with both of the arguments na.action=na.pass and na.rm=TRUE. The former tells aggregate not to drop rows that contain NAs before grouping (the formula method's default is na.omit); the latter is passed on to mean, so that it skips the NAs within each column.

aggregate(cbind(qA, qB, qC) ~ city, df, mean, na.action=na.pass, na.rm=TRUE)
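
To see why na.action matters, here is a rough sketch comparing the default behaviour with the call above. The data frame df is rebuilt from the question's table (city as character is an assumption), and the names dropped and kept are just for illustration.

```r
# Rebuild the question's table as `df`.
df <- data.frame(
  city = c("0001", "0001", "0002", "0002", "0002", "0002"),
  qA   = c(1, 3, 2, 2, 4, 4),
  qB   = c(1, 1, 0, 0, 1, 1),
  qC   = c(5, 3, NA, NA, 1, 3)
)

# Default na.action (na.omit): the two rows with qC = NA are removed before
# grouping, so they no longer contribute to qA or qB either.
dropped <- aggregate(cbind(qA, qB, qC) ~ city, df, mean)

# na.pass keeps every row; na.rm = TRUE makes mean() skip NAs per column.
kept <- aggregate(cbind(qA, qB, qC) ~ city, df, mean,
                  na.action = na.pass, na.rm = TRUE)

print(dropped)
print(kept)
```

With the default, city 0002 comes out as qA = 4 and qB = 1, which is the same problem as removing the qB==0 rows by hand. With na.pass it comes out as qA = 3, qB = 0.5, qC = 2, as requested.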

Upvotes: 3
