Reputation: 1
I would like to generate a summary (mean) of a subgroup of a dataset in R using the tapply function. The dataset is "VehicleData". I would like to calculate the mean of the response variable, "HWY_MPG", after the data has been grouped by two factors, "Type" and "Drive". There are some missing values in the dataset, so I passed na.rm=T as an argument. However, after applying the function, NAs were still returned. How do I go about this?
tapply(VehicleData$HWY_MPG,list(VehicleData$Type,VehicleData$Drive),mean,na.rm=T)
                4wd    Front     Rear
Car        25.17382 30.68226 24.37903
Minivan    23.26471 24.28902       NA
Pickup     18.82911       NA 21.21270
St.Wagon   26.46635 29.86416 25.61538
SUV        20.60339 26.55390 20.51227
Two_Seater 18.55882 50.26316 24.56571
Van        17.66667       NA 18.38991
Upvotes: 0
Views: 672
Reputation: 1
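tapply is safer with na.rm=TRUE spelled out. na.rm=T can fail, because T is an ordinary variable that only defaults to TRUE and may have been reassigned in your session, whereas TRUE is a reserved word. Try the following: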
tapply(VehicleData$HWY_MPG,list(VehicleData$Type,VehicleData$Drive),mean,na.rm=TRUE)
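If NAs still show up after that, two things may be worth checking. The sketch below uses a small made-up data frame called toy (hypothetical values, not the real VehicleData) to show both. First, a reassigned T silently turns na.rm off. Second, tapply fills any Type/Drive combination that has no rows at all with NA, and na.rm cannot change that because there is nothing to average. Whether either applies to the asker's data is an assumption, since the data is not shown.

```r
# Toy stand-in for VehicleData (hypothetical values, for illustration only)
toy <- data.frame(
  Type    = factor(c("Car", "Car", "Car", "Van", "Van")),
  Drive   = factor(c("Front", "Rear", "Rear", "Rear", "Rear")),
  HWY_MPG = c(30, 24, NA, 18, 19)
)

# 1) T is just a variable: once overwritten, na.rm = T no longer means TRUE
T <- 0
with_T <- mean(c(1, NA), na.rm = T)        # NA, because na.rm is effectively FALSE
rm(T)                                      # remove the override; base T is visible again
with_TRUE <- mean(c(1, NA), na.rm = TRUE)  # 1

# 2) Group means: na.rm = TRUE drops the missing value inside Car/Rear,
#    but Van/Front has no rows, so that cell stays NA
means <- tapply(toy$HWY_MPG, list(toy$Type, toy$Drive), mean, na.rm = TRUE)
print(means)

# Cell counts: a zero marks a combination that never occurs in the data
counts <- table(toy$Type, toy$Drive)
print(counts)
```

Running table(VehicleData$Type, VehicleData$Drive) on the real data would show whether the NA cells line up with zero counts.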
Upvotes: 0