MryElln

Reputation: 41

How to find the mean of a continuous variable for each categorical variable

I'd like to plot continuous BMI on the y-axis and the categorical variable Family Income on the x-axis, with the graph showing the mean BMI for each category. However, I am not sure how to compute the mean BMI for each level of Family Income.

Dataset nh (5994 IDs with observations; part of the 2009-2010 NHANES dataset):
> dput(head(nh))
structure(list(SeqN = c(51624L, 51628L, 51629L, 51630L, 51633L, 51635L), 
Gender = c(1L, 2L, 1L, 2L, 1L, 1L), Age = c(34L, 60L, 26L, 49L, 80L, 80L), 
Ethnicity = c(3L, 4L, 1L, 3L, 3L, 3L), FamSize = c(4L, 2L, 5L, 3L, 2L, 1L),
RatioIncomePoverty = c(1.36, 0.69, 1.01, 1.91, 1.27, 1.69), 
MECWgt2 = c(81528.77201, 21000.33872, 22633.58187, 74112.48684, 12381.11532, 22502.50666),
BMI = c(32.22, 42.39, 32.61, 30.57, 26.04, 27.62), 
LengthUS = c(NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_,NA_integer_),
Education = c(3L, 3L, 2L, 4L, 4L, 2L), LocationBorn = c(1L, 1L, 1L, 1L, 1L, 1L), 
FamIncome = c(6L, 3L, 6L, 7L, 4L, 4L)), .Names = c("SeqN", 
"Gender", "Age", "Ethnicity", "FamSize", "RatioIncomePoverty", 
"MECWgt2", "BMI", "LengthUS", "Education", "LocationBorn", "FamIncome"), 
row.names = c(NA, 6L), class = "data.frame")

faminc <- as.character(nh$FamIncome)
faminc

Any suggestions as to how to model the data to achieve this goal would be appreciated.

Upvotes: 0

Views: 3480

Answers (2)

Matthew Lundberg

Reputation: 42629

Here's a base solution using aggregate:

a <- aggregate(BMI ~ FamIncome, data=nh, FUN=mean)
barplot(a$BMI, names.arg=a$FamIncome)
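
To see what this produces, here is a minimal self-contained sketch that rebuilds only the two relevant columns from the six rows in the question's dput output. The `tapply` line is an alternative I'm adding for comparison, not part of the original answer; it returns the same means as a named vector, which `barplot` can label directly.

```r
# First six rows from the question's dput output (only the columns needed here)
nh <- data.frame(
  FamIncome = c(6L, 3L, 6L, 7L, 4L, 4L),
  BMI       = c(32.22, 42.39, 32.61, 30.57, 26.04, 27.62)
)

# One row per income category, with the mean BMI in the BMI column
a <- aggregate(BMI ~ FamIncome, data = nh, FUN = mean)

# Same means as a vector named by income category
m <- tapply(nh$BMI, nh$FamIncome, mean)

# A named vector supplies its own bar labels
barplot(m, xlab = "Family income category", ylab = "Mean BMI")
```

With only six rows the means are not meaningful; on the full `nh` data frame the same calls should apply unchanged.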


Upvotes: 2

Metrics

Reputation: 15458

This may work:

    library(plyr)
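    # column names are case-sensitive: FamIncome and BMI, as in the dput output
    nhh <- ddply(nh, .(FamIncome), summarise, mean.bmi = mean(BMI)) # mean BMI per income category
    with(nhh, plot(FamIncome, mean.bmi)) # simple plot of the group means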
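
One caveat, assuming the full dataset has missing BMI values (the six rows shown have none): `mean()` returns `NA` for any group that contains an `NA` unless `na.rm = TRUE` is passed. A small base-R illustration, with one value deliberately set to `NA` for the example:

```r
# Illustrative values only: the question's six BMI values with one made missing
bmi <- c(32.22, 42.39, 32.61, 30.57, NA, 27.62)
inc <- c(6L, 3L, 6L, 7L, 4L, 4L)

# Without na.rm, the mean for category 4 is NA
with_na <- tapply(bmi, inc, mean)

# With na.rm = TRUE, category 4 uses the one observed value
without_na <- tapply(bmi, inc, mean, na.rm = TRUE)
```

In the `ddply` call the equivalent would be `mean.bmi = mean(BMI, na.rm = TRUE)`.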

Upvotes: 2
