Reputation: 317
I have two variables in the following formats:
desc waiting_days

              storage   display    value
variable name   type    format     label      variable label
------------------------------------------------------------
waiting_days    float   %9.0g
And
desc category

              storage   display    value
variable name   type    format     label      variable label
------------------------------------------------------------
category~y      str11   %11s                  Categories
waiting_days is numeric, while category is a string variable with 11 unique values (Category 1, Category 2, etc.).
I am trying to create a new variable, average_waiting_days, that holds the mean of waiting_days within each category, repeated on every row of that category. The layout I am after looks like this (the averages shown are only illustrative):
waiting_days   category     average_waiting_days
319            category 2   100 days
8763           category 6   85 days
7455           category 3   300 days
464            category 6   85 days
900            category 3   300 days
500            category 3   300 days
Upvotes: 0
Views: 350
Reputation: 3261
Here are some options to calculate means by category.
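// Option 1: create a new variable holding the mean of waiting_days within each category
egen avg_waiting_days = mean(waiting_days), by(category)
// Option 2: display a table of means by category (syntax for Stata 16 and earlier)
table category, c(mean waiting_days)
// In Stata 17 or later, the table command was rewritten; the equivalent is:
// table category, statistic(mean waiting_days)
// Option 3: estimate means with standard errors
// encode converts the string variable to a labeled numeric variable for use in over()
encode category, gen(category_num)
mean waiting_days, over(category_num)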
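To see what the first option produces, here is a minimal sketch that types in the six example rows from the question and computes the group means. It assumes the variable is simply called category and uses the by-prefix form of egen, which should give the same result as the by() option; the averages in the comments follow from these six rows only, not from the full dataset.

```stata
clear
input float waiting_days str11 category
319  "category 2"
8763 "category 6"
7455 "category 3"
464  "category 6"
900  "category 3"
500  "category 3"
end

// Sort by category and store the within-category mean on every row
bysort category: egen average_waiting_days = mean(waiting_days)

// Show the result, with a separator line between categories
list, sepby(category)

// Means for these six rows:
//   category 2: 319
//   category 3: (7455 + 900 + 500) / 3 = 2951.667
//   category 6: (8763 + 464) / 2 = 4613.5
```

Because egen writes the mean to every observation in the group, the new variable can be used directly in later calculations, for example to compute each row's deviation from its category mean.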
Upvotes: 2