maldini425

Reputation: 317

Mean of a numeric variable across sub-groups of a string variable

I have two variables in the following formats:

desc waiting_days

              storage   display    value
variable name   type    format     label      variable label
--------------------------------------------------------------------------------------------------------------------------------------
waiting_days    float   %9.0g     

And

 desc category

              storage   display    value
variable name   type    format     label      variable label
--------------------------------------------------------------------------------------------------------------------------------------
category~y str11   %11s                  Categories

waiting_days is numeric, while category is a string variable containing 11 unique sub-groups with values such as Category 1, Category 2, etc.

I am trying to create a new variable, average_waiting_days, that holds the mean of waiting_days within each category of the string variable, along these lines:

waiting_days    category       average_waiting_days 
     
319             category 2         100 days
8763            category 6         85 days
7455            category 3         300 days
464             category 6         85 days
900             category 3         300 days
500             category 3         300 days

Upvotes: 0

Views: 350

Answers (1)

Wouter

Reputation: 3261

Here are some options to calculate means by category.

// Create a new variable with means by category
egen avg_waiting_days = mean(waiting_days), by(category)

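// Create a table of means (syntax for Stata 16 and earlier)
table category, c(mean waiting_days)
// In Stata 17 and later, table was redesigned; the equivalent is:
// table category, statistic(mean waiting_days)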

// Calculate means with standard errors
// (over() requires a numeric variable, so encode the string first)
encode category, gen(category_num)
mean waiting_days, over(category_num)
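
To try the first option end to end, here is a minimal sketch on toy data typed in from the rows shown in the question. The names mean_days and n are made up for illustration, and the collapse step is an additional variant not listed above, for when one row per category is wanted instead of a new column.

```stata
* Toy data mirroring the layout in the question (illustrative values only)
clear
input float waiting_days str11 category
319  "category 2"
8763 "category 6"
7455 "category 3"
464  "category 6"
900  "category 3"
500  "category 3"
end

* Attach the group mean to every row of the original data
egen avg_waiting_days = mean(waiting_days), by(category)
list waiting_days category avg_waiting_days, noobs

* Variant: reduce to one row per category with the mean and the count
* (mean_days and n are hypothetical names chosen for this sketch)
preserve
collapse (mean) mean_days = waiting_days (count) n = waiting_days, by(category)
list, noobs
restore
```

The means computed here come from the six typed-in rows, so they will not match the placeholder figures in the question's average_waiting_days column. preserve and restore keep the original data in memory after the collapsed summary has been listed.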

Upvotes: 2
