Exodia16

Reputation: 177

Percentage of categories both 100% in Stata?

I am trying to create a graph showing the percentage of each age group infected with a disease (that is, with a mean egg count > 0 in the variable eggs10), and then dividing the infected into those that are lightly infected and those that are heavily infected.

I have a variable for the mean egg count in their urine: eggs10

I have a variable for who I am looking at: gender

I have a variable for intensity (light or heavy): intense

When I type in:

gr bar (count) eggs10, stack asyvars over(intense) by(gender) percent

it gives me 100% for both males and females as their prevalence of infection! How do I get the percentage of those with eggs10 > 0 on the y axis?

Upvotes: 1

Views: 7290

Answers (1)

Nick Cox

Reputation: 37208

The effect of by() is to treat genders separately. That is, Stata's view is that you asked for percentages to be calculated separately for each gender, so the stacked bars in each panel necessarily add up to 100%. You may need over(gender) here.
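A different route, not the one taken in this answer, sidesteps the percent option altogether: build an indicator for infection and plot its mean. This is only a minimal sketch. It assumes the question's variables eggs10 and gender exist as described, and the variable name infected is hypothetical.

```stata
* Sketch only: assumes the question's variables eggs10 and gender exist.
* infected is a hypothetical helper variable, not part of the original data.
* It is 100 if eggs10 > 0, 0 otherwise, and missing if eggs10 is missing.
generate infected = 100 * (eggs10 > 0) if !missing(eggs10)

* The mean of a 0/100 indicator is the percentage infected in each group.
graph bar (mean) infected, over(gender) ytitle("Percent with eggs10 > 0")
```

Because the indicator is left missing wherever eggs10 is missing, those observations drop out of the denominator rather than being counted as uninfected.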

(UPDATE) The sample data are rather different from the original example.

. input str1 child  Meanegg str1 gender str5 intensity 

child    Meanegg     gender  intensity
1. a  0  M  None 
2. b  55 F  Heavy 
3. c  47 F  Light
4. end 

. encode gender, gen(Gender)
. encode intensity, gen(Intensity)

Consider catplot, which you can download using

. ssc inst catplot 

Try something like

. catplot Intensity Gender, asyvars percent(Gender)  stack recast(bar)

(SECOND UPDATE) It is important to realise what graph bar (count) does. Check out these examples:

. sysuse auto
. graph bar (count) mpg
. graph bar (count) mpg , over(foreign)

Here is the graph from the second command. (count) here counts the number of observations with non-missing values. This is explained in the help. However, that is rarely what people want here: more commonly, people want counts of the distinct categories of a variable. That can be done with graph bar, but it is easier with catplot (SSC).

To spell it out for the example graph: the graph tells you that there are 52 non-missing values of mpg for domestic cars and 22 for foreign cars. The graph says nothing about what the values of mpg actually are.

You could say: But the graph is showing you the frequencies of the distinct categories of foreign. Yes; but only in so far as there is a non-missing value of mpg for each non-missing value of foreign.

[Graph: output of graph bar (count) mpg, over(foreign)]
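The point about non-missing values can be checked with a quick tabulation. The sketch below uses only the auto dataset shipped with Stata; rep78 is chosen here as an illustration because it has some missing values, whereas mpg has none.

```stata
sysuse auto, clear

* Non-missing values of mpg within each category of foreign:
* these frequencies are the bar heights in graph bar (count) mpg, over(foreign)
tabulate foreign if !missing(mpg)

* rep78 has some missing values, so its bars count fewer observations
tabulate foreign if !missing(rep78)
graph bar (count) rep78, over(foreign)
```

Comparing the two tables shows that (count) depends on which variable is counted, not only on the categories of foreign.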

(THIRD UPDATE) (in response to comment August 26) Study the following:

. clear

. input var1 str3 var2

        var1       var2
       1. 44 "Yes"
       2. 36 "No"
       3. end

. graph bar (asis) var1, over(var2)

. graph bar (asis) var1, over(var2) percent

. graph bar (asis) var1, over(var2) percent asyvars bargap(20)
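As a hedged check on what the last command plots, the percentages can be worked out by hand from the same two observations. This is a sketch; the variable names var1_total and pct are hypothetical helpers, not part of the original example.

```stata
clear
input var1 str3 var2
44 "Yes"
36 "No"
end

* Share of each value in the overall total of var1
egen var1_total = total(var1)
generate pct = 100 * var1 / var1_total

list var2 var1 pct
```

Here 44 and 36 sum to 80, so pct is 55 for "Yes" and 45 for "No". If the reading of the help is right, these should be the bar heights once asyvars is combined with percent.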

Upvotes: 2
