Reputation: 529
Say I want to create some scalar values such as median price/median income or mean downpayment/house price. I know I can first use the su command, then extract the denominators and numerators separately from the r-class results, and then create the desired scalars.
However, when I have a dozen such scalars, each computed for different household types, that approach is tedious in practice. Is there a way to do this more efficiently? If I could create a table containing these scalars within Stata, that would be even better.
Upvotes: 1
Views: 146
Reputation: 37233
Executive summary: don't use scalars; use variables instead.
There is a prior statistical issue, which is that (say) summary(y) / summary(x) is not necessarily equal to summary(y/x); in general, the two will differ. It seems to me that the latter usually makes more sense, but set that aside otherwise.
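To make the distinction concrete, here is a minimal sketch using the auto dataset that puts both quantities side by side. The variable names mean_price, mean_weight, ratio_of_means and mean_of_ratios are made up for illustration.

```stata
sysuse auto, clear

* summary(y) / summary(x): ratio of the two overall means
egen mean_price  = mean(price)
egen mean_weight = mean(weight)
generate ratio_of_means = mean_price / mean_weight

* summary(y/x): mean of the car-by-car ratios
egen mean_of_ratios = mean(price / weight)

* both variables are constant across observations, so one row is enough
list ratio_of_means mean_of_ratios in 1
```

The two numbers will generally not agree, so it is worth deciding which one is wanted before building a table of them.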
Here is one not too crazy example. How much do you have to pay (in US dollars circa 1978) per pound weight (physicists: mass, really) for various cars in the Stata auto dataset?
. sysuse auto
(1978 Automobile Data)
. gen pricePERlb = price/weight
. egen mean = mean(pricePERlb), by(rep78)
. tabstat mean, s(n mean) by(rep78)
Summary for variables: mean
     by categories of: rep78 (Repair Record 1978)

   rep78 |         N      mean
---------+--------------------
       1 |         2  1.479266
       2 |         8  1.731407
       3 |        30  1.895855
       4 |        18   2.25233
       5 |        11  2.472519
---------+--------------------
   Total |        69  2.049639
------------------------------
Now here's a small twist. The generate step wasn't needed here. We could have gone straight to
egen mean = mean(price/weight), by(rep78)
The tools are all trivial: generate to create new variables, egen to create new variables that here can be summary statistics calculated for groups, and tabstat, among many other tabulation commands, to show results. Since the statistics here are by construction constant within groups, asking for their mean is just one of several ways of getting at them. Similarly, graph dot, graph hbar, etc. are immediate for display.
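If a ratio of two group summaries is what is wanted after all (as in the question's median price/median income), the same tools apply. This is a sketch only, with weight standing in for income because the auto dataset has no income variable; med_price, med_weight and ratio_med are hypothetical names.

```stata
sysuse auto, clear

* group medians stored as variables, constant within each rep78 group
egen med_price  = median(price),  by(rep78)
egen med_weight = median(weight), by(rep78)

* ratio of the two group medians
generate ratio_med = med_price / med_weight

* one row per group; the mean of a within-group constant is that constant
tabstat ratio_med, statistics(n mean) by(rep78)

* the same numbers as a dot chart
graph dot (mean) ratio_med, over(rep78)
```

Adding more ratios means adding more egen and generate lines and then listing all the resulting variables in a single tabstat call.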
Upvotes: 2