Tran Tam
Tran Tam

Reputation: 699

Stata sum by group

I got data like this:

tinh   huyen   xa   diaban   hoso   fee1   fee2   income
1      1       4    12        13     100    200
1      1       4    12        13     120    150
1      1       5    11        14     130    140
1      1       5    11        14     100    100
1      1       6    14        20     130    140

So I'd like to sum fee1 and fee2 by (tinh, huyen, xa, diaban, hoso) to income column.

Then result supposes like:

tinh   huyen   xa   diaban   hoso   fee1   fee2   income
1      1       4    12        13     100    200   570(100 + 200 + 120 + 150)
1      1       4    12        13     120    150   570(100 + 200 + 120 + 150)
1      1       5    11        14     130    140   470(130 + 140 + 100 + 100)
1      1       5    11        14     100    100   470(130 + 140 + 100 + 100)
1      1       6    14        20     130    140   270(130 + 140)

Please help me for this expression. Any solution will be appreciated. Thanks.

Upvotes: 1

Views: 25961

Answers (1)

Brendan
Brendan

Reputation: 4011

I'm sure someone else will have a cleverer solution, but you can combine egen's rowtotal and total functions to achieve this.

clear
input tinh   huyen   xa   diaban   hoso   fee1   fee2  
1      1       4    12        13     100    200
1      1       4    12        13     120    150
1      1       5    11        14     130    140
1      1       5    11        14     100    100
1      1       6    14        20     130    140
end

egen sum1 = rowtotal(fee1 fee2)
egen sum2 = total(sum1), by(tinh huyen xa diaban)

li

yielding

     +---------------------------------------------------------------+
     | tinh   huyen   xa   diaban   hoso   fee1   fee2   sum1   sum2 |
     |---------------------------------------------------------------|
  1. |    1       1    4       12     13    100    200    300    570 |
  2. |    1       1    4       12     13    120    150    270    570 |
  3. |    1       1    5       11     14    130    140    270    470 |
  4. |    1       1    5       11     14    100    100    200    470 |
  5. |    1       1    6       14     20    130    140    270    270 |
     +---------------------------------------------------------------+

Note that (per the comments) you can use egen income = total(fee1 + fee2), by(tinh huyen xa diaban). If you have missing values for fee1 or fee2 you'll have to account for those in the fee1 + fee2 expression if you use this syntax.

Upvotes: 3

Related Questions