Immo

Reputation: 111

In Stata, how to create groups such that each group has the same total of another variable?

I have a panel dataset and want to create groups from the data.

egen AUX = cut(variable), group(5)  

This creates groups with (almost) the same number of units in each. What I would like instead is to have groups such that the total of another variable is the same in every group. For example, I want to group households so that every bin has the same total income.

How can I set up such a command?

Upvotes: 0

Views: 1135

Answers (1)

Nick Cox

Reputation: 37278

There is no data example here, and it's not clear how the panel structure enters. For example, do you want to pool households across all years, or take years separately?

Either way, the technique is to split according to fractions of the cumulative sum. A detail is that identical values should be assigned to the same bin. Because observations cannot be split, the bin totals will usually be only approximately equal.

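sysuse auto, clear

* cumulative sum of price within each group, in ascending order of price
bysort foreign (price) : gen runningsum = sum(price)

* same values belong together: ties all take the largest running sum among them
bysort foreign price (runningsum) : replace runningsum = runningsum[_N]

* bin by the fraction of the group total reached so far
by foreign : gen quintile = ceil(5 * runningsum/runningsum[_N])

* check: total of price within each bin
bysort foreign quintile : egen qtotal = total(price)
list price qtotal quintile if foreign, sepby(quintile)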


     +----------------------------+
     |  price   qtotal   quintile |
     |----------------------------|
 53. |  3,748    24231          1 |
 54. |  3,798    24231          1 |
 55. |  3,895    24231          1 |
 56. |  3,995    24231          1 |
 57. |  4,296    24231          1 |
 58. |  4,499    24231          1 |
     |----------------------------|
 59. |  4,589    31280          2 |
 60. |  4,697    31280          2 |
 61. |  5,079    31280          2 |
 62. |  5,397    31280          2 |
 63. |  5,719    31280          2 |
 64. |  5,799    31280          2 |
     |----------------------------|
 65. |  5,899    25273          3 |
 66. |  6,229    25273          3 |
 67. |  6,295    25273          3 |
 68. |  6,850    25273          3 |
     |----------------------------|
 69. |  7,140    24959          4 |
 70. |  8,129    24959          4 |
 71. |  9,690    24959          4 |
     |----------------------------|
 72. |  9,735    34720          5 |
 73. | 11,995    34720          5 |
 74. | 12,990    34720          5 |
     +----------------------------+
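
For the panel case with years taken separately, the same steps might look like the sketch below. The question gives no data, so this builds a toy dataset; the variable names year and income are assumptions, as is the choice of 5 bins. It also assumes income is positive and never missing, since negative or missing values would need separate handling.

```stata
* Toy panel; year and income are hypothetical names, not from the question
clear
set seed 12345
set obs 200
gen year = 2000 + mod(_n, 4)
gen double income = exp(rnormal(10, 1))

* years separately: same steps, with year as the group variable
bysort year (income) : gen double runningsum = sum(income)

* same values belong together
bysort year income (runningsum) : replace runningsum = runningsum[_N]
by year : gen bin = ceil(5 * runningsum/runningsum[_N])

* check: total income by year and bin
bysort year bin : egen double bintotal = total(income)
tabdisp year bin, cellvar(bintotal)
```

To pool all years instead, drop year from the by-prefixes: sort on income alone for the running sum, and use by income (runningsum) for the tie step.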

Upvotes: 1
