Reputation: 133

Counting number of new doctors in each quarter for each person

I have a quarterly dataset (over three years) at the person level with doctor name and quarter. Most people have multiple doctors each quarter. I would like to create a new variable that keeps track of the new doctors in each quarter (starting at the 5th quarter) where a new doctor is defined as not having appeared in the previous 4 quarters. So if this is what the input data looks like for one person:

| ID | Quarter | Doctor_Name |
|----|---------|-------------|
| 1  |    1    | Dr. Smith   |
| 1  |    1    | Dr. Brown   |
| 1  |    1    | Dr. Smith   |
| 1  |    2    | Dr. Patel   |
| 1  |    2    | Dr. Garcia  |
| 1  |    2    | Dr. Garcia  |
| 1  |    3    | Dr. Kim     |
| 1  |    3    | Dr. Brown   |
| 1  |    3    | Dr. Patel   |
| 1  |    4    | Dr. Patel   |
| 1  |    4    | Dr. Craig   |
| 1  |    4    | Dr. Craig   |
| 1  |    5    | Dr. Brown   |
| 1  |    5    | Dr. Lee     |
| 1  |    5    | Dr. Kim     |
| 1  |    6    | Dr. Patel   |
| 1  |    6    | Dr. Smith   |
| 1  |    6    | Dr. Smith   |
| 1  |    7    | Dr. Garcia  |
| 1  |    7    | Dr. Smith   |
| 1  |    7    | Dr. Kim     |
| 1  |    8    | Dr. Lee     |
| 1  |    8    | Dr. Brown   |
| 1  |    8    | Dr. Smith   |

This is what the output should look like:

| ID | Quarter | Doctor_Name | New_Doctor    |
|----|---------|-------------|---------------|
| 1  |    1    | Dr. Smith   |               |
| 1  |    1    | Dr. Brown   |               |
| 1  |    1    | Dr. Smith   |               |
| 1  |    2    | Dr. Patel   |               |
| 1  |    2    | Dr. Garcia  |               |
| 1  |    2    | Dr. Garcia  |               |
| 1  |    3    | Dr. Kim     |               |
| 1  |    3    | Dr. Brown   |               |
| 1  |    3    | Dr. Patel   |               |
| 1  |    4    | Dr. Patel   |               |
| 1  |    4    | Dr. Craig   |               |
| 1  |    4    | Dr. Craig   |               |
| 1  |    5    | Dr. Brown   |       0       |
| 1  |    5    | Dr. Lee     |       1       |
| 1  |    5    | Dr. Kim     |       0       |
| 1  |    6    | Dr. Patel   |       0       |
| 1  |    6    | Dr. Smith   |       1       |
| 1  |    6    | Dr. Smith   |       1       |
| 1  |    7    | Dr. Garcia  |       1       |
| 1  |    7    | Dr. Smith   |       0       |
| 1  |    7    | Dr. Kim     |       0       |
| 1  |    8    | Dr. Lee     |       0       |
| 1  |    8    | Dr. Brown   |       0       |
| 1  |    8    | Dr. Smith   |       0       |

How can I create this new variable assuming that the dataset has multiple IDs? Thanks!

Upvotes: -1

Answers (1)

Nick Cox

Reputation: 37278

Please read the Stata tag wiki. It is a good idea to use dataex for data examples and at least to try some code. (Those are probably why you got down-voted.)

Here is some technique.

* Example generated by -dataex-. For more info, type help dataex
clear
input byte(id quarter) str13 doctor_name
1 1 "Dr. Smith" 
1 1 "Dr. Brown" 
1 1 "Dr. Smith" 
1 2 "Dr. Patel" 
1 2 "Dr. Garcia"
1 2 "Dr. Garcia"
1 3 "Dr. Kim"   
1 3 "Dr. Brown" 
1 3 "Dr. Patel" 
1 4 "Dr. Patel" 
1 4 "Dr. Craig" 
1 4 "Dr. Craig" 
1 5 "Dr. Brown" 
1 5 "Dr. Lee"   
1 5 "Dr. Kim"   
1 6 "Dr. Patel" 
1 6 "Dr. Smith" 
1 6 "Dr. Smith" 
1 7 "Dr. Garcia"
1 7 "Dr. Smith" 
1 7 "Dr. Kim"   
1 8 "Dr. Lee"   
1 8 "Dr. Brown" 
1 8 "Dr. Smith" 
end

bysort id doctor_name (quarter) : gen new1 = _n == 1 
by id doctor_name : replace new1 = 1 if new1[_n-1] == 1 & quarter == quarter[_n-1]

gen new2 = new1 if quarter > 4 

sort id quarter doctor_name 

list, sepby(id quarter)

    +-----------------------------------------+
     | id   quarter   doctor_n~e   new1   new2 |
     |-----------------------------------------|
  1. |  1         1    Dr. Brown      1      . |
  2. |  1         1    Dr. Smith      1      . |
  3. |  1         1    Dr. Smith      1      . |
     |-----------------------------------------|
  4. |  1         2   Dr. Garcia      1      . |
  5. |  1         2   Dr. Garcia      1      . |
  6. |  1         2    Dr. Patel      1      . |
     |-----------------------------------------|
  7. |  1         3    Dr. Brown      0      . |
  8. |  1         3      Dr. Kim      1      . |
  9. |  1         3    Dr. Patel      0      . |
     |-----------------------------------------|
 10. |  1         4    Dr. Craig      1      . |
 11. |  1         4    Dr. Craig      1      . |
 12. |  1         4    Dr. Patel      0      . |
     |-----------------------------------------|
 13. |  1         5    Dr. Brown      0      0 |
 14. |  1         5      Dr. Kim      0      0 |
 15. |  1         5      Dr. Lee      1      1 |
     |-----------------------------------------|
 16. |  1         6    Dr. Patel      0      0 |
 17. |  1         6    Dr. Smith      0      0 |
 18. |  1         6    Dr. Smith      0      0 |
     |-----------------------------------------|
 19. |  1         7   Dr. Garcia      0      0 |
 20. |  1         7      Dr. Kim      0      0 |
 21. |  1         7    Dr. Smith      0      0 |
     |-----------------------------------------|
 22. |  1         8    Dr. Brown      0      0 |
 23. |  1         8      Dr. Lee      0      0 |
 24. |  1         8    Dr. Smith      0      0 |
     +-----------------------------------------+

EDIT

Your desired output is just a (0, 1) variable, but the question hints at counting new doctors each quarter. To do that, each doctor must be counted just once the first time seen.

* Example generated by -dataex-. For more info, type help dataex
clear
input byte(id quarter) str13 doctor_name
1 1 "Dr. Smith" 
1 1 "Dr. Brown" 
1 1 "Dr. Smith" 
1 2 "Dr. Patel" 
1 2 "Dr. Garcia"
1 2 "Dr. Garcia"
1 3 "Dr. Kim"   
1 3 "Dr. Brown" 
1 3 "Dr. Patel" 
1 4 "Dr. Patel" 
1 4 "Dr. Craig" 
1 4 "Dr. Craig" 
1 5 "Dr. Brown" 
1 5 "Dr. Lee"   
1 5 "Dr. Kim"   
1 6 "Dr. Patel" 
1 6 "Dr. Smith" 
1 6 "Dr. Smith" 
1 7 "Dr. Garcia"
1 7 "Dr. Smith" 
1 7 "Dr. Kim"   
1 8 "Dr. Lee"   
1 8 "Dr. Brown" 
1 8 "Dr. Smith" 
end

bysort id doctor_name (quarter) : gen new3 = _n == 1 

egen new_total = total(new3), by(id quarter)

tabdisp quarter id, c(new_total)

----------------
          |  id 
  quarter |    1
----------+-----
        1 |    2
        2 |    2
        3 |    1
        4 |    1
        5 |    1
        6 |    0
        7 |    0
        8 |    0
----------------

EDIT 2

For your problem (thanks for comments) it seems simplest just to keep track of the previous quarter the doctor appeared.

* Example generated by -dataex-. For more info, type help dataex
clear
input byte(id quarter) str13 doctor_name
1 1 "Dr. Smith" 
1 1 "Dr. Brown" 
1 1 "Dr. Smith" 
1 2 "Dr. Patel" 
1 2 "Dr. Garcia"
1 2 "Dr. Garcia"
1 3 "Dr. Kim"   
1 3 "Dr. Brown" 
1 3 "Dr. Patel" 
1 4 "Dr. Patel" 
1 4 "Dr. Craig" 
1 4 "Dr. Craig" 
1 5 "Dr. Brown" 
1 5 "Dr. Lee"   
1 5 "Dr. Kim"   
1 6 "Dr. Patel" 
1 6 "Dr. Smith" 
1 6 "Dr. Smith" 
1 7 "Dr. Garcia"
1 7 "Dr. Smith" 
1 7 "Dr. Kim"   
1 8 "Dr. Lee"   
1 8 "Dr. Brown" 
1 8 "Dr. Smith" 
end

bysort id doctor_name (quarter) : gen prev = quarter[_n-1] if quarter != quarter[_n-1]
by id doctor_name : replace prev = prev[_n-1] if missing(prev)

list, sepby(id doctor_name)

gen new = quarter > prev + 4 

sort id quarter doctor_name 

list , sepby(id quarter)

Listings

. list, sepby(id doctor_name)

     +----------------------------------+
     | id   quarter   doctor_n~e   prev |
     |----------------------------------|
  1. |  1         1    Dr. Brown      . |
  2. |  1         3    Dr. Brown      1 |
  3. |  1         5    Dr. Brown      3 |
  4. |  1         8    Dr. Brown      5 |
     |----------------------------------|
  5. |  1         4    Dr. Craig      . |
  6. |  1         4    Dr. Craig      . |
     |----------------------------------|
  7. |  1         2   Dr. Garcia      . |
  8. |  1         2   Dr. Garcia      . |
  9. |  1         7   Dr. Garcia      2 |
     |----------------------------------|
 10. |  1         3      Dr. Kim      . |
 11. |  1         5      Dr. Kim      3 |
 12. |  1         7      Dr. Kim      5 |
     |----------------------------------|
 13. |  1         5      Dr. Lee      . |
 14. |  1         8      Dr. Lee      5 |
     |----------------------------------|
 15. |  1         2    Dr. Patel      . |
 16. |  1         3    Dr. Patel      2 |
 17. |  1         4    Dr. Patel      3 |
 18. |  1         6    Dr. Patel      4 |
     |----------------------------------|
 19. |  1         1    Dr. Smith      . |
 20. |  1         1    Dr. Smith      . |
 21. |  1         6    Dr. Smith      1 |
 22. |  1         6    Dr. Smith      1 |
 23. |  1         7    Dr. Smith      6 |
 24. |  1         8    Dr. Smith      7 |
     +----------------------------------+

. 
. gen new = quarter > prev + 4 

. 
. sort id quarter doctor_name 

. 
. list , sepby(id quarter)

     +----------------------------------------+
     | id   quarter   doctor_n~e   prev   new |
     |----------------------------------------|
  1. |  1         1    Dr. Brown      .     0 |
  2. |  1         1    Dr. Smith      .     0 |
  3. |  1         1    Dr. Smith      .     0 |
     |----------------------------------------|
  4. |  1         2   Dr. Garcia      .     0 |
  5. |  1         2   Dr. Garcia      .     0 |
  6. |  1         2    Dr. Patel      .     0 |
     |----------------------------------------|
  7. |  1         3    Dr. Brown      1     0 |
  8. |  1         3      Dr. Kim      .     0 |
  9. |  1         3    Dr. Patel      2     0 |
     |----------------------------------------|
 10. |  1         4    Dr. Craig      .     0 |
 11. |  1         4    Dr. Craig      .     0 |
 12. |  1         4    Dr. Patel      3     0 |
     |----------------------------------------|
 13. |  1         5    Dr. Brown      3     0 |
 14. |  1         5      Dr. Kim      3     0 |
 15. |  1         5      Dr. Lee      .     0 |
     |----------------------------------------|
 16. |  1         6    Dr. Patel      4     0 |
 17. |  1         6    Dr. Smith      1     1 |
 18. |  1         6    Dr. Smith      1     1 |
     |----------------------------------------|
 19. |  1         7   Dr. Garcia      2     1 |
 20. |  1         7      Dr. Kim      5     0 |
 21. |  1         7    Dr. Smith      6     0 |
     |----------------------------------------|
 22. |  1         8    Dr. Brown      5     0 |
 23. |  1         8      Dr. Lee      5     0 |
 24. |  1         8    Dr. Smith      7     0 |
     +----------------------------------------+

Upvotes: 0

Counting number of new doctors in each quarter for each person

Answers (1)

Related Questions