Reputation: 1
I have a Steam dataset that includes individual Steam users, their overall playtimes, and the games they played. I further divided the players into hardcore (=1) and casual (=0) players. Overall, I want to test how various factors influence the players' overall playtime, but for now I want to build two regressions, one for hardcore players and one for casual players (because I think the effect of each factor can differ between the two). To do that, I need the sum of the overall playtime for each of the two subgroups. I tried egen playtime_type = sum(playtime_sum), by (hightype), but the outcome just doesn't make sense. How can I aggregate the sum of playtime separately for each subgroup?
Here is an example from the dataset:
steamid playtime_sum hightype
76561197960265729 0 0
76561197960265730 45 0
76561197960265730 45 0
76561197960265730 45 0
76561197960265733 1710 0
76561197960265733 1710 0
76561197960265733 1710 0
76561197960265733 1710 0
76561197960265733 1710 0
76561197960265738 11 0
76561197960265738 11 0
76561197960265738 11 0
Upvotes: 0
Views: 1887
Reputation: 3255
What makes sense and what does not is highly subjective, so it is always better to explain what the output was and what you had expected instead.
My guess is that you want to use total() instead of sum().
egen playtime_type = total(playtime_sum), by(hightype)
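If the total still looks too large, one possible cause, judging from the example data, is that each steamid appears on several rows with the same playtime_sum, so a plain group total counts each player several times. Below is a minimal sketch that counts each player once. It assumes playtime_sum is constant within a steamid and that steamid is stored as a string (a 17-digit ID cannot be held exactly in a numeric variable); the variable name one_per_player is made up for this illustration.

```stata
* Rebuild the example rows from the question; steamid is read as a string
clear
input str17 steamid playtime_sum hightype
"76561197960265729" 0 0
"76561197960265730" 45 0
"76561197960265730" 45 0
"76561197960265730" 45 0
"76561197960265733" 1710 0
"76561197960265733" 1710 0
"76561197960265733" 1710 0
"76561197960265733" 1710 0
"76561197960265733" 1710 0
"76561197960265738" 11 0
"76561197960265738" 11 0
"76561197960265738" 11 0
end

* Flag exactly one observation per player (1 on that row, 0 on the others)
egen byte one_per_player = tag(steamid)

* Total playtime within each group, with every player contributing once
egen playtime_type = total(playtime_sum * one_per_player), by(hightype)

list steamid playtime_sum hightype playtime_type
```

With the twelve example rows this should give 1766 (0 + 45 + 1710 + 11) for hightype 0, whereas totalling every row would give 8718.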
Upvotes: 1