Reputation: 289
I have obtained a list of projects that in total generate zero revenue (total revenue over a period of time)
tabstat revenue, by(project) stat(sum)
I have identified 261 projects (out of 1000s) that generate zero revenue for the whole period of time.
Now, want to look at the total value of a specific variable that can be tracked over multiple periods for each project in these zero-revenue-generating projects. I know that I can go after each campaign by typing
tabstat variable_of_interest if project==127, stat(sum)
Again, here project 127 generated zero revenue.
Is there a way to merge these two codes so that I can generate a table with the following logic
generate total sum of the variable_of_interest if project's stat(sum) was equal to zero
?
here is a data sample
project revenue var_of_intr
1 0 5
1 0 8
1 2 10
1 0 5
2 0 5
2 0 90
2 0 2
2 0 0
3 0 76
3 0 5
3 0 23
3 0 4
4 0 75
4 8 2
4 0 9
4 0 6
5 0 88
5 0 20
5 0 9
5 0 14
Since projects 1 and 4 generated revenue>0, the code should ignore then when summing up the variable of interest by campaign, thus, the table I am interested in should look like this
project var_of_intr
2 97
3 108
5 131
Upvotes: 1
Views: 51
Reputation: 11112
You can use collapse
:
clear
set more off
*----- example data -----
input ///
project revenue somevar
1 0 5
1 0 8
1 2 10
1 0 5
2 0 5
2 0 90
2 0 2
2 0 0
3 0 76
3 0 5
3 0 23
3 0 4
4 0 75
4 8 2
4 0 9
4 0 6
5 0 88
5 0 20
5 0 9
5 0 14
end
list
*----- what you want -----
collapse (sum) revenue somevar, by(project)
keep if revenue == 0
That will destroy the database, of course, but it might be useful anyway. You don't really specify if this approach is acceptable or not.
For a table, you can flag projects with revenue equal to zero and condition on that:
bysort project (revenue): gen revzero = revenue[_N] == 0
tabstat somevar if revzero, by(project) stat(sum)
If you have missing or negative revenues, modifications are required.
Upvotes: 2