Reputation: 1
I have a variable age, 13 variables x1 to x13, and 802 observations in a Stata dataset. The values of age range from 1 to 9, and the values of x1 to x13 range from 1 to 13.
I want to count how often each of the values 1 to 13 occurs in x1 to x13, separately for each value of age. For example, for age 1, count the number of 1s, 2s, 3s, 4s, ..., 13s across x1 to x13.
I first converted x1 to x13 into a matrix using
mkmat x1 x2 x3 x4 x5 x6 x7 x8 x9 x10 x11 x12 x13, matrix (a)
Then, I want to count using the following loop:
gen count = 0
quietly forval i = 1/802 {
    quietly forval j = 1/13 {
        replace count = count + inrange(a[r'i', x'j'], 0, 1), if age==1
    }
}
I failed.
Upvotes: 0
Views: 590
Reputation: 37233
With Aspen's example, you could do this:
gen id = _n
reshape long x, i(id)
tab age x
Note that your sample code doesn't loop over different ages, and there is an incorrect comma before the if qualifier in the replace command. I won't try to fix the code, as there are many more direct methods, one of which is above. tabulate has an option to save the table as a matrix.
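As a minimal sketch of that last point, assuming the data have already been reshaped to long form as above: for a two-way table, tabulate accepts matcell() for the cell frequencies, plus matrow() and matcol() for the row and column values. The matrix names freq, agevals, and xvals are only illustrative.

```stata
* Sketch only: assumes the reshape above has been run, so that x holds
* the values from all 13 original columns.
* freq, agevals and xvals are illustrative matrix names.
tabulate age x, matcell(freq) matrow(agevals) matcol(xvals)
matrix list freq
```

The saved matrix has rows and columns only for values that actually occur in the data, so agevals and xvals are needed to label them reliably.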
Here is another solution closer to the original idea. Warning: code not tested.
matrix count = J(9, 13, 0)
forval i = 1/9 {
    forval j = 1/13 {
        forval J = 1/13 {
            qui count if age == `i' & x`J' == `j'
            matrix count[`i', `j'] = count[`i', `j'] + r(N)
        }
    }
}
Upvotes: 1
Reputation: 735
I am still somewhat uncertain as to what you would like to achieve, but if I am understanding you correctly, here is one way to do it.
First, a simple dataset in which age ranges from 1 to 3 and four variables, x1-x4, each take integer values between 5 and 7.
clear
input age x1 x2 x3 x4
1 5 6 6 6
1 7 5 6 5
2 5 7 6 6
3 5 6 7 7
3 7 6 6 6
end
Then we create three count variables (n5, n6 and n7) that count the number of 5s, 6s, and 7s for each subject across x1-x4.
forval i=5/7 {
    egen n`i'=anycount(x1 x2 x3 x4),v(`i')
}
Below is what the data look like now. To explain, the first "1" under n5 indicates that there is only one "5" for that subject across x1-x4.
     +----------------------------------------+
     | age   x1   x2   x3   x4   n5   n6   n7 |
     |----------------------------------------|
  1. |   1    5    6    6    6    1    3    0 |
  2. |   1    7    5    6    5    2    1    1 |
  3. |   2    5    7    6    6    1    2    1 |
  4. |   3    5    6    7    7    1    1    2 |
  5. |   3    7    6    6    6    0    3    1 |
     +----------------------------------------+
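A quick sanity check on these counts, offered only as a sketch: if every one of x1-x4 is 5, 6, or 7 (true for the example data, but an assumption for any other dataset), the three counts must add up to 4 in every row.

```stata
* Assumes every value of x1-x4 is 5, 6 or 7, as in the example data;
* assert stops with an error if any row violates this.
assert n5 + n6 + n7 == 4
```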
It sounds to me like your ultimate goal is to have sums calculated separately for each value in age. Assuming this is true, let's create a 3x3 matrix to store such results.
mat A=J(3,3,.) // age (1-3) and values (5-7)
mat rown A=age1 age2 age3
mat coln A=value5 value6 value7
forval i=5/7 {
    forval j=1/3 {
        qui su n`i' if age==`j'
        local k=`i'-4 // the first column for value5
        mat A[`j',`k']=r(sum)
    }
}
The matrix looks like this. To explain, the first "3" under value5 indicates that, for all children of age 1, the value 5 appears a total of three times across x1-x4.
A[3,3]
      value5  value6  value7
age1       3       4       1
age2       1       2       1
age3       1       4       3
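If a matrix is not strictly required, the same per-age totals can probably be obtained more directly with collapse. This is a sketch rather than a replacement for the loop above; it assumes n5, n6 and n7 already exist, and it uses preserve and restore so that the original data come back afterwards.

```stata
* Sketch only: assumes n5, n6 and n7 were created as shown earlier.
preserve
collapse (sum) n5 n6 n7, by(age)   // one row per age, summed counts
list, noobs
restore
```

The listed rows should carry the same totals as the rows of A, with one observation per value of age.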
Upvotes: 1