Reputation: 13
I'm attempting to create a new variable in my dataset that stores a number, which is derived from a computation on another number from the same observation.
* Here is what my dataset looks like:
SubjectID Score MyNewScore
1001 5442822 0
1002 6406134 0
1003 16 0
Now, the variable Score is the sum of up to 23 distinct numbers (I'll call them "Responses"), ranging from 1 to 8,388,608.
/* Example of response values
1st response = 1
2nd response = 2
3rd response = 4
4th response = 8
5th response = 16
6th response = 32
...
23rd response = 8,388,608
*/
MyNewScore contains a count of the distinct responses used to obtain the value in Score. In my example dataset, MyNewScore should equal 9 for the first observation, as there are 9 responses used to arrive at a sum of 5,442,822.
I have nested a forvalues loop within a while loop in Stata that successfully calculates MyNewScore, but I do not know how to replace the 0 that currently exists in the dataset with the result of my nested loops.
Stata code used to calculate the value I'm after:
// Build a loop to create a Roland Morris Score
local score = 16
local count = 0
while `score' != 0 {
    local ItemCode
    forvalues i=1/24 {
        local j = 2^(`i' - 1)
        if `j' >= `score' continue, break
        local ItemCode `j'
        * display "`ItemCode'"
    }
    local score = `score' - `ItemCode'
    if `score' > 1 {
        local count = `count' + 1
        display "`count'"
    }
    else if `score' == 1 {
        local count = `count' + 1
        display "`count'"
        continue, break
    }
}
How do I replace the 0s in MyNewScore with the output from the nested loops? I have tried nesting these two loops in another while loop with a replace command, but that simply applies the count from the first observation to all observations in the dataset.
Upvotes: 1
Views: 3824
Reputation: 1051
I think there's an error in the value of the 23rd response: it should be 2^(23-1), which is 4,194,304.
The sum of the first 4 responses is 15; that's 1+2+4+8 or 2^4-1. The sum of all 23 responses is 2^23 - 1, so the largest possible value for Score is 8,388,607.
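If you want to check that arithmetic yourself, a quick sketch along these lines should do it. It only uses display as a calculator and assumes it is run from a do-file, since // comments are not recognized at the interactive prompt.

```stata
* Sketch only: check the powers-of-two arithmetic with display.
display 2^(23 - 1)      // value of the 23rd response
display 1 + 2 + 4 + 8   // sum of the first 4 responses
display 2^4 - 1         // same sum in closed form
display 2^23 - 1        // sum of all 23 responses, the largest possible Score
```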
There's no need for a loop over observations here. Start with a cloned copy of the Score variable, then loop over the responses from the highest down to 1. At each pass, if the remaining score is greater than or equal to the value of the response, count that response and subtract its value from the remaining score.
* Example generated by -dataex-. To install: ssc install dataex
clear
input long(SubjectID Score)
1001 5442822
1002 6406134
1003 16
1004 1
1005 19
1006 15
1007 8388607
end
clonevar x = Score
gen wanted = 0
qui forvalues i=23(-1)1 {
    local response = 2^(`i'-1)
    replace wanted = wanted + 1 if x >= `response'
    replace x = x - `response' if x >= `response'
}
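To tie this back to the question's dataset, here is a hedged sketch of the same idea that writes straight into MyNewScore instead of a new variable. The variable name remaining is my own choice for illustration, and the sketch assumes Score is never missing (a missing value compares as larger than any number in Stata, so it would be counted on every pass) and never exceeds 2^23 - 1.

```stata
* Sketch under stated assumptions: Score is nonmissing and at most 2^23 - 1.
clear
input long(SubjectID Score) byte MyNewScore
1001 5442822 0
1002 6406134 0
1003 16 0
end

* Work on a copy so that Score itself is left untouched.
clonevar remaining = Score

quietly forvalues i = 23(-1)1 {
    local response = 2^(`i' - 1)
    * Count the response wherever it still fits, then subtract it.
    replace MyNewScore = MyNewScore + 1 if remaining >= `response'
    replace remaining = remaining - `response' if remaining >= `response'
}

* Every score should be fully decomposed; this stops with an error if not.
assert remaining == 0
drop remaining

list, noobs
```

For the first observation this should give the 9 mentioned in the question, and 1 for a Score of 16. The counting replace has to come before the subtracting one, because both use the same condition on remaining.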
Upvotes: 1
Reputation: 887
I think all that you would need to do is nest your code in a loop that goes through each observation in your dataset, like so:
// get total number of observations in dataset
local N = _N
// go through each observation and run the while loop
forvalues observation = 1/`N' {
    local score = Score[`observation']
    local count = 0
    // your while loop here
    while `score' != 0 {
        ...
    }
    replace MyNewScore = `count' in `observation' // (or whatever value you're after)
}
Is this what you're after?
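For what it's worth, here is a minimal sketch of the per-observation approach with the inner step written out. It does not reuse the while loop from the question; instead it swaps in the highest-to-lowest subtraction described in the other answer, applied to local macros one observation at a time. It assumes Score and MyNewScore already exist and that Score is never missing. The small input block is only there so the sketch runs on its own.

```stata
* Sketch only: row-by-row version; slower than a vectorized replace.
clear
input long(SubjectID Score) byte MyNewScore
1001 5442822 0
1002 6406134 0
1003 16 0
end

forvalues obs = 1/`=_N' {
    * Copy this observation's Score into a local macro.
    local score = Score[`obs']
    local count = 0
    forvalues i = 23(-1)1 {
        local response = 2^(`i' - 1)
        if `score' >= `response' {
            local score = `score' - `response'
            local count = `count' + 1
        }
    }
    * Write the count back into this observation only.
    quietly replace MyNewScore = `count' in `obs'
}

list, noobs
```

The `in` qualifier on replace is what keeps each count confined to its own observation, which is the step that was missing when the first observation's count was applied to every row.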
Upvotes: 0