SirMatticus

Reputation: 13

Using a loop to replace values of a variable

I'm attempting to create a new variable in my dataset that stores a number, which is derived from a computation on another number from the same observation.

* Here is what my dataset looks like:
SubjectID    Score     MyNewScore
1001         5442822   0
1002         6406134   0
1003         16        0

Now, the variable Score is the sum of up to 23 distinct numbers (I'll call them "Responses"), ranging from 1 to 8,388,608.

/* Example of response values
1st response = 1
2nd response = 2
3rd response = 4
4th response = 8
5th response = 16
6th response = 32
...
23rd response = 8,388,608
*/

MyNewScore should contain a count of the distinct responses that sum to the value in Score. In my example dataset, MyNewScore should equal 9 for subject 1001, as 9 responses are needed to arrive at a sum of 5,442,822.

I have nested a forvalues loop within a while loop in Stata that calculates MyNewScore for a single hard-coded score, but I do not know how to replace the 0 that currently exists in the dataset with the result of my nested loops.

Stata code used to calculate the value I'm after:

// Build a loop to create a Roland Morris Score
local score = 16
local count = 0

while `score' != 0 {

    local ItemCode
    forvalues i=1/24 {
        local j = 2^(`i' - 1)
        if `j' >= `score' continue, break
        local ItemCode `j'
        * display "`ItemCode'"
    }

    local score = `score' - `ItemCode'
    if `score' > 1 {
        local count = `count' + 1
        display "`count'"
    }
    else if `score' == 1 {
        local count = `count' + 1
        display "`count'"
        continue, break
    }
}

How do I replace the 0s in MyNewScore with the output from the nested loops? I have tried nesting these two loops in another while loop with a replace command, but that simply applies the count from the first observation to all observations in the dataset.

Upvotes: 1

Views: 3824

Answers (2)

Robert Picard

Reputation: 1051

I think there's an error in the value of the 23rd response: it should be 2^(23-1), which is 4,194,304.

The sum of the first 4 responses is 15; that's 1+2+4+8 or 2^4-1. The sum of all 23 responses is 2^23 - 1 so the largest possible value for Score is 8,388,607.
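
These figures can be confirmed in Stata itself. A quick check, using only display, a local macro (the name total is just for illustration), and assert:

```stata
* value of the 23rd response, counting the 1st response as 2^0
display %12.0fc 2^(23-1)

* sum of all 23 responses: 1 + 2 + 4 + ... + 2^22
local total = 0
forvalues i = 1/23 {
    local total = `total' + 2^(`i'-1)
}
display %12.0fc `total'

* the running sum agrees with the closed form 2^23 - 1
assert `total' == 2^23 - 1
```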

There's no need for a loop over observations here. Start with a cloned copy of the Score variable, then loop over the responses from the highest down to the first. At each pass, if the current score is greater than or equal to the value of the response, count that response and subtract its value from the score. Each replace acts on all observations at once.

* Example generated by -dataex-. To install: ssc install dataex
clear
input long(SubjectID Score)
1001 5442822
1002 6406134
1003      16
1004       1
1005      19
1006      15
1007 8388607
end

clonevar x = Score
gen wanted = 0
qui forvalues i=23(-1)1 {
    local response = 2^(`i'-1)
    replace wanted = wanted + 1 if x >= `response'
    replace x = x - `response' if x >= `response'
}
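
As a cross-check, the same count can probably also be obtained without subtracting from a copy of Score, by testing each binary digit directly. This is a minimal sketch of that alternative, not the method above; the variable name check is an assumption for illustration, and the block starts with clear, so it replaces any data in memory.

```stata
clear
input long(SubjectID Score)
1001 5442822
1003      16
1006      15
1007 8388607
end

* count the binary digits of Score that are set, one response at a time
gen byte check = 0
forvalues i = 1/23 {
    * mod(floor(Score/2^(i-1)), 2) is 1 when response i is part of Score
    quietly replace check = check + mod(floor(Score / 2^(`i'-1)), 2)
}

* 5,442,822 is made up of 9 responses, as stated in the question
assert check == 9 if SubjectID == 1001

list, noobs
```

If both approaches are run on the same data, the two counts should agree for any Score between 0 and 2^23 - 1.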

Upvotes: 1

Eric HB

Reputation: 887

I think all you need to do is nest your code in a loop that goes through each observation in your dataset, like so:

// get total number of observations in dataset
local N = _N 

// go through each observation and run the while loop
forvalues observation = 1/`N' {
    local score = Score[`observation']
    local count = 0

    // your while loop here
    while `score' != 0 {
        ...
    }

    replace MyNewScore = `count' in `observation' // (or whatever value you're after)
}
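
For reference, here is a self-contained sketch of this observation-by-observation pattern. The inner loop is an assumption for illustration rather than the loop from the question: it walks the responses from the highest down and counts each one that still fits into the remaining score. The block starts with clear, so it replaces any data in memory.

```stata
clear
input long(SubjectID Score)
1001 5442822
1002 6406134
1003      16
end
gen MyNewScore = 0

local N = _N
forvalues observation = 1/`N' {
    * copy this observation's Score into a local macro
    local score = Score[`observation']
    local count = 0

    * illustrative inner loop: peel off responses, highest first
    forvalues i = 23(-1)1 {
        local response = 2^(`i'-1)
        if `score' >= `response' {
            local score = `score' - `response'
            local count = `count' + 1
        }
    }

    * write the count back to this observation only
    quietly replace MyNewScore = `count' in `observation'
}

list, noobs
```

Looping over observations like this works, but it is likely to be slow on large datasets compared with replace statements that act on all observations at once.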

Is this what you're after?

Upvotes: 0
