kangkan Dc

Reputation: 183

Stata Replacing Part of String

I have a 10-digit string variable called my_var, initially "0000000000". I also have two variables, highclass (between 0 and 10) and lowclass (between 0 and 10).

I need to convert the digits between highclass and lowclass to 1.

For example, if a row has highclass = 5 and lowclass = 1, then my_var should become "1111100000".

I am not sure if the substring command will help me since I need to reference a variable.

Upvotes: 0

Views: 953

Answers (1)

Nick Cox

Reputation: 37183

As I understand it, lowclass is the position of the first 1 and highclass is the position of the last 1.

No loops are needed. In fact, a single statement would do it in Stata (which is the language the question is about).

Two ways to do it:

Old style (particularly pertinent to Stata 12 and below)

Here I have split the single statement into several, because I suspect it is clearer that way. Note that substr() (not substring()) is a function, not a command.

clear 
input str10 my_var lowclass highclass 
"0000000000"  1  5
"0000000000"  2  4
"0000000000"  3  3 
"0000000000"  1  10
"0000000000"  7  10 
end 

local zeros "0000000000"
local ones  "1111111111" 
replace my_var = substr("`zeros'", 1, lowclass - 1)
replace my_var = my_var + substr("`ones'", 1, highclass - lowclass + 1) 
replace my_var = my_var + substr("`zeros'", 1, 10 - highclass) 

list 

     +----------------------------------+
     |     my_var   lowclass   highcl~s |
     |----------------------------------|
  1. | 1111100000          1          5 |
  2. | 0111000000          2          4 |
  3. | 0010000000          3          3 |
  4. | 1111111111          1         10 |
  5. | 0000001111          7         10 |
     +----------------------------------+
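For comparison, the three replace steps above can be collapsed into the single statement mentioned earlier. This is a minimal sketch, not part of the original answer: it assumes 1 <= lowclass <= highclass <= 10 in every row, uses a shortened two-row dataset, and relies on the /// line continuation, so it should be run from a do-file rather than typed interactively.

```stata
* Sketch: old-style substr() approach as one replace statement.
* Assumes 1 <= lowclass <= highclass <= 10 in every row.
clear
input str10 my_var lowclass highclass
"0000000000"  1  5
"0000000000"  7  10
end

local zeros "0000000000"
local ones  "1111111111"

* leading zeros + run of ones + trailing zeros
replace my_var = substr("`zeros'", 1, lowclass - 1) ///
    + substr("`ones'", 1, highclass - lowclass + 1) ///
    + substr("`zeros'", 1, 10 - highclass)

list
```

The local macros must be defined in the same do-file run as the replace, since they vanish when the do-file ends.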

New style (Stata 13 up)

Mata and Stata 13 and later allow string multiplication (e.g. 10 * "1"), so this works:

replace my_var = (lowclass - 1) * "0" + (highclass - lowclass + 1) * "1" + (10 - highclass) * "0" 

Note that, for example, -1 * "0" is perfectly legal but evaluates to missing (the empty string).
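Because a negative multiplier is legal rather than an error, rows with lowclass = 0 or highclass < lowclass would not stop the replace; they could simply leave my_var with a length other than 10. A hedged sketch of some sanity checks around the one-line version follows. The checks are my own addition, not part of the original answer, and assume that valid rows satisfy 1 <= lowclass <= highclass <= 10.

```stata
* Sketch: guard the string-multiplication version (Stata 13 and later).
* The assert lines are illustrative checks, not part of the original answer.
clear
input str10 my_var lowclass highclass
"0000000000"  2  4
"0000000000"  1  10
end

* stop with an error if any row has an out-of-range or missing position
assert inrange(lowclass, 1, 10)
assert inrange(highclass, lowclass, 10)

replace my_var = (lowclass - 1) * "0" + (highclass - lowclass + 1) * "1" + (10 - highclass) * "0"

* every result should still be exactly 10 characters long
assert strlen(my_var) == 10

list
```

If the data can legitimately contain lowclass = 0, the first assert would need to be replaced by whatever rule should apply to those rows.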

Upvotes: 2
