TerribleStudent

Reputation: 61

Replace string at position

I have a string variable in Stata holding unique ID codes with different structures. I want to (i) replace a specific character at a specific position (an erroneous data entry) and (ii) keep the last part in a new variable that starts after some

To illustrate, some of my IDs have the following structure: 000XXXX000XXXX000001234560780912340567

where the X's are string characters, some of which are wrong. For example, there are numbers at the 4th X (position 7), which I want to replace with the correct character.

Regarding (ii), the last part is a unique number sequence that I want to keep in a new variable. The problem is that the length of this sequence varies, so it does not always start at the same position. It seems, however, that it always starts after "00", so that looks like a sensible place to start.

I have tried substr() and subinstr() but have not been able to solve it.

Upvotes: 0

Views: 131

Answers (1)

Nick Cox

Reputation: 37278

clear 
set obs 1 
gen whatever = "000XXXX000XXXX000001234560780912340567"
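* replace the character at position 7: keep 1-6, put in "Y", resume at 8
replace whatever = substr(whatever, 1, 6) + "Y" + substr(whatever, 8, .)

* keep everything after the last occurrence of "00"
gen wanted = substr(whatever, strrpos(whatever, "00") + 2, .)

di whatever[1]
di wanted[1]

Results:

. di whatever[1]
000XXXY000XXXX000001234560780912340567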

. di wanted[1]
1234560780912340567
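
One caveat: strrpos() looks for the last "00" in the string, so it would cut into the number sequence if the sequence itself ever contained "00". A possible alternative is sketched below, under the assumption (mine, not stated in the question) that the wanted part is the trailing run of digits beginning at its first nonzero digit. The second ID is made up purely to show the difference, and by_pos and by_regex are illustrative variable names. ustrregexm() and ustrregexs() need Stata 14 or later.

```stata
clear
set obs 2
gen whatever = "000XXXX000XXXX000001234560780912340567" in 1
* hypothetical ID whose number sequence contains "00"
replace whatever = "000XXXX000XXXX0001200345" in 2

* position-based: takes what follows the LAST "00"
gen by_pos = substr(whatever, strrpos(whatever, "00") + 2, .)

* regex-based: digits running to the end of the string,
* starting at a nonzero digit (assumed structure)
gen by_regex = ustrregexs(0) if ustrregexm(whatever, "[1-9][0-9]*$")

list by_pos by_regex
```

For the made-up second ID, the position-based rule should keep only what follows the "00" inside the sequence, while the regex keeps the sequence from its first nonzero digit. Which rule is right depends on how the IDs were actually constructed, so check a sample of results by eye.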

Upvotes: 0
