C. W.

Reputation: 25

How to find strings beginning with X?

I am trying to identify strings that begin with X using the function regexm() in Stata.

My code:

for var lookin: count if regexm(X, "X")

I have tried double quotes, square brackets, and patterns for the remaining characters of the string, such as X[0-9][0-9], but to no avail.

I expect the count to be about 1000, but it returns 0.

Upvotes: 0

Views: 1055

Answers (2)

Nick Cox

Reputation: 37368

In Stata, for is ancient and now undocumented syntax. If you are using a version of Stata old enough that it is still documented, you would do better to say so.

In for, X is the default placeholder for the loop element, and it is substituted everywhere it appears in the command, including inside the quoted regular expression.

Hence your syntax -- looping over a single variable -- reduces to

count if regexm(lookin, "lookin") 

and even without a data example we can believe that the answer is 0.

This would be legal and is closer to what you seek:

for Y in var lookin : count if regexm(Y, "X")

but the regular expression is still wrong, as @Pearly Spencer points out: "X" matches an X anywhere in the string, so it needs to be anchored at the start as "^X".

Incidentally,

count if strpos(lookin, "X") == 1 

is a direct alternative to your code.

In any Stata that supports regexm() you should be looping with foreach or forvalues.
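
As a minimal sketch of the foreach form recommended above, assuming the anchored pattern "^X" is what is wanted: only the variable name lookin comes from the question, and the four data values are hypothetical, made up so that the example runs on its own.

```stata
clear
* Hypothetical toy data; only the variable name lookin comes from the question
input str8 lookin
"X12"
"aX3"
"X99"
"Y01"
end

* Loop over a list of string variables (here just one) and count
* the observations whose value begins with X
foreach v of varlist lookin {
    count if regexm(`v', "^X")
}
```

With these toy values the count is 2, because "aX3" contains an X but does not begin with one. More variable names can be added after varlist to check several variables in turn.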

Upvotes: 1

user8682794

Reputation:

The following works for me:

clear
input str22 foo
"Xhello"
"this is a X sentence"
"X a silly one"
"but serves the purpose"
end

generate tag = strmatch(foo, "X*")

list

     +------------------------------+
     |                    foo   tag |
     |------------------------------|
  1. |                 Xhello     1 |
  2. |   this is a X sentence     0 |
  3. |          X a silly one     1 |
  4. | but serves the purpose     0 |
     +------------------------------+

count if tag
2

This is the regular expression solution based on the above example. Because tag already exists at this point, the result goes into a new variable:

generate tag2 = regexm(foo, "^X")
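
To check that the approaches mentioned in this thread agree, here is a small sketch that rebuilds the example data and computes the indicator three ways. The variable names by_strmatch, by_regexm and by_strpos are made up for illustration.

```stata
clear
input str22 foo
"Xhello"
"this is a X sentence"
"X a silly one"
"but serves the purpose"
end

* Three tests for "begins with X"
generate byte by_strmatch = strmatch(foo, "X*")
generate byte by_regexm   = regexm(foo, "^X")
generate byte by_strpos   = strpos(foo, "X") == 1

* assert stops with an error if the indicators ever disagree
assert by_strmatch == by_regexm & by_regexm == by_strpos

count if by_regexm
```

On this data all three indicators flag the first and third observations, so the final count is 2.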

Upvotes: 2
