Reputation: 25
I am trying to identify strings that begin with X using the function regexm() in Stata.
My code:
for var lookin: count if regexm(X, "X")
I have tried using double quotes, square brackets, and adding patterns for the other characters in the string, such as X[0-9][0-9], but to no avail.
I expect the resulting count to be about 1000, but it returns 0.
Upvotes: 0
Views: 1055
Reputation: 37368
The for command in Stata is ancient and now undocumented syntax, unless you are using a very old version of Stata, in which case you would be better flagging that.
In for, X is the default loop element, which is substituted everywhere it is found.
Hence your syntax -- looping over a single variable -- reduces to
count if regexm(lookin, "lookin")
and even without a data example we can believe that the answer is 0.
This would be legal and is closer to what you seek:
for Y in var lookin : count if regexm(Y, "X")
but the regular expression is wrong, as @Pearly Spencer points out: "X" matches an X anywhere in the string, not only at the start.
Incidentally,
count if strpos(lookin, "X") == 1
is a direct alternative to your code.
In any Stata that supports regexm() you should be looping with foreach or forvalues.
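A minimal sketch of the foreach form, assuming a string variable named lookin as in the question. The toy observations are made up for illustration, and the ^ anchor is what restricts the match to the start of the string:

```stata
* hypothetical toy data standing in for the real dataset
clear
input str10 lookin
"X12"
"aX3"
"X99"
"Y01"
end

* foreach replaces the old for syntax; `v' is the loop macro
foreach v of varlist lookin {
    count if regexm(`v', "^X")
}
```

With a single variable the loop is not needed at all: count if regexm(lookin, "^X") does the same job.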
Upvotes: 1
Reputation:
The following works for me:
clear
input str22 foo
"Xhello"
"this is a X sentence"
"X a silly one"
"but serves the purpose"
end
generate tag = strmatch(foo, "X*")
list
     +------------------------------+
     |                    foo   tag |
     |------------------------------|
  1. |                 Xhello     1 |
  2. |   this is a X sentence     0 |
  3. |          X a silly one     1 |
  4. | but serves the purpose     0 |
     +------------------------------+
count if tag
2
This is the regular expression solution based on the above example:
generate tag = regexm(foo, "^X")
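As a quick check, the three approaches mentioned in this thread (strmatch(), regexm() with the ^ anchor, and strpos() == 1) can be compared on the same example data. This is only a sketch; the variable names by_strmatch, by_regexm and by_strpos are made up for illustration:

```stata
clear
input str22 foo
"Xhello"
"this is a X sentence"
"X a silly one"
"but serves the purpose"
end

* one indicator per approach
generate by_strmatch = strmatch(foo, "X*")
generate by_regexm   = regexm(foo, "^X")
generate by_strpos   = strpos(foo, "X") == 1

* assert stops with an error if the approaches ever disagree
assert by_strmatch == by_regexm
assert by_regexm == by_strpos

count if by_regexm
```

If the asserts pass silently, all three flag the same observations on this data.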
Upvotes: 2