Reputation: 321
I want to find all occurrences of a variable name in a file, let's say the variable test:
int test;
But I don't want to match the variable name when it's inside a string, like
String s = "This is a test!";
I tried ([^\"])([a-zA-Z_$][\\w$]*)([^\"]), but it doesn't work.
Upvotes: 0
Views: 588
Reputation: 23047
One idea is to temporarily cut all strings out of the source code and then search for the variable name.
Assuming the source code is valid (no syntax errors), you can cut everything from the first double quote (") to the next unescaped double quote, and repeat that for every remaining string.
Note that variable names with just one character (like d) will require some additional code, because d is also used as a suffix that makes the compiler treat the preceding number as a double (e.g. double dbl = 6d).
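A minimal sketch of this cut-then-search idea in Java. The class and method names are made up for illustration. It blanks out string literals with spaces instead of deleting them, so the reported offsets still refer to the original text, and it assumes that strings never span lines. Comments and char literals such as '"' are not handled. The lookarounds around the name reject matches that touch another identifier character, which also covers the d in 6d.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class StringStrippingSearch {

    // A double-quoted literal on a single line; \\. skips escaped characters such as \" inside it.
    private static final Pattern STRING_LITERAL =
            Pattern.compile("\"(?:[^\"\\\\\\n]|\\\\.)*\"");

    // Overwrites every string literal with spaces so that character offsets stay unchanged.
    static String blankStringLiterals(String source) {
        StringBuilder out = new StringBuilder(source);
        Matcher m = STRING_LITERAL.matcher(source);
        while (m.find()) {
            for (int i = m.start(); i < m.end(); i++) {
                out.setCharAt(i, ' ');
            }
        }
        return out.toString();
    }

    // Returns the start offsets of name where it is not part of a longer identifier or number.
    static List<Integer> findIdentifier(String source, String name) {
        String stripped = blankStringLiterals(source);
        Pattern p = Pattern.compile("(?<![\\w$])" + Pattern.quote(name) + "(?![\\w$])");
        Matcher m = p.matcher(stripped);
        List<Integer> offsets = new ArrayList<>();
        while (m.find()) {
            offsets.add(m.start());
        }
        return offsets;
    }

    public static void main(String[] args) {
        String source = "int test;\nString s = \"This is a test!\";\ntest = 1;";
        System.out.println(findIdentifier(source, "test"));
    }
}
```

With the sample input in main, the occurrence inside the string is skipped and only the declaration and the assignment are reported.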
EDIT: I was assuming that you wanted to build an application or a piece of code that does a lightweight check for variable names.
If you are working inside an editor, I recommend using an IDE like NetBeans or Eclipse.
Otherwise, if you also want to check for correct syntax, you'll need to build your own parser (or download an existing one from the internet).
Upvotes: 0
Reputation: 1063
I’m afraid regular expressions are not the best fit for your problem. Since there are a lot of semantics to consider when parsing source code, it is very unlikely that you can come up with a reliable expression that doesn’t get confused by things like escaped quotes within strings.
A better way to parse source code (and reliably detect things like variable names) is to use a generated parser that knows the grammar of the file being parsed. SableCC is designed for this, and it also conveniently provides a grammar file for Java 1.5.
It will basically tokenize the given source code and add type information to each token. This way you can simply iterate over all tokens and rebuild the source while replacing every token that matches your search term and is of type variable.
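To illustrate the token-based idea without a generated parser, here is a rough sketch that uses the JDK's java.io.StreamTokenizer as a stand-in. It is not SableCC and does not use SableCC's API. The class and method names are hypothetical, and the tokenizer only distinguishes words from quoted strings and comments, so it cannot tell a variable from any other identifier with the same name.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

class TokenSearch {

    // Returns the 1-based line numbers on which name occurs as a word token.
    static List<Integer> linesWithIdentifier(String source, String name) {
        StreamTokenizer st = new StreamTokenizer(new StringReader(source));
        st.resetSyntax();              // start from a blank table: every character is "ordinary"
        st.wordChars('a', 'z');
        st.wordChars('A', 'Z');
        st.wordChars('0', '9');        // digits count as word characters, so 6d is one token
        st.wordChars('_', '_');
        st.wordChars('$', '$');
        st.whitespaceChars(0, ' ');
        st.quoteChar('"');             // a string literal becomes a single token
        st.quoteChar('\'');            // so does a char literal
        st.slashSlashComments(true);   // skip // comments
        st.slashStarComments(true);    // skip /* */ comments

        List<Integer> lines = new ArrayList<>();
        try {
            while (st.nextToken() != StreamTokenizer.TT_EOF) {
                // Only plain word tokens are compared; quoted strings have a different ttype.
                if (st.ttype == StreamTokenizer.TT_WORD && name.equals(st.sval)) {
                    lines.add(st.lineno());
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e); // not expected when reading from a String
        }
        return lines;
    }

    public static void main(String[] args) {
        String source = "int test;\nString s = \"This is a test!\";\n// test in a comment\ntest = 1;";
        System.out.println(linesWithIdentifier(source, "test"));
    }
}
```

A real parser additionally knows which identifiers are variables, which is the type information that this sketch lacks.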
Upvotes: 2
Reputation: 33928
As I said in the comment, using regex for this is generally not a good idea. You should use some kind of parser for this.
But anyway here is a simple hack that will work for some cases:
(?xm) \b test \b
(?=
(?:[^\n"\\]+|\\.)*
(?:(?:"(?:[^\n"\\]+|\\.)*){2})*
$
)
Java quoted:
"(?m)\\btest\\b(?=(?:[^\n\"\\\\]+|\\\\.)*(?:(?:\"(?:[^\n\"\\\\]+|\\\\.)*){2})*$)"
It does not handle comments, and things like a quote character inside a comment or a char literal will break it.
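For completeness, a small sketch of how the pattern could be used from Java. The class and method names are made up, and the name test is replaced by a parameter. The lookahead only accepts a match when an even number of double quotes follows on the same line, which is why a match inside a string is rejected.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuoteParityDemo {

    // Builds the pattern for an arbitrary name instead of the hard-coded "test".
    static Pattern outsideStrings(String name) {
        return Pattern.compile(
                "(?m)\\b" + Pattern.quote(name) + "\\b"
                + "(?=(?:[^\n\"\\\\]+|\\\\.)*(?:(?:\"(?:[^\n\"\\\\]+|\\\\.)*){2})*$)");
    }

    // Returns the start offset of every match in source.
    static List<Integer> find(String source, String name) {
        Matcher m = outsideStrings(name).matcher(source);
        List<Integer> offsets = new ArrayList<>();
        while (m.find()) {
            offsets.add(m.start());
        }
        return offsets;
    }

    public static void main(String[] args) {
        String source = "int test;\nString s = \"This is a test!\";\ntest = 1;";
        System.out.println(find(source, "test"));
    }
}
```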
Upvotes: 1