airplaneman19

Reputation: 1159

Variable Declaration Regex

I'm trying to write a simple Ruby regex to detect a JavaScript variable declaration, but it fails to match.

Regex:

lines.each do |line|
     unminifiedvar = /var [0-9a-zA-Z] = [0-9];/.match(line)
     next if unminifiedvar == nil #no variable declarations on the line
     #...
end

Testing Line:

var testvariable10 = 9;

Upvotes: 0

Views: 340

Answers (3)

Gareth McCaughan

Reputation: 19971

A variable name can have more than one character, so you need a + after the character-set [...]. (Also, JS variable names can contain other characters besides alphanumerics.) A numeric literal can have more than one character, so you want a + on the RHS too.
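To make those two fixes concrete, here is a minimal sketch. It assumes ASCII-only identifiers (real JavaScript identifiers also allow Unicode letters), and the constant name SIMPLE_DECL is made up for illustration.

```ruby
# Hypothetical pattern name; assumes ASCII-only identifiers.
# A JS identifier may start with a letter, `$` or `_`, and continue
# with those characters or digits. The `+` after [0-9] allows
# multi-digit numeric literals.
SIMPLE_DECL = /var [A-Za-z_$][A-Za-z0-9_$]* = [0-9]+;/

puts SIMPLE_DECL.match?("var testvariable10 = 9;")  # true
puts SIMPLE_DECL.match?("var $el = 42;")            # true
puts SIMPLE_DECL.match?("var 9lives = 1;")          # false: starts with a digit
```

This still only handles a single non-negative integer on the right-hand side.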

More importantly, though, there are lots of other bits of flexibility that you'll find more painful to process with a regular expression. For instance, consider var x = 1+2+3; or var myString = "foo bar baz";. A variable declaration may span several lines. It need not end with a semicolon. It may have comments in the middle of it. And so on. Regular expressions are not really the right tool for this job.
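As a rough illustration of those cases, the sketch below runs a "fixed" pattern (with + on both sides) over a few valid declarations. The names NAIVE_DECL and samples are hypothetical, and the comment example is my own rendering of "comments in the middle".

```ruby
# Hypothetical pattern for illustration: the regex with + on both sides.
NAIVE_DECL = /var [0-9a-zA-Z]+ = [0-9]+;/

samples = [
  "var x = 1+2+3;",                 # expression on the right-hand side
  'var myString = "foo bar baz";',  # string literal
  "var y = 5",                      # no trailing semicolon
  "var z = /* note */ 7;",          # comment inside the declaration
]

# Collect the valid declarations that the pattern fails to recognise.
missed = samples.reject { |js| NAIVE_DECL.match?(js) }
puts missed.length  # 4: every sample is missed
```

Each of these could be patched individually, but the pattern grows quickly and still would not cover declarations that span several lines.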

Of course, it may happen that you're parsing code from a particular source with a very special structure and can guarantee that every declaration has the particular form you're looking for. In that case, go ahead, but if there's any danger that the nature of the code you're processing might change, then you're going to be facing a painful problem that regular expressions really aren't designed to solve.

[EDITED about a day after writing, to fix a mistake kindly pointed out by "the Tin Man".]

Upvotes: 5

Alex D

Reputation: 30445

Try /var [0-9a-zA-Z]+ = \d+;/

Without the +, [0-9a-zA-Z] will only match a single alphanumeric character. With +, it can match 1 or more alphanumeric characters.

By the way, to make it more robust, you may want to make it match any number of spaces between the tokens, not just exactly one space each. You may also want to make the semicolon at the end optional (because JavaScript syntax doesn't require a semicolon). You might also want to make it always match against the whole line, not just a part of the line. That would be:

/\Avar\s+[0-9a-zA-Z]+\s*=\s*\d+;?\Z/
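A quick way to check what that pattern accepts and rejects; the constant name DECL is just for illustration.

```ruby
# The pattern from above; the constant name is hypothetical.
DECL = /\Avar\s+[0-9a-zA-Z]+\s*=\s*\d+;?\Z/

puts DECL.match?("var testvariable10 = 9;")  # true
puts DECL.match?("var   count=42")           # true: flexible spacing, no semicolon
puts DECL.match?("var x = 1;\n")             # true: \Z allows one trailing newline
puts DECL.match?("  var x = 1;")             # false: leading whitespace before var
puts DECL.match?("var x = 1; // comment")    # false: trailing text after the semicolon
```

Because \A anchors at the very start of the string, indented lines will not match unless you strip them first or add \s* after \A.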

(A shorter near-equivalent of [0-9a-zA-Z] is \w, which in Ruby matches [a-zA-Z0-9_], so it also accepts the underscore.)

Upvotes: 1

Brigand

Reputation: 86230

You forgot the +, which means "one or more", so your pattern only allows a single character for the variable name.

var [0-9a-zA-Z]+ = [0-9];

You may also want to add a + after the [0-9]. That way it can match multiple digits.

var [0-9a-zA-Z]+ = [0-9]+;

http://rubular.com/r/kPlNcGRaHA
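Plugged back into the loop from the question, that might look like the sketch below. The sample lines array and the named captures (name, value) are assumptions added for illustration, not part of the original code.

```ruby
# Hypothetical sample input standing in for the asker's `lines`.
lines = [
  "var testvariable10 = 9;",
  "console.log(testvariable10);",
  "var total = 120;",
]

# Corrected pattern with named captures (the capture names are illustrative).
decl = /var (?<name>[0-9a-zA-Z]+) = (?<value>[0-9]+);/

found = []
lines.each do |line|
  m = decl.match(line)
  next if m.nil?  # no variable declaration on this line
  found << [m[:name], m[:value].to_i]
end

p found  # [["testvariable10", 9], ["total", 120]]
```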

Upvotes: 1
