Dave McNulla

Reputation: 2016

What is the difference between "[^"]*" and ".+"

What is the difference between using the two following statements in a Cucumber step definition? When I tested them in Rubular, they both worked in all the cases I could imagine. With the second, my syntax highlighting is more likely to look good (no extra double quote to mess things up).

Even in the Stack Overflow syntax highlighting, it gets goofed up on the first. What are the advantages of the more common first example?

Given /^My name is "([^"]*)"$/ do |myname|
Given /^My name is "(.+)"$/ do |myname|

Upvotes: 2

Views: 130

Answers (4)

lotusphp

Reputation: 227

[^"]* means N (N>=0) characters except "

.+ means N (N>0) characters including "

If the subject has at least one character and contains no quote mark ("), the two regex patterns match the same thing.

But consider this string: My name is "special_name_contain_"_laugh". Run both patterns against it and they are NOT the same :)
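A quick way to check this outside Cucumber is to run the question's two patterns as plain Ruby regexes. This is only a sketch; the constant and variable names are made up for illustration.

```ruby
# The two patterns from the question, as plain Ruby regexes.
NEGATED = /^My name is "([^"]*)"$/
GREEDY  = /^My name is "(.+)"$/

plain  = 'My name is "Dave"'
quoted = 'My name is "special_name_contain_"_laugh"'

# No inner quote: both patterns capture the same text.
plain_negated = plain[NEGATED, 1]
plain_greedy  = plain[GREEDY, 1]

# Inner quote: [^"]* cannot cross it, so the anchored pattern does not
# match at all (nil). .+ crosses it and captures everything between the
# first and the last quote.
quoted_negated = quoted[NEGATED, 1]
quoted_greedy  = quoted[GREEDY, 1]

p plain_negated, plain_greedy, quoted_negated, quoted_greedy
```

So the first pattern rejects a name containing a quote, while the second accepts it with the inner quote included in myname.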

Upvotes: 6

xdazz

Reputation: 160953

[^"] means any char except "

. means any char.

* means match any number of times, including 0.

+ means match at least 1 time.

Upvotes: 1

Sean Vieira

Reputation: 160073

The first will not break when provided the following:

My name is "Henry James" and some other condition is "something else"

The first regular expression limits the characters inside the quoted string to non-quote characters - thus it will only pick up Henry James. The second regular expression matches a quote character followed by anything else (including other quote characters) and then an ending quote character - so myname in the second case would be:

Henry James" and some other condition is "something else

Which means that you can have only one quoted value in your test case - which is far more limiting than the limitation of the first regular expression (you can only have quoted values that do not contain a quote character).
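To see this in plain Ruby, here is a rough sketch. One caveat: with the question's ^...$ anchors, the single-capture [^"]* pattern does not match this longer line at all, so the difference really shows up once a step has two quoted values. The two-value pattern below is hypothetical and not from the question.

```ruby
line = 'My name is "Henry James" and some other condition is "something else"'

# The question's anchored single-capture patterns.
negated_one = /^My name is "([^"]*)"$/
greedy_one  = /^My name is "(.+)"$/

# Hypothetical step pattern with two quoted values (an assumption for
# illustration only).
negated_two = /^My name is "([^"]*)" and some other condition is "([^"]*)"$/

# [^"]* stops at the first closing quote and $ cannot match there,
# so the whole match fails.
negated_one_match = line.match(negated_one)

# .+ runs to the last quote on the line and swallows the second value.
greedy_capture = line[greedy_one, 1]

# With [^"]* each quoted value lands in its own capture group.
two_captures = line.match(negated_two).captures

p negated_one_match, greedy_capture, two_captures
```

The .+ step also matches text that was meant for the longer two-value step, which is the practical reason the [^"]* form is the more common one.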

Upvotes: 5

Burhan Khalid

Reputation: 174728

I'm not a Ruby guru; however, the first regular expression means:

  • ^ Start of line
  • My name is " the literal string followed by a "
  • ( starts a capture group
  • [ starts a character class
  • ^" in a character class, ^ means "not", so in this case it means anything but a "
  • ] end of character class
  • * any number of the preceding, including 0 matches
  • " a quote character
  • $ end of line

In the second one, everything is the same as above, except that in place of the character class [^"] and the * you have:

  • . "any character"
  • + one or more of the preceding

The difference between + and * is that + requires at least one of the preceding, but * will also match if there are zero of the preceding.
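The zero-versus-one difference only matters for an empty quoted value. A minimal sketch in plain Ruby (variable names are illustrative):

```ruby
empty = 'My name is ""'

star_match = /^My name is "([^"]*)"$/.match(empty)
plus_match = /^My name is "(.+)"$/.match(empty)

# * accepts zero characters between the quotes and captures "".
star_capture = star_match && star_match[1]

# + needs at least one character between the quotes, so there is no match.
plus_failed = plus_match.nil?

p star_capture, plus_failed
```

In other words, a step written with [^"]* still fires for an empty name, while the .+ version would leave that step undefined.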

Upvotes: 1
