maja

Reputation: 18034

Regex groups in JavaScript

I have a problem with a JavaScript regular expression.

This is a very simplified regexp that demonstrates my problem:

(?:\s(\+\d\w*))|(\w+)

This regex should only match strings that don't contain forbidden characters (anything that is not a word character).

The only exception is the symbol +.
A match is allowed to start with this symbol if a digit [0-9] follows it. A + must not appear within words (44+44 is not a valid match, but +4ad is).

To allow the + only at the beginning, I required a whitespace character in front of it. However, I don't want the whitespace to be part of the match.

I tested my regex with this tool: http://regex101.com/#javascript and the resulting matches look fine.

There are 2 Issues with that regexp:

My Questions:

Here's my JS-Code:

var input =  "+5ad6  +5ad6 sd asd+as +we";
var regexp = /(?:\s(\+\d\w*))|(\w+)/g;
var tokens = input.match(regexp);
console.log(tokens);

Upvotes: 1

Views: 156

Answers (1)

Bergi

Reputation: 664196

What should the regex look like?

You have several options to reach your goal:

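  • It's fine as you have it. You might allow the start of the string in place of the whitespace as well, though. Just read the capturing groups (groups 1 and 2 of each match) instead of the full match; they do not include the whitespace. Note that match() with the g flag returns only the full matches, so the groups have to be read per match, for example with exec() in a loop.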
  • A lookbehind could help. Unfortunately, JavaScript did not support it when this answer was written; lookbehind assertions were only added to the language in ES2018.
  • Require a non-word-boundary (\B) before the +, so that any \w character directly before the + prevents the match:

    /\B\+\d\w*|\w+/
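The first and third options above can be sketched as follows. This is only an illustration, not a definitive implementation: the function names tokenizeWithGroups and tokenizeWithNonBoundary are made up for this example, the (?:^|\s) prefix is one possible way to also accept the start of the string, and both patterns use the \w* quantifier from the question so that a lone "+5" still counts as a token.

```javascript
// Sample input taken from the question.
var input = "+5ad6  +5ad6 sd asd+as +we";

// Option 1 (hypothetical helper): keep the whitespace in the pattern,
// but collect the capturing groups instead of the full match.
function tokenizeWithGroups(text) {
  // (?:^|\s) also accepts the start of the string before the +.
  var regexp = /(?:^|\s)(\+\d\w*)|(\w+)/g;
  var tokens = [];
  var m;
  // exec() with the g flag walks through the string match by match
  // and exposes the groups of each match as m[1] and m[2].
  while ((m = regexp.exec(text)) !== null) {
    tokens.push(m[1] !== undefined ? m[1] : m[2]);
  }
  return tokens;
}

// Option 3 (hypothetical helper): \B consumes no characters, so the
// full matches returned by match() are already the wanted tokens.
function tokenizeWithNonBoundary(text) {
  return text.match(/\B\+\d\w*|\w+/g) || [];
}

console.log(tokenizeWithGroups(input));
console.log(tokenizeWithNonBoundary(input));
// Both log: ["+5ad6", "+5ad6", "sd", "asd", "as", "we"]
```

In both variants the + inside asd+as is skipped and the two sides are returned as separate word tokens, and +we is reduced to we because no digit follows the +.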
    

Why does this regex add the space to the matches?

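Because the regex does match the whitespace: \s is part of the overall match, and the overall matches are what match() returns. The non-capturing group (?:...) only prevents \s(\+\d\w*) as a whole from becoming a captured group of its own; it does not remove the whitespace from the match.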
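A small check, using the regex and input from the question, makes the difference between the full match and the captured group visible. The variable names fullMatches and single are only illustrative.

```javascript
var input = "+5ad6  +5ad6 sd asd+as +we";

// With the g flag, match() returns the full matches only,
// so the whitespace consumed by \s shows up in the second entry.
var fullMatches = input.match(/(?:\s(\+\d\w*))|(\w+)/g);
console.log(fullMatches);
// ["5ad6", " +5ad6", "sd", "asd", "as", "we"]

// Without the g flag, exec() exposes the groups of a single match.
var single = /(?:\s(\+\d\w*))|(\w+)/.exec(" +5ad6");
console.log(JSON.stringify(single[0])); // " +5ad6" (full match, with the space)
console.log(JSON.stringify(single[1])); // "+5ad6"  (group 1, without the space)
console.log(single[2]);                 // undefined (second alternative not used)
```

The first entry is 5ad6 rather than +5ad6 because no whitespace precedes the + at the very start of the input, so only the \w+ alternative can match there.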

Upvotes: 2
