treecoder

Reputation: 45091

A Javascript regex needed

Suppose I have a string like 'a b xy c pq', and I need to write a regex so that the match() method returns an array like this: ['a','b','xy','c','pq'].

Let's see some examples:

>>> 'x y z '.match(/\w{1,3}\s+/g)
['x ', 'y ', 'z ']

>>> 'x y z '.match(/(\w{1,3})\s+/g)
['x ', 'y ', 'z ']

As you can see, even when I add parentheses, it returns the same result. I want the result without the whitespace. It would also be nice not to have to add the trailing whitespace to the source string.

Where do you think I can improve this regex to get what I want?

Note that I don't want to use exec() because it has to be run multiple times to get all the matches.

Also note that this problem could be easily solved with split():

>>> 'x y z'.split(/\s/)
['x', 'y', 'z']

>>> "a b xy c pq".split(/\s/)
['a','b','xy','c','pq']

But I also need to validate the string. Each match should have at most three characters, and each match should be an alphanumeric word with no special characters. Hence I cannot use split(), because in that case I'd have to validate each match separately. I want to do it all via a single regex.

The reason I want to do the validation and splitting in a single regex is that I need to do a lot of these within an event, so it has to be as fast as possible.

Upvotes: 2

Views: 96

Answers (2)

MDEV

Reputation: 10838

The regex /(?!.*?\w{4,}.*?)\b\w{1,3}\b/g seems to work for me:

var tests = ['x y z ','a b xy c pq','x y abcdefgh ','abc d rar','rawr'];

for(var i=0,c=tests.length;i<c;i++)
{
    var str = tests[i];
    console.log(str.match(/(?!.*?\w{4,}.*?)\b\w{1,3}\b/g));
}

/*
Results:

["x", "y", "z"]
["a", "b", "xy", "c", "pq"]
null
["abc", "d", "rar"]
null
*/
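One caveat worth checking: the negative lookahead only inspects text to the right of the current position, so a long word that comes before a short one does not block the later match. The sketch below shows that case and one possible workaround. The helper name `matchShortWords` is made up for illustration, and it assumes that any run of four or more word characters should invalidate the whole string.

```javascript
// The pattern from the answer above.
var pattern = /(?!.*?\w{4,}.*?)\b\w{1,3}\b/g;

// The long word comes first, so by the time the regex reaches "x"
// there is no long word left ahead of it and the lookahead passes.
var leaked = 'abcd x'.match(pattern);
console.log(leaked); // ["x"]

// Hypothetical helper (not from the answer): reject the whole string
// up front if it contains any run of four or more word characters.
function matchShortWords(str) {
    if (/\w{4,}/.test(str)) {
        return null;
    }
    return str.match(/\b\w{1,3}\b/g);
}

console.log(matchShortWords('abcd x'));      // null
console.log(matchShortWords('a b xy c pq')); // ["a", "b", "xy", "c", "pq"]
```

This still does not reject punctuation: 'a-b' yields ["a", "b"], because \b also matches on either side of the hyphen.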

Upvotes: 0

falsetru

Reputation: 369094

Because the pattern ends with \s+, each match includes the trailing whitespace.

Remove \s+:

'x y z '.match(/\w{1,3}/g)
// => ["x", "y", "z"]
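On the parentheses in the question: with the g flag, match() returns only the whole matches and ignores capture groups, which is why adding the group changed nothing. If you do want the captured part, one option (assuming an ES2020 environment, where matchAll() is available) is roughly:

```javascript
// With the g flag, match() returns whole matches and drops capture groups.
var whole = 'x y z '.match(/(\w{1,3})\s+/g);
console.log(whole); // ["x ", "y ", "z "]

// matchAll() yields one match object per hit; index 1 is capture group 1.
var groups = Array.from('x y z '.matchAll(/(\w{1,3})\s+/g), function (m) {
    return m[1];
});
console.log(groups); // ["x", "y", "z"]
```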

Using \b, you can match only at word boundaries, so words longer than three characters are skipped instead of being cut into pieces:

'x y z '.match(/\b\w{1,3}\b/g)
// => ["x", "y", "z"]
'x yyyyyyyyyy z '.match(/\b\w{1,3}\b/g)
// => ["x", "z"]
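Note that \b\w{1,3}\b silently skips invalid words rather than rejecting the string. If the whole string has to be validated as well, a minimal sketch is to test it once against an anchored pattern and only then extract the words. The names `VALID` and `splitValidated` are hypothetical, and "valid" is assumed to mean words of one to three \w characters separated by whitespace:

```javascript
// Hypothetical helper: validate the whole string once, then extract.
// Assumption: optional leading/trailing whitespace is allowed, and every
// word consists of 1 to 3 \w characters (letters, digits, underscore).
var VALID = /^\s*\w{1,3}(?:\s+\w{1,3})*\s*$/;

function splitValidated(str) {
    if (!VALID.test(str)) {
        return null; // some word is too long or contains other characters
    }
    return str.match(/\w{1,3}/g);
}

console.log(splitValidated('a b xy c pq'));     // ["a", "b", "xy", "c", "pq"]
console.log(splitValidated('x yyyyyyyyyy z ')); // null
console.log(splitValidated('a-b'));             // null
```

\w also accepts the underscore; if that counts as a special character, swapping in [A-Za-z0-9] in both patterns should tighten it.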

Upvotes: 2
