Manuel Di Iorio

Reputation: 3761

Split on spaces with a regex only if they are not contained in a quoted substring

I need to split a string on spaces (or \r\n\t), but the string may contain quoted substrings that must stay in one piece.

Example:

text 'contained in' a string

I have tried a regex like:

/[?<!\"\'](\ )*[?!\"\']/g;

string.split(regex) should return:

["text", "'contained in'", "a", "string"]

But it's wrong... I have been struggling with this for a long time :@

For now, I have written a split function that splits on a separator character only when it is outside a substring, but I'm looking for a simpler solution with a regex, if possible :P

Upvotes: 1

Views: 170

Answers (4)

Bryan Elliott

Reputation: 4095

You could do this:

(?:'(.*)'|(\b[\w]+\b))

Working regex example:

http://regex101.com/r/oJ2nQ9

Or, even better, rather than relying on word boundaries (your string may contain special characters), you could use this:

(?:'(.*?)'|(?:[\s]*|^)([^\s]+)(?:[\s]*|$))

Sample string:

text 'contained in' a string-with special's chars.

Matches:

"text", "contained in", "a", "string-with", "special's", "chars."

Working regex example:

http://regex101.com/r/iP3iJ1
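As a minimal sketch of how the second pattern might be driven from JavaScript: the token ends up in capture group 1 when it was quoted and in group 2 otherwise, so a loop over exec can pick whichever group matched. The helper name tokenize is hypothetical, and this has only been checked against the sample string above.

```javascript
// Hypothetical helper (not part of the answer): collects tokens with the
// second pattern by reading whichever capture group matched.
function tokenize(str) {
  var re = /(?:'(.*?)'|(?:[\s]*|^)([^\s]+)(?:[\s]*|$))/g;
  var tokens = [];
  var m;
  while ((m = re.exec(str)) !== null) {
    // Group 1: text between single quotes (quotes stripped).
    // Group 2: a bare run of non-whitespace characters.
    tokens.push(m[1] !== undefined ? m[1] : m[2]);
  }
  return tokens;
}

console.log(tokenize("text 'contained in' a string-with special's chars."));
// ["text", "contained in", "a", "string-with", "special's", "chars."]
```

Every match consumes at least one character, so the exec loop cannot stall on an empty match.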

Upvotes: 1

NaCl

Reputation: 2723

Try /([\'\"][^\"\']+[\'\"])|([^\s]+)/g, simple but works fine.

http://regex101.com/r/hR3bQ8/

You can extract only the quoted substrings by using /([\'\"][^\"\']+[\'\"])/g.
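A rough sketch of how the first pattern could be applied: instead of splitting on separators, it matches the tokens themselves with String.prototype.match. The function name matchTokens is an assumption made for this illustration.

```javascript
// Hypothetical wrapper: match tokens rather than split on separators.
function matchTokens(str) {
  // Either a quoted run (quotes kept) or a run of non-whitespace characters.
  // match() returns null when nothing matches, so fall back to [].
  return str.match(/([\'\"][^\"\']+[\'\"])|([^\s]+)/g) || [];
}

console.log(matchTokens("text 'contained in' a string"));
// ["text", "'contained in'", "a", "string"]
```

Because [^\s]+ skips any whitespace, tabs and line breaks between tokens are handled the same way as spaces.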

Upvotes: 2

anubhava

Reputation: 785098

You can use this in Javascript:

var s="text 'contained in' a string";
s.split(/ +(?=(?:(?:[^']*'){2})*[^']*$)/g);
//=> ["text", "'contained in'", "a", "string"]

The regex uses a lookahead to make sure that an even number of quotes follows the space being split on, which means the space is not inside a quoted substring.
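Since the question also mentions \r\n\t, here is a small sketch of the same lookahead idea with \s+ in place of the literal space. The function name splitOutsideQuotes is hypothetical, and the approach assumes the single quotes in the input are balanced.

```javascript
// Hypothetical variant: split on any whitespace run, but only where the
// rest of the string contains an even number of single quotes.
function splitOutsideQuotes(str) {
  return str.split(/\s+(?=(?:(?:[^']*'){2})*[^']*$)/);
}

console.log(splitOutsideQuotes("text 'contained in'\ta\r\nstring"));
// ["text", "'contained in'", "a", "string"]
```

The space inside 'contained in' is followed by exactly one quote, so the lookahead fails there and no split happens.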

Upvotes: 1

p.s.w.g

Reputation: 149020

It looks like you were trying to use lookarounds like this:

/(?<!\"\')(\ )*(?!\"\')/

However, JavaScript did not support lookbehinds ((?<=...) or (?<!...)) before ES2018, so on older engines you'll need a different strategy. Any capturing groups within the pattern you're splitting by will be returned in the result array, so splitting like this will get you close to the result you want:

var input = "text 'contained in' a string";
var output = input.split(/('[^']*')|\s/);
console.log(output); // ["text", undefined, "", "'contained in'", "", undefined, "a", undefined, "string"]

Now the only problem is what to do about those undefined values and empty strings. You can use the filter method from ES5, like this:

var input = "text 'contained in' a string";
var output = input.split(/('[^']*')|\s/).filter(function(s) { return s && s.length; });
console.log(output); // ["text", "'contained in'", "a", "string"]
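The same split-and-filter idea can be sketched for both quote styles, since the regex in the question mentions \" as well as \'. The function name splitKeepingQuoted is hypothetical, and filter(Boolean) is used here as a shorthand that drops undefined and empty strings alike.

```javascript
// Hypothetical helper: keep single- or double-quoted runs via the capture
// group, split on whitespace runs, then drop the leftover falsy entries.
function splitKeepingQuoted(input) {
  return input
    .split(/('[^']*'|"[^"]*")|\s+/)
    .filter(Boolean);
}

console.log(splitKeepingQuoted("say \"hello world\" 'a b' end"));
// ["say", "\"hello world\"", "'a b'", "end"]
```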

Upvotes: 1
