Mihir Garg

Reputation: 65

How to use Regex To Filter Out Anything That Does Not Have A Comma?

I work with Discord bots, and I want to write a regular expression that removes any word that does not have a comma after it. This is my current code:

const { content } = message;
var argsc = content.split(/[,]+/);
argsc.shift();
console.log(argsc); //returns [ 'hello', 'sky', 'hi hello there' ]

The original message is +template hi,hello,sky,hi hello there, and I have figured out how to remove the first word. Now I want hello there to be filtered out, so that the result is ['hi', 'hello', 'sky', 'hi']. I know it's complicated, but I have tried everything and I just can't filter out hello there. Thanks!

Upvotes: 3

Views: 367

Answers (2)

Ryszard Czech

Reputation: 18621

Use

const s = "hi,hello,sky,hi hello there";
console.log(s.split(/(?:^|,)([^\s,]+)(?:\s+[^\s,]+)*/).filter(Boolean));

See regex proof.

Expression explanation

--------------------------------------------------------------------------------
  (?:                      group, but do not capture:
--------------------------------------------------------------------------------
    ^                        the beginning of the string
--------------------------------------------------------------------------------
   |                        OR
--------------------------------------------------------------------------------
    ,                        ','
--------------------------------------------------------------------------------
  )                        end of grouping
--------------------------------------------------------------------------------
  (                        group and capture to \1:
--------------------------------------------------------------------------------
    [^\s,]+                  any character except: whitespace (\n,
                             \r, \t, \f, and " "), ',' (1 or more
                             times (matching the most amount
                             possible))
--------------------------------------------------------------------------------
  )                        end of \1
--------------------------------------------------------------------------------
  (?:                      group, but do not capture (0 or more times
                           (matching the most amount possible)):
--------------------------------------------------------------------------------
    \s+                      whitespace (\n, \r, \t, \f, and " ") (1
                             or more times (matching the most amount
                             possible))
--------------------------------------------------------------------------------
    [^\s,]+                  any character except: whitespace (\n,
                             \r, \t, \f, and " "), ',' (1 or more
                             times (matching the most amount
                             possible))
--------------------------------------------------------------------------------
  )*                       end of grouping
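As a rough sketch of how the pattern above might be applied to the full message from the question: the helper name firstWords is hypothetical, and stripping the command by cutting at the first space is an assumption about how the +template prefix is handled, not something the answer specifies.

```javascript
// Hypothetical helper: keeps the first word of every comma-separated item.
// Because the pattern has a capture group, split() includes each captured
// word in its result; filter(Boolean) then drops the empty strings between them.
function firstWords(list) {
  return list.split(/(?:^|,)([^\s,]+)(?:\s+[^\s,]+)*/).filter(Boolean);
}

// Assumption: the command name is everything up to the first space.
const content = "+template hi,hello,sky,hi hello there";
const args = content.slice(content.indexOf(" ") + 1);

const result = firstWords(args);
console.log(result); // [ 'hi', 'hello', 'sky', 'hi' ]
```

Cutting off the command before splitting keeps the regex itself unchanged, so the two concerns stay separate.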

Upvotes: 1

Tim Biegeleisen

Reputation: 521914

You may try using a regex replacement to do the cleanup, followed by a simple split:

var input = "hi,hello,sky,hi hello there";
input = input.replace(/(\S+)(?: [^\s,]+)*(?=,|$)/g, "$1");
var parts = input.split(",");
console.log(parts);

Here is an explanation of the regex pattern:

(\S+)          match a run of non-whitespace characters (a "word", which
               may itself contain commas) AND capture it in $1
(?: [^\s,]+)*  followed by zero or more space-separated "words," making
               sure that we don't hit a comma
(?=,|$)        assert that a comma or the end of the input comes next

Then, we replace with just the first captured word.
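To make the two steps easier to follow, here is a small sketch that wraps them in a function and logs the intermediate string. The function name keepFirstWords and the second test input are illustrative assumptions, not part of the answer.

```javascript
// Hypothetical wrapper around the replace-then-split approach above.
function keepFirstWords(input) {
  // Step 1: collapse each run of space-separated words down to its first word.
  const cleaned = input.replace(/(\S+)(?: [^\s,]+)*(?=,|$)/g, "$1");
  console.log(cleaned);
  // Step 2: a plain split on commas is now enough.
  return cleaned.split(",");
}

const fromQuestion = keepFirstWords("hi,hello,sky,hi hello there");
console.log(fromQuestion); // [ 'hi', 'hello', 'sky', 'hi' ]

// Illustrative input where the first item also has extra words.
const extraWords = keepFirstWords("hi there,hello,sky,hi hello there");
console.log(extraWords); // [ 'hi', 'hello', 'sky', 'hi' ]
```

Both calls log hi,hello,sky,hi as the intermediate string, since only the trailing words of each item are removed and the commas are left in place.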

Upvotes: 2
