Jason

Reputation: 279

Struggling with Regex

I would like to split a text into words, keeping only the letters "a-zA-Z" and "'" (single quote).

I need this:

let str = "-I'm (going crazy with) this*, so I'%ve ^decided ?(to ask /for help. I hope you'll_ help me before I go crazy!"

To be this:

let arr = ["I'm", "going", "crazy", "with", "this", "so", "I've", "decided", "to", "ask", "for", "help", "I", "hope", "you'll", "help", "me", "before", "I", "go", "crazy"]

Currently I have this:

function splitText(text) {
    let words = text.split(/\s|\W/);
    return words;
}

Obviously, this keeps neither "I'm" nor "you'll", for example, which is what I need. I've tried a few combinations with W$, ^W and so on, but with no success.

All I want to keep is letters and "'" wherever there's a contraction.

Help! Thanks!

Upvotes: 1

Views: 46

Answers (1)

Wiktor Stribiżew

Reputation: 626929

You can use

let str = "-I'm (going crazy with) this*, so I'%ve ^decided ?(to ask /for help. I hope you'll_ help me before I go crazy!";
str = str.replace(/[^a-zA-Z0-9\s']+/g, '').split(/\s+/);
console.log(str);
// => [ "I'm", "going", "crazy", "with", "this", "so", "I've", "decided", "to", "ask", "for", "help", "I",
//   "hope", "you'll", "help", "me", "before", "I", "go", "crazy" ]

NOTES:

  • .replace(/[^a-zA-Z0-9\s']+/g, '') - removes all chars other than letters, digits, whitespace and single quotation marks
  • .split(/\s+/) - splits the result on one or more whitespace chars.
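As a rough sketch, the two steps can be wrapped in the splitText function from the question. The .trim() and .filter(Boolean) calls are my own additions, not part of the snippet above; I am assuming you do not want empty strings in the result when the input starts or ends with whitespace, or contains no words at all.

```javascript
// Sketch only: reuses the asker's function name; trim() and filter(Boolean)
// are assumed extras for inputs with edge whitespace or no words.
function splitText(text) {
    return text
        .replace(/[^a-zA-Z0-9\s']+/g, '') // drop everything except letters, digits, whitespace and '
        .trim()                            // avoid empty strings at the edges of the result
        .split(/\s+/)                      // split on runs of whitespace
        .filter(Boolean);                  // an empty or punctuation-only input yields []
}

const words = splitText("  -I'm (going crazy with) this*, so I'%ve ^decided!  ");
console.log(words);
// => [ "I'm", "going", "crazy", "with", "this", "so", "I've", "decided" ]
```

Without .trim(), a string with leading or trailing whitespace would produce empty items at the start or end of the array.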

Also, if you want to only keep ' between word chars, you may use an enhanced version of the first regex:

/[^a-zA-Z0-9\s']+|\B'|'\B/g

See the regex demo with an input containing ' not in the middle of the words.
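To illustrate what the enhanced regex does, here is a small sketch with a made-up sample string (the input is hypothetical, only the regex is taken from above). The \B' alternative removes a ' that has no word char right before it, and '\B removes a ' that has no word char right after it, so only a ' with word chars on both sides survives.

```javascript
// Hypothetical sample input with stray apostrophes; only the regex comes from the answer.
const cleanup = /[^a-zA-Z0-9\s']+|\B'|'\B/g;
const sample = "'quoted' words, rock 'n' roll, and don't";

// Strip unwanted chars and stray apostrophes, then split on whitespace.
const tokens = sample.replace(cleanup, '').split(/\s+/);
console.log(tokens);
// => [ "quoted", "words", "rock", "n", "roll", "and", "don't" ]
```

Note that \B treats the underscore as a word char, so a ' next to a _ counts as being next to a word char.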

Upvotes: 2
