user2648694

Reputation: 91

Regex replace non-word except dash

I have a regex pattern, (\W|_)[^-], that doesn't work for h_e.l_l.o - w_o.r_d (the replacement string is " ").

It returns something like this:

h      w   

I hope to see at least something like this:

h e l l o - w o r d

How can I replace all non-word characters and _ excluding the - symbol?

Upvotes: 9

Views: 11140

Answers (1)

Wiktor Stribiżew

Reputation: 627488

To match any non-word char except dash (or hyphen) you may use

[^\w-]

However, this regular expression does not match _.
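A quick way to see that limitation is the sketch below. It assumes Python's re module purely for illustration, since the question does not name a regex flavor; the variable names are made up for the example.

```python
import re

sample = "h_e.l_l.o - w_o.r_d"

# [^\w-] skips word chars and hyphens. Underscore counts as a word char,
# so the underscores are left in place while dots and spaces are replaced.
result = re.sub(r"[^\w-]", " ", sample)
print(result)  # h_e l_l o - w_o r_d
```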

You need a negated character class that matches any char other than letters, digits and hyphens:

/[^-a-zA-Z0-9]+/

or (with a case insensitive modifier):

/[^-a-z0-9]+/i

Note that the - is placed at the character class start, and does not require escaping.

The plus at the end makes the pattern match each run of unwanted characters at a stretch, so a whole run is replaced in one go.
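As a minimal sketch of the ASCII-only approach, assuming Python's re module (the question does not say which language is used) and a hypothetical helper name replace_unwanted:

```python
import re

def replace_unwanted(text: str, replacement: str = " ") -> str:
    """Replace each run of chars other than ASCII letters, digits, and hyphens."""
    # The hyphen comes first in the class, so it needs no escaping.
    return re.sub(r"[^-a-zA-Z0-9]+", replacement, text)

print(replace_unwanted("h_e.l_l.o - w_o.r_d"))  # h e l l o - w o r d
print(replace_unwanted("a__..b"))               # a b
```

Because of the plus, the run "__.." in the second call collapses to a single replacement rather than four.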

If you want your pattern to be Unicode-aware (in some regex flavors, shorthand character classes such as \w also match their Unicode counterparts, depending on the flags used), you may use

/[^\w-]|_/

To grab a whole chunk of these chars at once, use /(?:[^\w-]|_)+/ instead.

Here, [^\w-] matches any char that is neither a word char (letter, digit, or underscore) nor a hyphen, and the second alternative, _, matches underscores.
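The Unicode-aware variant can be sketched the same way. This assumes Python 3, where \w matches Unicode letters and digits in str patterns by default; other flavors may need an explicit flag. The sample text and the name UNWANTED are invented for the example.

```python
import re

# Runs of chars that are neither word chars nor hyphens, plus underscores.
UNWANTED = re.compile(r"(?:[^\w-]|_)+")

text = "żółć_gęś.łódź - über"
cleaned = UNWANTED.sub(" ", text)
print(cleaned)  # żółć gęś łódź - über
```

With the ASCII-only class [^-a-zA-Z0-9], the accented letters in this sample would be treated as unwanted and replaced too.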

Upvotes: 17
