Mark

Reputation: 31

Regex - everything except string

I would like to replace every character that is not part of a string (a word) with a given character such as | or ;. I have a simple regex pattern: ([a-zA-Z])\w+

...and the problem is to replace everything except matches of that pattern.

Example: qwerty 123456 ;,.'[]?/ asd

Result: qwerty|||||||||||||||||asd

Thanks in advance.

Upvotes: 0

Views: 181

Answers (3)

ScintillatingSpider

Reputation: 348

  • You can negate the characters in your character set using ^.
  • \w needs to be omitted, because it matches any word character, which includes digits and the underscore as well as letters
  • The parentheses are not necessary, since there is no need to group a single character set and you are not using a backreference to that capturing group

This results in the following regex (drop the trailing + if each character should be replaced individually, as in the expected result):

[^a-zA-Z]+
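To see what the + quantifier changes in practice, here is a minimal Python sketch. Python is used only for illustration, since the question does not name a language, and the variable names are made up. It applies the negated set to the sample input with and without the quantifier.

```python
import re

text = "qwerty 123456 ;,.'[]?/ asd"

# With +, each run of non-letters is a single match,
# so the whole run collapses into one "|".
collapsed = re.sub(r"[^a-zA-Z]+", "|", text)

# Without +, every non-letter is its own match and gets its own "|".
one_by_one = re.sub(r"[^a-zA-Z]", "|", text)

print(collapsed)   # qwerty|asd
print(one_by_one)  # "qwerty", then 17 "|" characters, then "asd"
```

The second form reproduces the result shown in the question, because the 17 characters between qwerty and asd are replaced one at a time.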

Upvotes: 0

Shakiba Moshiri

Reputation: 23794

For this input:
qwerty 123456 ;,.'[]?/ asd
You want to match every non-word character plus the digits, so you can use [\W\d]

But since you want to replace them one by one rather than all at once, you do not need the quantifier +

You can also use the built-in POSIX character classes if your engine or language supports them. For example:

[:alnum:]       all letters and digits
[:alpha:]       all letters
[:blank:]       all horizontal whitespace
[:cntrl:]       all control characters
[:digit:]       all digits
[:graph:]       all printable characters, not including space
[:lower:]       all lower case letters
[:print:]       all printable characters, including space
[:punct:]       all punctuation characters
[:space:]       all horizontal or vertical whitespace
[:upper:]       all upper case letters
[:xdigit:]      all hexadecimal digits

Here is a quick test with Perl:

echo "qwerty 123456 ;,.'[]?/ asd" | perl -lpe 's/[[:cntrl:][:punct:]\d ]/|/g'  

or:

echo "qwerty 123456 ;,.'[]?/ asd" | perl -lpe 's/[\W\d]/|/g'  

Both produce the same output:

qwerty|||||||||||||||||asd
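The same [\W\d] replacement can be sketched in Python for readers without Perl at hand. This is an illustration under assumptions: Python's re module is my choice, it does not support the POSIX bracket classes listed above, and the second sample string is hypothetical. It is there to show that \w includes the underscore, so [\W\d] leaves underscores alone while a negated letter set does not.

```python
import re

text = "qwerty 123456 ;,.'[]?/ asd"

# [\W\d] matches any non-word character or any digit, one character per match.
result = re.sub(r"[\W\d]", "|", text)
print(result == "qwerty" + "|" * 17 + "asd")  # True

# Caveat: "_" counts as a word character, so [\W\d] does not touch it,
# whereas the negated set [^a-zA-Z] replaces it.
sample = "foo_bar 42"  # hypothetical input, not from the question
print(re.sub(r"[\W\d]", "|", sample))     # foo_bar|||
print(re.sub(r"[^a-zA-Z]", "|", sample))  # foo|bar|||
```

Which of the two patterns fits depends on whether underscores should survive the replacement.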

NOTE:

For more detail, see: Regular Expression Reference: Shorthand Character Classes

Upvotes: 0

Bernhard

Reputation: 1870

You can filter/match/replace in two ways.

First variant:

[a-z0-9] // filter/match/replace everything that is included in the defined character set

Second variant:

[^a-z0-9] // filter/match/replace everything that is NOT included in the defined character set

As you can see, the only difference is the ^, which acts as the negation operator when it is the first character inside a character set.
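A short Python sketch of the two variants, applied to the sample input from the question. Python and the variable names are assumptions made for illustration. The only change between the two calls is the leading ^.

```python
import re

text = "qwerty 123456 ;,.'[]?/ asd"

# First variant: replace every character that IS in the set.
in_set = re.sub(r"[a-z0-9]", "|", text)

# Second variant: replace every character that is NOT in the set.
not_in_set = re.sub(r"[^a-z0-9]", "|", text)

print(in_set)      # |||||| |||||| ;,.'[]?/ |||
print(not_in_set)  # letters and digits survive, everything else becomes "|"
```

Because this set contains 0-9, the digits 123456 are kept by the second variant. To reproduce the result from the question, the set would need to contain letters only.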

Upvotes: 1
