JasonBartholme

Reputation: 142

How do I match and replace a non-word character between word characters with RegEx?

I am working with a data set that needs to be scrubbed. I am looking to replace the question marks (?) with the em-dash code (—). Here is an example string:

"...shut it down?after taking a couple of..."

I can match that instance with this expression: \w\?\w. However, it also matches one character on either side of the question mark, so the replacement looks like this:

"...shut it dow—after taking a couple of..."

How can I match just the pattern while only replacing the question mark?

Thanks in advance, Jason

Upvotes: 0

Views: 493

Answers (4)

Paul Biggar

Reputation: 28739

Use: /\b\?\b/

\b matches word boundaries, which seems to be what you're after.
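As a rough illustration in JavaScript (the asker hasn't said which language is in use, so this is only a sketch, and the function name is made up for the example): a question mark is a non-word character, so a \b on each side of it only succeeds when the neighbouring characters are word characters. The boundaries are zero-width, which means only the question mark itself gets replaced.

```javascript
// Hypothetical helper name; the pattern is the one suggested in this answer.
function replaceWithBoundary(text) {
  // The first \b needs a word character to the left of the question mark,
  // the second \b needs a word character to the right of it.
  // The g flag replaces every occurrence, not just the first one.
  return text.replace(/\b\?\b/g, "&mdash;");
}

console.log(replaceWithBoundary("...shut it down?after taking a couple of..."));
// ...shut it down&mdash;after taking a couple of...

console.log(replaceWithBoundary("Really? Yes."));
// Really? Yes.   (unchanged: a space follows the question mark)
```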

Upvotes: 2

Daniel Vandersluis

Reputation: 94153

If the language you are using supports lookarounds, you could use them to make sure your question mark is surrounded by word characters, but not actually capture them:

/(?<=\w)\?(?=\w)/

The (?<=\w) is a lookbehind (the engine looks "behind", i.e. before, a potential match) and the (?=\w) is a lookahead (the engine looks ahead). Lookarounds are zero-width: they check the surrounding characters without making them part of the match. In your case only the question mark is matched, so only the question mark is replaced.

In PHP, for example, you could thus do:

$string = "...shut it down?after taking a couple of...";
$result = preg_replace('/(?<=\w)\?(?=\w)/', "&mdash;", $string);
// $result is now: ...shut it down&mdash;after taking a couple of...

Lookarounds are supported by Perl and by PCRE-based (Perl-compatible) regular expression engines. Ruby 1.8 doesn't support lookbehinds, although Ruby 1.9 and later do.
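For comparison, here is a minimal JavaScript sketch of the same pattern. It assumes an engine with lookbehind support (ES2018 or later, for example a recent Node.js); the function name is just for illustration. Because the lookarounds consume nothing, two question marks that share a neighbouring letter are both replaced.

```javascript
// Hypothetical helper name; assumes the engine supports lookbehind (ES2018+).
function replaceWithLookarounds(text) {
  // (?<=\w) asserts a word character before the question mark,
  // (?=\w) asserts a word character after it; neither is consumed.
  return text.replace(/(?<=\w)\?(?=\w)/g, "&mdash;");
}

console.log(replaceWithLookarounds("...shut it down?after taking a couple of..."));
// ...shut it down&mdash;after taking a couple of...

console.log(replaceWithLookarounds("a?b?c"));
// a&mdash;b&mdash;c
```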

Upvotes: 3

RaYell

Reputation: 70414

It's hard to answer without knowing which technology you are using. If you are writing JavaScript, this will do it:

inputStr = inputStr.replace(/(\w)\?(\w)/g, '$1&mdash;$2');
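One caveat with capturing both neighbours, sketched below with made-up function names: the second captured character is consumed by the match, so it can't also serve as the left-hand neighbour of the next question mark. A possible workaround that still avoids lookbehind is to capture only the left character and check the right one with a lookahead.

```javascript
// Hypothetical helper names, for illustration only.

// Captures the word character on each side and puts both back.
function replaceWithCaptures(text) {
  return text.replace(/(\w)\?(\w)/g, "$1&mdash;$2");
}

// Captures only the left character; the right one is checked but not consumed.
function replaceWithLookahead(text) {
  return text.replace(/(\w)\?(?=\w)/g, "$1&mdash;");
}

console.log(replaceWithCaptures("...shut it down?after taking a couple of..."));
// ...shut it down&mdash;after taking a couple of...

console.log(replaceWithCaptures("a?b?c"));
// a&mdash;b?c   (the "b" was consumed by the first match)

console.log(replaceWithLookahead("a?b?c"));
// a&mdash;b&mdash;c
```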

Upvotes: 2

Sean Bright

Reputation: 120644

If it is PHP (I'm basing that on other questions you have asked), this should do it:

$str = preg_replace('/(\w)\?(\w)/', '\\1&mdash;\\2', $str);

Upvotes: 3
