Tys

Reputation: 3610

JavaScript regex to match the last word regardless of special characters

We use this script to alter the last word in a sentence.

$div = $('.cPageHeader h2');
$div.html($div.text().replace(/(\w+?)$/, '<span class="cOrange">$1</span>'));

This works well as long as no special characters are involved.

As soon as we have a header like <h2>International fancy stüff</h2> the highlighting goes wrong: only ff is highlighted. The highlighting also fails if the line ends with one of the characters ! - . ?
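
The failure can be reproduced without jQuery. The sketch below is only an illustration: highlightLastWordAscii is a hypothetical helper name, and it applies the same replace call to plain strings. Without the u flag, \w in JavaScript covers only ASCII letters, digits and the underscore, so neither ü nor ! counts as a word character.

```javascript
// Hypothetical helper (not part of the original script): the same
// replace call as in the question, applied to a plain string.
function highlightLastWordAscii(text) {
  return text.replace(/(\w+?)$/, '<span class="cOrange">$1</span>');
}

// Plain ASCII last word: the whole word is wrapped.
console.log(highlightLastWordAscii('International fancy stuff'));

// "ü" is not matched by \w, so only the trailing "ff" is wrapped.
console.log(highlightLastWordAscii('International fancy stüff'));

// "!" is not matched by \w either, so nothing matches before the end
// of the string and the text comes back unchanged.
console.log(highlightLastWordAscii('International fancy stuff!'));
```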

Can someone alter the script so that the last whole word is highlighted, including any attached punctuation and regardless of accented characters?

Upvotes: 0

Views: 1995

Answers (2)

Alan Moore

Reputation: 75222

This should be all you need:

$div.text().replace(/(\S+)$/, '<span class="cOrange">$1</span>')

You want to include the trailing punctuation in the match anyway, so \w+ was never the right tool for the job. And this way you don't have to worry about making it treat non-ASCII characters like ü as word characters.

Just FYI, there's no point using a reluctant quantifier like \S+?, since you're matching all the way to the end of the string. It's not incorrect in this case, just pointless.
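
As a quick check of both points, here is a minimal sketch on plain strings. The helper names highlightGreedy and highlightReluctant are made up for the comparison; they differ only in the quantifier.

```javascript
var TEMPLATE = '<span class="cOrange">$1</span>';

// Greedy version, as suggested above.
function highlightGreedy(text) {
  return text.replace(/(\S+)$/, TEMPLATE);
}

// Reluctant version, for comparison only.
function highlightReluctant(text) {
  return text.replace(/(\S+?)$/, TEMPLATE);
}

var samples = [
  'International fancy stüff',
  'International fancy stuff!',
  'Really?'
];

samples.forEach(function (sample) {
  // Because the match is anchored at the end of the string, both
  // quantifiers end up selecting the same run of non-white-space.
  console.log(highlightGreedy(sample));
  console.log(highlightGreedy(sample) === highlightReluctant(sample));
});
```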

Upvotes: 1

David Thomas

Reputation: 253318

I'd suggest:

$div = $('.cPageHeader h2');
$div.html($div.text().replace(/(\S+?)$/, '<span class="cOrange">$1</span>'));

This basically looks for the run of non white-space characters at the end of your string. If your string ends with white-space there'll be no highlight, so it might be worth trimming the string first, just to be sure.
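
A minimal sketch of the trimming idea, assuming it is acceptable to drop leading and trailing white-space from the heading text. highlightTrimmed is a hypothetical name, and String.prototype.trim is the standard ES5 method.

```javascript
// Hypothetical helper: trim first, then wrap the last run of
// non-white-space characters.
function highlightTrimmed(text) {
  return text.trim().replace(/(\S+?)$/, '<span class="cOrange">$1</span>');
}

// Without trimming, trailing white-space prevents any match:
console.log('fancy stüff  '.replace(/(\S+?)$/, '<span class="cOrange">$1</span>'));

// With trimming, the last word is wrapped:
console.log(highlightTrimmed('fancy stüff  '));
```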

The following replicates the above, but is a little more tolerant of trailing white-space:

var $div = $('#demo');
$div.html($div.text().replace(/\b(\S+?)(\b|(?:\s+))$/, '<span class="cOrange">$1</span>'));

This matches:

  • \b: a word-boundary;
  • (\S+?): a captured sequence of one, or more, non white-space characters (matched lazily);
  • (\b|(?:\s+)): another word-boundary or a sequence of one, or more, white-space characters;
  • $: the end of the string.
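
One caveat is worth checking on plain strings. Because \b is defined in terms of ASCII \w, the pattern above finds no match when the string ends directly in punctuation such as "!". The sketch below shows that behaviour, then an alternative that is my own assumption rather than part of the answer: capture the trailing white-space separately and put it back after the span.

```javascript
var TEMPLATE = '<span class="cOrange">$1</span>';

// The pattern described in the list above.
var boundaryPattern = /\b(\S+?)(\b|(?:\s+))$/;

// Trailing white-space is tolerated (and consumed by the match):
console.log('fancy stüff  '.replace(boundaryPattern, TEMPLATE));

// A string ending in "!" has no word-boundary at its end, so nothing
// matches and the text comes back unchanged:
console.log('fancy stuff!'.replace(boundaryPattern, TEMPLATE));

// Assumed alternative: no \b at all, trailing white-space kept as $2.
function highlightKeepingSpace(text) {
  return text.replace(/(\S+)(\s*)$/, TEMPLATE + '$2');
}

console.log(highlightKeepingSpace('fancy stuff!'));
console.log(highlightKeepingSpace('fancy stüff  '));
```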

Updated once more, to use a function instead of the $1 replacement pattern. I had recalled that numbered matches were deprecated, but that applies to the static RegExp.$1 to RegExp.$9 properties, not to $1 inside a replacement string, so the string form above remains valid. The function form looks like this:

var $div = $('#demo');
$div.html($div.text().replace(/\b(\S+?)(\b|(?:\s+))$/, function(a){
    return '<span class="cOrange">' + a + '</span>';
}));
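
When a function is passed to replace, its first argument is the whole match and the captured groups follow as further arguments. A short sketch of the difference on a plain string, with hypothetical helper names; with the pattern above, the whole match can include trailing white-space, while the first group does not.

```javascript
var pattern = /\b(\S+?)(\b|(?:\s+))$/;

// Wraps the whole match (first callback argument).
function highlightWholeMatch(text) {
  return text.replace(pattern, function (match) {
    return '<span class="cOrange">' + match + '</span>';
  });
}

// Wraps only the first captured group (second callback argument).
function highlightFirstGroup(text) {
  return text.replace(pattern, function (match, word) {
    return '<span class="cOrange">' + word + '</span>';
  });
}

// The trailing white-space ends up inside the span here...
console.log(JSON.stringify(highlightWholeMatch('fancy stüff  ')));

// ...but not here, which mirrors what the $1 string form produces.
console.log(JSON.stringify(highlightFirstGroup('fancy stüff  ')));
```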

Upvotes: 2
