Reputation: 53
How do I make a regex, for use in JavaScript, that takes a string of names and title cases every name in it except for the following patterns, which should be left alone: [\-\ ][A-Z][a-z]{1,2}[A-Z] and [\-\ ][v][ao][n]?
That is, ignore McD, MacD, -McD, -MacD, von and van. I want to "fix" names typed in jumbled case, like LaToNYA von fRANKENSTEIN McDONALD-MacINTOSH, so that they become LaTonya von Frankenstein McDonald-MacIntosh.
I use the following for "title casing" (capitalizing the first letter of each name and lower casing the rest of the name):
name = name.replace(/\b\w+/g, function(txt){return txt.charAt(0).toUpperCase() + txt.substr(1).toLowerCase();});
This, when applied to the name above, would result in Latonya Von Frankenstein Mcdonald-Macintosh, which is not desirable, especially if the person entering their name typed LaTonya, von, McDonald and MacIntosh and it is changed against their wishes. How can I adjust my replace to leave the patterns given as regex above alone (if the user types latonya, MACDONALD, or VON, then I have no problem changing to Latonya, Macdonald, or Von)?
Upvotes: 1
Views: 55
Reputation: 626952
You may use
var name = "LaToNYA von fRANKENSTEIN McDONALD-MacINTOSH";
var expected = "LaTonya von Frankenstein McDonald-MacIntosh";
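name = name.replace(/\b(v[ao]n|[A-Z][a-z]{1,2}[A-Z])?(\w*)/g, function($0, $1, $2) {
    // If a protected prefix (von/van, McD, MacD, LaT, ...) matched, keep it and lowercase the rest
    return $1 ? $1 + $2.toLowerCase() :
        // Otherwise title-case the whole word
        $0.charAt(0).toUpperCase() +
        ($0.length > 1 ? $0.substr(1).toLowerCase() : "");
});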
console.log(name, " => " , (expected === name ? "identical" : "different"));
Details
\b - a word boundary
(v[ao]n|[A-Z][a-z]{1,2}[A-Z])? - Group 1 capturing one or zero occurrences of
  v[ao]n - von or van
  | - or
  [A-Z][a-z]{1,2}[A-Z] - an uppercase ASCII letter, 1 or 2 lowercase ones, and an uppercase ASCII letter again
(\w*) - Group 2 capturing zero or more word chars
The $0, $1, $2 stand for the whole match, Group 1 and Group 2 values.
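One edge case worth noting: because v[ao]n in Group 1 is not anchored at its end, a lowercase name that merely starts with those letters (for example vanessa) is treated as a protected prefix and left uncapitalized. A possible variant, sketched below, adds a \b after v[ao]n so that only the standalone words von and van are protected. The function name fixNameCase and the sample inputs are made up for illustration and are not part of the original answer.

```javascript
// Hypothetical helper (name chosen for this sketch only).
function fixNameCase(name) {
  // Group 1: "von"/"van" as a WHOLE word (note the extra \b),
  //          or a mixed-case prefix such as "McD", "MacD", "LaT".
  // Group 2: the remaining word characters, if any.
  var pattern = /\b(v[ao]n\b|[A-Z][a-z]{1,2}[A-Z])?(\w*)/g;
  return name.replace(pattern, function (match, keep, rest) {
    if (keep) {
      // Protected prefix: keep it as typed, lowercase what follows.
      return keep + rest.toLowerCase();
    }
    // Ordinary word (or an empty match): plain title casing.
    return match.charAt(0).toUpperCase() + match.slice(1).toLowerCase();
  });
}

console.log(fixNameCase("LaToNYA von fRANKENSTEIN McDONALD-MacINTOSH"));
// LaTonya von Frankenstein McDonald-MacIntosh
console.log(fixNameCase("vanessa vonnegut"));
// Vanessa Vonnegut
```

Both patterns still rely on \w and [A-Z], so they only handle ASCII letters; names with accented characters would need a different character class.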
Upvotes: 1