Reputation: 1247
I'm trying to build a JavaScript program that swaps multiple variations of two names with each other.
For example, if I had a string:
let string = "This is Donald Trump and I am Donald J. Trump and I have replaced Barack Obama and Obama was before me."
I would want the output to be:
newString = "This is Barack Obama and I am Barack H. Obama and I have replaced Donald Trump and Trump was before me."
My strategy was to use
let arr = string.split(regex)
in such a way that each chunk of text before and after a regex match is its own index, and each regex match is its own index too. For example:
["This is ", "Donald Trump", " and I am ", "Donald J. Trump", " and I have replaced ", "Barack Obama", " and ", "Obama", " was before me."];
Then check each item of the array to see if it needs to be "switched." For example:
for (let i = 0; i < arr.length; i++) {
// if arr[i] == Donald J. Trump, Donald Trump, or Trump, arr[i] = equivalent Obama variation
// else if arr[i] == Barack H. Obama, Barack Obama, or Obama, arr[i] = equivalent Trump variation
// else arr[i] = arr[i]
}
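// The chunks already contain their surrounding spaces, so join with an empty string
let newString = arr.join("");
// innerHTML is a property, not a method
htmlElement.innerHTML = newString;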
Here's my regex
let regex = /\b(Barack\s)?(H\.\s)?Obama|\b(Donald\s)?(J\.\s)?Trump/;
The regex seems to correctly match all variations of the names.
However, when I write
arr = string.split(regex)
my arr looks like this:
["This is ", undefined, undefined, "Donald ", undefined, " and I am ", undefined, undefined, "Donald ", "J. ", " and I have replaced ", undefined, "Barack ", undefined, undefined, " and ", undefined, undefined, undefined, undefined, " was before me."];
Is there a way to split the string by the multiple variations of the delimiter, but also retain the delimiter in its own array item?
Upvotes: 0
Views: 150
Reputation: 22817
I took a different approach to your problem. Instead of searching for specific names, I created a regex that captures full names (assuming each name begins with a capital letter and either has more than one character or is immediately followed by a dot). I then cross-reference each word of the full name (split on spaces) against a nameEquivalents object to find the proper replacement.
Yes, I am aware that the regex will not catch special cases such as names with two-letter abbreviations, apostrophes, or hyphens, or names that start with a lowercase letter, but the OP didn't specify that need (and frankly I'm not too worried about it, since my regex can capture more than the OP's original regex, which simply hard-codes the names).
Also, note that the getKeyByValue function is taken from an answer to another question.
let string = "This is Donald Trump and I am Donald J. Trump and I have replaced Barack Obama and Obama was before me."
let regex = /(?: ?\b[A-Z](?:[a-zA-Z]+\b|\.))+/g
let nameEquivalents = {
"Obama": "Trump",
"Barack": "Donald",
"H.": "J."
}
function getKeyByValue(object, value) {
return Object.keys(object).find(key => object[key] === value);
}
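let newString = string.replace(regex, function(match) {
  // The regex may consume one leading space; keep it so spacing is preserved
  const lead = match.startsWith(" ") ? " " : ""
  const words = match.trim().split(" ")
  return lead + words.map(function(m) {
    if (nameEquivalents.hasOwnProperty(m)) {
      return nameEquivalents[m]
    }
    // Otherwise try the reverse lookup (value -> key); keep the word if none is found
    return getKeyByValue(nameEquivalents, m) || m
  }).join(" ")
})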
console.log(newString)
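To see what the regex captures before any replacement happens, it can help to list the raw matches. This is only a sketch: the names sample and fullNameRegex are mine, and the regex is the same one used in the snippet above.

```javascript
// Same sentence as in the question
const sample = "This is Donald Trump and I am Donald J. Trump and I have replaced Barack Obama and Obama was before me.";

// Same pattern as above: runs of capitalized words or initials, each with an optional leading space
const fullNameRegex = /(?: ?\b[A-Z](?:[a-zA-Z]+\b|\.))+/g;

// With the g flag, match() returns every full match as an array of strings
const found = sample.match(fullNameRegex);
console.log(found);
```

For this sentence the matches are "This", " Donald Trump", " Donald J. Trump", " Barack Obama" and " Obama". Note that the leading space is part of each match and that "This" is captured as well, while the standalone "I" is not, because a single capital letter must be followed by a dot to count.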
(?: ?\b[A-Z](?:[a-zA-Z]+\b|\.))+ Match the following one or more times
  " ?" Optionally match a space character (note the space before the ?)
  \b Assert position as a word boundary
  [A-Z] Match an uppercase letter
  (?:[a-zA-Z]+\b|\.) Match either of the following
    [a-zA-Z]+\b Match any letter one or more times, ensuring it's followed by a word boundary
    \. Match a literal dot
Upvotes: 1
Reputation: 1980
I think the parentheses in the regex are being interpreted as capture groups. split() adds every capture group to its result, so any group that did not take part in a given match shows up as undefined.
Try making the inner groups non-capturing with (?:...) and wrapping the whole lot in a single capture group.
/\b((?:Barack\s)?(?:H\.\s)?Obama|(?:Donald\s)?(?:J\.\s)?Trump)/
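A minimal sketch of the single-capture idea, assuming the optional name parts are written as non-capturing (?:...) groups so they stay optional without adding entries to the split() result. The swaps map and the variable names are illustrative and not from the question.

```javascript
const input = "This is Donald Trump and I am Donald J. Trump and I have replaced Barack Obama and Obama was before me.";

// One outer capture group keeps each matched name in the split() output;
// the inner groups are non-capturing, so they contribute no undefined entries.
const nameRegex = /\b((?:Barack\s)?(?:H\.\s)?Obama|(?:Donald\s)?(?:J\.\s)?Trump)/;

// Illustrative lookup table: every variation maps to its counterpart
const swaps = new Map([
  ["Donald J. Trump", "Barack H. Obama"],
  ["Donald Trump", "Barack Obama"],
  ["Trump", "Obama"],
  ["Barack H. Obama", "Donald J. Trump"],
  ["Barack Obama", "Donald Trump"],
  ["Obama", "Trump"],
]);

// split() alternates plain text chunks and captured names
const parts = input.split(nameRegex);

// Swap the names, leave every other chunk untouched, and rejoin without a separator
const swapped = parts.map(part => (swaps.has(part) ? swaps.get(part) : part)).join("");

console.log(parts);
console.log(swapped);
```

Because the text chunks keep their own spaces, the pieces are rejoined with an empty string rather than a space.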
Upvotes: 0