Reputation: 11
I have been looking at this for a couple of days now and still can't get my head around it.
This is the phrase,
const phrase = `"That's the password: 'PASSWORD 123'!", cried the Special Agent.\nSo I fled.`;
and this is the expected transformation,
['that's', 'the', 'password', 'password', '123', 'cried', 'the', 'special', 'agent', 'so', 'i', 'fled']
The first element (that's) of the array is the problem area.
I can only get the following result,
['thats', 'the', 'password', 'password', '123', 'cried', 'the', 'special', 'agent', 'so', 'i', 'fled']
using the code below:
const cleanPhrase = phrase.replace(/["':!,.]/g, '').replace(/[\n]/g, ' ').toLocaleLowerCase()
const words = cleanPhrase.split(' ');
Is there a way to ignore the single quotes around 'PASSWORD 123' but keep the single quote in that's?
Upvotes: 1
Views: 127
Reputation: 1101
First, I think it is better to use String.prototype.match() instead of split.
There are then two simple ways to do it:
const phrase = `"That's the password: 'PASSWORD 123'!", cried the Special Agent.\nSo I fled.`;
console.log(phrase.match(/(?!')[\w']*\w/g));
const phrase = `"That's the password: 'PASSWORD 123'!", cried the Special Agent.\nSo I fled.`;
console.log(phrase.match(/(?!')[\w']+(?<!')/g));
\w = [a-zA-Z0-9_]
[\w'] = a character class: any word character or '
* = zero or more (of the preceding class)
+ = one or more (of the preceding class)
(?!') = negative lookahead: the next character must not be '
(?<!') = negative lookbehind: the previous character must not be '
Note: in the first method, [\w']* can match zero or more characters, so I follow it with a single \w (a word character, without the quote '). That way the match can never end in a quote, I can avoid the negative lookbehind, and one-character words such as I are still matched.
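Neither snippet lowercases the matches, which the expected output in the question requires. A minimal sketch that combines the first pattern with lowercasing follows; the helper name extractWords is hypothetical, and the ?? [] fallback is an assumption added here because match() returns null when nothing matches.

```javascript
// Hypothetical helper: extract words, keeping apostrophes inside a word.
function extractWords(text) {
  // Same pattern as the first method: a match cannot start with a quote
  // and must end in a word character.
  const matches = text.match(/(?!')[\w']*\w/g) ?? [];
  return matches.map((word) => word.toLowerCase());
}

const samplePhrase = `"That's the password: 'PASSWORD 123'!", cried the Special Agent.\nSo I fled.`;
const extracted = extractWords(samplePhrase);
console.log(extracted);
```

An empty or punctuation-only string gives an empty array instead of throwing on null.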
Upvotes: 1
Reputation: 626903
You can use a short solution like
const phrase = `"That's the password: 'PASSWORD 123'!", cried the Special Agent.\nSo I fled.`;
console.log(phrase.match(/\w+(?:'\w+)*/g).map(x=>x.toLowerCase()));
See the regex demo.
The /\w+(?:'\w+)*/g regex matches all occurrences (the g flag stands for global) of one or more word characters followed by zero or more sequences of ' and one or more word characters.
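To see how the pattern treats quotes in different positions, here is a small sketch on a few made-up inputs. The tokenize helper and the sample strings are hypothetical, and the ?? [] fallback is an addition for the case where match() returns null.

```javascript
// The pattern from the answer above.
const wordPattern = /\w+(?:'\w+)*/g;

// Hypothetical helper; `?? []` covers the null that match() returns
// when the text contains no word characters.
const tokenize = (text) =>
  (text.match(wordPattern) ?? []).map((word) => word.toLowerCase());

console.log(tokenize("'PASSWORD 123'"));    // surrounding quotes are dropped
console.log(tokenize("rock'n'roll isn't")); // quotes inside a word are kept
console.log(tokenize("!!!"));               // no words, so an empty array
```

A quote is only ever consumed when word characters sit on both sides of it, which is why no separate clean-up pass is needed.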
Upvotes: 0
Reputation: 10235
First replace all the other symbols with an empty string, then replace every quote preceded by whitespace, every quote followed by whitespace, and every \n with a single space:
const phrase = `"That's the password: 'PASSWORD 123'!", cried the Special Agent.\nSo I fled.`;
const words = phrase.replace(/["!.:,]/g, '')
.replace(/\s\'|\'\s|\n/g, ' ')
.toLocaleLowerCase().split(' ');
console.log(words);
You could also use split instead of the second replace:
const phrase = `"That's the password: 'PASSWORD 123'!", cried the Special Agent.\nSo I fled.`;
const words = phrase.toLocaleLowerCase()
.replace(/["!.:,]/g, '')
.split(/\s\'|\'\s|\n|\s/g);
console.log(words);
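One caveat with splitting on single whitespace characters: input with consecutive spaces yields empty strings. The question's phrase has none, so the sketch below uses a made-up input (the variable names are hypothetical) and drops the empties with filter(Boolean).

```javascript
// Hypothetical input with a doubled space.
const spaced = "so  i fled";

// Same alternatives as the split above; each whitespace character is its
// own separator, so the doubled space leaves an empty string behind.
const rawParts = spaced.split(/\s'|'\s|\n|\s/);

// filter(Boolean) removes the empty strings.
const nonEmptyParts = rawParts.filter(Boolean);

console.log(rawParts);
console.log(nonEmptyParts);
```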
Upvotes: 0
Reputation: 33933
I would first replace all quotes that are surrounded by letters with a "placeholder", that is, a character that should not appear in the string. I used a pipe (|) in the example below.
const phrase = `"That's the password: 'PASSWORD 123'!", cried the Special Agent.\nSo I fled.`;
const cleanPhrase = phrase
  // Replace quotes between two word characters with a placeholder
  // (the g flag makes this apply to every occurrence, not only the first)
  .replace(/(\w)'(\w)/g, "$1|$2")
  .replace(/["':!,.]/g, "")
  .replace(/[\n]/g, " ")
  // Restore the quotes where there is a placeholder
  .replace(/(\w)\|(\w)/g, "$1'$2")
  .toLocaleLowerCase();
const words = cleanPhrase.split(" ");
console.log(words);
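If the JavaScript engine supports lookbehind, the placeholder step can be skipped. The sketch below is a different technique from the placeholder swap, not the method of this answer: it removes a quote only when it lacks a word character on at least one side. The variable names are hypothetical.

```javascript
const inputPhrase = `"That's the password: 'PASSWORD 123'!", cried the Special Agent.\nSo I fled.`;

const cleaned = inputPhrase
  // Remove a quote unless it has a word character on both sides.
  .replace(/(?<!\w)'|'(?!\w)/g, "")
  // Remove the remaining punctuation.
  .replace(/[":!,.]/g, "")
  // Turn line breaks into spaces.
  .replace(/\n/g, " ")
  .toLowerCase();

const wordList = cleaned.split(" ");
console.log(wordList);
```

Because no placeholder is involved, this also works when the input already contains a pipe character.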
Upvotes: 0