migcat

Reputation: 29

How to find duplicate words in a string?

Hello, I would like to save the repeated words of a paragraph in an array so that I can later check them against a list of rude words. I have a function that converts the paragraph into an array, but I don't know how to save the repeated words in an array.

function awa(parrafo, palabrasNoPermitidas) {
    let array1 = parrafo.split(" ");

    console.log(array1);
}

awa("hello I'm a bunny, if you are a bunny u are bunny too", ["ball"]);

Upvotes: 0

Views: 530

Answers (2)

Kinglish

Reputation: 23654

This will take that split() array and filter it to only contain duplicates.

parrafo.split(" ").filter((e,i,a) => e.trim() && a.indexOf(e) !== i);

The filter callback first checks that the item is not empty (e.trim()), then compares the item's index with the index of the first occurrence of the same value (a.indexOf(e)). If the two indexes differ, the item is a duplicate, so it is kept in the result.

function awa(parrafo, palabrasNoPermitidas) {
    return parrafo.split(" ").filter((e,i,a) => e.trim() && a.indexOf(e) !== i);
}

let repeated = awa("hello    I'm a    bunny, if you are a bunny u are bunny too", ["ball"]);
console.log(repeated)
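
Note that a word occurring three times shows up twice in the filtered array. If each repeated word should be listed only once, one possible approach is to pass the result through a Set. This is a minimal sketch; the name uniqueRepeats is hypothetical and not part of the answer above:

```javascript
// Hypothetical helper: returns each repeated word once,
// in the order in which the repetitions are first seen.
function uniqueRepeats(parrafo) {
    const repeats = parrafo
        .split(" ")
        .filter((e, i, a) => e.trim() && a.indexOf(e) !== i);
    // A Set keeps insertion order and drops repeated entries.
    return [...new Set(repeats)];
}

console.log(uniqueRepeats("a b a b a c"));
//=> ["a", "b"]
```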

Upvotes: 2

customcommander

Reputation: 18901

Let's first have a look at a few scenarios:

This is the best case scenario:

"bunny bunny".split(" ")
//=> ["bunny", "bunny"]

However, if there are internal, leading and/or trailing spaces, the result contains empty strings, which must not be counted as duplicates:

"    bunny     bunny    ".split(" ")
//=> ["", "", "", "", "bunny", "", "", "", "", "bunny", "", "", "", ""]
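
One way to avoid those empty strings, shown here only as a sketch, is to trim the string first and split on runs of whitespace instead of on a single space:

```javascript
// trim() removes leading/trailing whitespace;
// /\s+/ treats any run of whitespace as one separator.
const words = "    bunny     bunny    ".trim().split(/\s+/);
console.log(words);
//=> ["bunny", "bunny"]
```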

Most text also contains punctuation, which complicates things. As you can see, no duplicates are detected here because the punctuation stays attached to the words:

"bunny, bunny and bunny!".split(" ")
//=> ["bunny,", "bunny", "and", "bunny!"]

So you should probably split by "word boundary":

"bunny, bunny and bunny!".split(/\b/)
//=> ["bunny", ", ", "bunny", " ", "and", " ", "bunny", "!"]

This still works with internal, leading and/or trailing spaces:

"     bunny,     bunny     and    bunny!   ".split(/\b/)
//=> ["     ", "bunny", ",     ", "bunny", "     ", "and", "    ", "bunny", "!   "]

With that in mind:

"hello I'm a bunny, if you are a bunny u are bunny too".split(/\b/).filter((x,i,xs) => (/^[a-z]+$/i).test(x) && xs.indexOf(x) != i)
//=> ["a", "bunny", "are", "bunny"]
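
To connect this back to the question's list of disallowed words, here is a rough sketch that counts words case-insensitively and keeps only the repeated ones that appear in that list. The function name repeatedDisallowed is hypothetical, and matching on /[a-z]+/ is an assumption that only suits plain ASCII words (it splits "I'm" into "i" and "m"):

```javascript
// Hypothetical helper, not from the answer above: reports which
// repeated words also appear in the disallowed list.
function repeatedDisallowed(parrafo, palabrasNoPermitidas) {
    const counts = new Map();
    // match() returns null when nothing matches, hence the fallback to [].
    for (const word of parrafo.toLowerCase().match(/[a-z]+/g) || []) {
        counts.set(word, (counts.get(word) || 0) + 1);
    }
    const banned = new Set(palabrasNoPermitidas.map((w) => w.toLowerCase()));
    // Keep words seen more than once that are also disallowed.
    return [...counts.keys()].filter((w) => counts.get(w) > 1 && banned.has(w));
}

console.log(repeatedDisallowed("Bunny, bunny and ball!", ["bunny", "ball"]));
//=> ["bunny"]
```

Using a Map avoids the repeated indexOf scans and lists each repeated word once.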

Upvotes: 1
