Lluis

Reputation: 51

Why doesn't List.contains match substrings?

I have a program that adds 1 to a counter each time a word in a sentence matches one of the words in my list.

For example, my list is [OK, NICE], and I check each word of the sentence (the sentence is split on spaces).

I don't want the split to handle commas and periods; I only want to split on a space, as I do now:

private static int contWords(String line, List<String> list) {
    String[] words = line.split(" ");
    int cont = 0;

    for (int i = 0; i < words.length; i++) {
        if (list.contains(words[i].toUpperCase())) {
            cont++;
        }
    }
    return cont;
}

Here are some example words. The ones marked false don't add 1 to the counter, but they should:

OK = true

OKEY= false

NICE. = false

NICE, = false

Upvotes: 0

Views: 64

Answers (1)

EDToaster

Reputation: 3180

The problem

Here is the problem you are trying to solve:

  • Take a list of target words
  • A sentence
  • Count number of occurrences of the target words in the words of the sentence

Suppose you are looking for "OK" and "NICE", and your sentence is "This is ok, nice work!". The number of occurrences should be 2.

Options

You have a few options. I am going to show you one way, using Streams.

Solution

private static int countWords(String sentence, List<String> targets) {
    String[] words = sentence.split(" ");
    return (int) Stream.of(words)
            .map(String::toUpperCase)
            .filter(word -> targets.stream().anyMatch(word::contains))
            .count();
}

How does it work?

First, you take in a sentence and split it into an array (you have done this already).

Then, we take the array and use map to convert every word to its uppercase form. This means that every word will now be in all caps.

Then, using filter, we only keep the words that contain at least one of the target words as a substring.

Then, we just return the count.
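To try the steps above end to end, here is a minimal runnable sketch. The class name CountWordsDemo and the sample sentences are only illustrative assumptions; the method body is the same as the one in the solution.

```java
import java.util.List;
import java.util.stream.Stream;

// Hypothetical wrapper class, used only to make the method runnable.
class CountWordsDemo {
    // Counts the words of the sentence that contain at least one target as a substring.
    static int countWords(String sentence, List<String> targets) {
        String[] words = sentence.split(" ");
        return (int) Stream.of(words)
                .map(String::toUpperCase)
                .filter(word -> targets.stream().anyMatch(word::contains))
                .count();
    }

    public static void main(String[] args) {
        List<String> targets = List.of("OK", "NICE");
        System.out.println(countWords("This is ok, nice work!", targets)); // prints 2
        System.out.println(countWords("okey NICE. NICE,", targets));       // prints 3
    }
}
```

The targets are expected to be in uppercase already, as in the question, because only the words of the sentence are converted.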

More in depth?

I can go through what this statement means in more detail:

.filter(word -> targets.stream().anyMatch(word::contains))

word -> ... is a function that takes in a word and outputs a boolean value. This is useful because for each word, we want to know whether or not it contains one of the targets as a substring.

Then, the function will compute targets.stream().anyMatch(word::contains), which goes through the target stream and tells us if any of the targets is contained (as a substring) in the word that we are filtering. The method reference word::contains is shorthand for target -> word.contains(target).
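The direction of the check is easy to misread, so here is a small sketch that writes the method reference out as a lambda and compares it with List.contains, which only matches whole elements using equals. The helper name containsAnyTarget and the class name are made up for this illustration.

```java
import java.util.List;

// Hypothetical demo class; containsAnyTarget is an illustrative helper name.
class ContainsDirectionDemo {
    // Same predicate as the filter lambda, with word::contains written out.
    static boolean containsAnyTarget(String word, List<String> targets) {
        return targets.stream().anyMatch(target -> word.contains(target));
    }

    public static void main(String[] args) {
        List<String> targets = List.of("OK", "NICE");
        // List.contains compares whole elements with equals, so "NICE." is not found.
        System.out.println(targets.contains("NICE."));           // prints false
        // String.contains looks for a substring, so "NICE." matches the target "NICE".
        System.out.println(containsAnyTarget("NICE.", targets)); // prints true
        // The word must contain the target, not the other way around.
        System.out.println(containsAnyTarget("N", targets));     // prints false
        // Substring matching also accepts unrelated words that happen to contain a target.
        System.out.println(containsAnyTarget("BOOK", targets));  // prints true
    }
}
```

The last line shows the trade-off of substring matching: a word such as "book" is counted because it contains "OK".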

NINJA EDIT:

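If the sentence was "This is Okey, nice work!" and the target list was ["OK", "OKEY"], the method above returns 1, because only one word ("OKEY,") contains a target. If you would rather count every matching target separately, so that this example returns 2 ("OKEY," contains both "OK" and "OKEY"), you can change the method to: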

private static int countWords(String sentence, List<String> targets) {
    String[] words = sentence.split(" ");
    return Stream.of(words)
            .map(String::toUpperCase)
            .map(word -> targets.stream().filter(word::contains).count())
            .reduce(0L, Long::sum)
            .intValue();
} 
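As a quick check of the difference between the two counting styles, here is a runnable sketch. The class name and the method names countWords and countMatches are only chosen for this example; the bodies follow the two methods shown in this answer.

```java
import java.util.List;
import java.util.stream.Stream;

// Hypothetical demo class comparing the two counting styles.
class CountMatchesDemo {
    // Counts words that contain at least one target (each word counts once).
    static int countWords(String sentence, List<String> targets) {
        return (int) Stream.of(sentence.split(" "))
                .map(String::toUpperCase)
                .filter(word -> targets.stream().anyMatch(word::contains))
                .count();
    }

    // Counts every (word, target) pair where the word contains the target.
    static int countMatches(String sentence, List<String> targets) {
        return Stream.of(sentence.split(" "))
                .map(String::toUpperCase)
                .map(word -> targets.stream().filter(word::contains).count())
                .reduce(0L, Long::sum)
                .intValue();
    }

    public static void main(String[] args) {
        List<String> targets = List.of("OK", "OKEY");
        String sentence = "This is Okey, nice work!";
        System.out.println(countWords(sentence, targets));   // prints 1
        System.out.println(countMatches(sentence, targets)); // prints 2
    }
}
```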

NINJA-IER EDIT:

Based on the other question proposed in the comments, you can replace all matched words with "***" by doing the following:

private static String replaceWordsWithAsterisks(String sentence, List<String> targets) {
    String[] words = sentence.split(" ");
    List<String> processedWords = Stream.of(words)
            .map(word -> targets.stream().anyMatch(word.toUpperCase()::contains) ? "***" : word)
            .collect(Collectors.toList());

    return String.join(" ", processedWords);
}
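A short usage sketch for the replacement method, again wrapped in a made-up class (ReplaceDemo) so it can be run directly:

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical demo class for the replacement method.
class ReplaceDemo {
    // Replaces every space-separated word that contains a target with "***".
    static String replaceWordsWithAsterisks(String sentence, List<String> targets) {
        String[] words = sentence.split(" ");
        List<String> processedWords = Stream.of(words)
                .map(word -> targets.stream().anyMatch(word.toUpperCase()::contains) ? "***" : word)
                .collect(Collectors.toList());
        return String.join(" ", processedWords);
    }

    public static void main(String[] args) {
        List<String> targets = List.of("OK", "NICE");
        // prints: This is *** *** work!
        System.out.println(replaceWordsWithAsterisks("This is ok, nice work!", targets));
    }
}
```

Note that the whole space-separated word is replaced, so punctuation attached to a match (the comma in "ok,") disappears along with it.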

Upvotes: 2
