Andrejs

Reputation: 11961

Remove near-identical strings (noun plural forms) from list

I'm trying to identify and remove plural forms of a noun in a list. Essentially, I want the below test to pass:

@Test
public void testRemoveNounPlurals(){
    List<String> listWithDups = List.of("friend", "friends", "dog", "dogs", "serious");

    List<String> filteredList = removeDuplicates(listWithDups); // testing this method

    org.assertj.core.api.Assertions.assertThat(filteredList)
            .hasSize(3)
            .containsOnly("friend", "dog", "serious");
}

The not-quite-there implementation:

public static List<String> removeDuplicates(List<String> list) {

    List<String> maybePlurals = list.stream()
            .filter(s -> s.endsWith("s"))
            .collect(toList()); // friends, dogs, serious

    return list.stream()
             // correctly removes friends and dogs, should keep 'serious'
            .filter( word -> maybePlurals.contains(word.concat("s"))) 
            .collect(toList());
}

Upvotes: 0

Views: 170

Answers (1)

Joakim Danielson

Reputation: 52013

This solution appends an "s" to each word and, if the resulting string is also in the list, removes it. Words such as "serious" are kept because no "seriou" exists to produce them:

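import java.util.ArrayList;
import java.util.List;

public static List<String> removeDuplicates(List<String> list) {
    // Copy the input so the original (possibly immutable, e.g. List.of) is not modified
    List<String> result = new ArrayList<>(list);

    for (String word : list) {
        // Candidate plural form of the current word
        String plural = word + "s";
        // Removes the first occurrence of the plural, if present; otherwise does nothing
        result.remove(plural);
    }
    return result;
}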

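A small optimisation would be to skip words that already end in "s", since appending another "s" to them is unlikely to match anything in the list. To do that, add this as the first statement in the for loop:

if (word.endsWith("s")) {
    continue;
}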
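
If you prefer to stay with streams as in the question, the same idea can be turned around: instead of appending "s" to every word, drop a word only when it ends in "s" and its form without the trailing "s" is also in the list. This is just a sketch, not a tested drop-in; the class name PluralFilter is made up for the example, and unlike the loop above it removes every occurrence of a plural rather than only the first one.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical wrapper class, used only to make the sketch compile on its own
public class PluralFilter {

    public static List<String> removeDuplicates(List<String> list) {
        // A set gives constant-time lookups for the "is the singular present?" check
        Set<String> words = new HashSet<>(list);

        return list.stream()
                // Keep a word unless it ends in "s" AND the word minus that "s" is in the list
                .filter(w -> !(w.endsWith("s")
                        && words.contains(w.substring(0, w.length() - 1))))
                .collect(Collectors.toList());
    }
}
```

Like the original, this only handles plurals formed by adding a plain "s"; forms such as "boxes" or "cities" would need a different rule.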

Upvotes: 1
