snocc
snocc

Reputation: 1

Efficient way to remove items in a list that contain certain words?

Very simple question, I have

List<String> forbidden_words = Arrays.asList("test","one","two");

List<String> list1 = Arrays.asList("eiofjfrgj_test","oiione","rrrtwo", "normal", "word");

I want to remove the elements in list1 that contain forbidden words so I get "normal" & "word" in list1. What's the best way to do this?

Upvotes: 0

Views: 511

Answers (4)

Metroids
Metroids

Reputation: 20477

Here's how I would write that:

List<String> forbiddenWords = Arrays.asList("test","one","two");
List<String> words = Arrays.asList("eiofjfrgj_test","oiione","rrrtwo", "normal", "word");

List<String> filtered = words.stream()
    .filter(word -> !forbiddenWords.stream().anyMatch(forbiddenWord -> word.contains(forbiddenWord)))
    .collect(Collectors.toList());

System.out.println(filtered);

Upvotes: 1

Andreas
Andreas

Reputation: 159086

Use regex and removeIf():

// Input from question
List<String> forbidden_words = Arrays.asList("test","one","two");
List<String> list1 = Arrays.asList("eiofjfrgj_test","oiione","rrrtwo", "normal", "word");

// Make list1 mutable
list1 = new ArrayList<>(list1);

// Remove forbidden words using regex, so it works case-insensitively
Pattern p = Pattern.compile(forbidden_words.stream().map(Pattern::quote).collect(Collectors.joining("|")),
                            Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE);
list1.removeIf(s -> p.matcher(s).find());

// See result
System.out.println(list1);

Output

[normal, word]

Upvotes: 0

Sunny Patel
Sunny Patel

Reputation: 194

To begin with, most other answer answers are not efficient but are short. Below is another possible way you would do it.

public static String[] naiveFiltering(String[] forbidden_words, String[] list1 ){
        List<String> filteredList = new ArrayList<>(); //extra memory
        for(String str: list1){
            boolean containsForbidden = false;
            for(String forbidden: forbidden_words){
                if(str.contains(forbidden)){ // o(mn)
                    containsForbidden = true;
                    break;
                }
            }
            if(!containsForbidden)
                    filteredList.add(str);
        }
        String[] filteredArray = new String[filteredList.size()];
        for(int i=0;i<filteredArray.length;i++){
            filteredArray[i] = filteredList.get(i); //extra copying
        }
        return filteredArray;
    }

I am assuming that "efficient" means something more than just a naive approach like most other answers including mine.

You can do two things to make it efficient:

  • Change your representation to a LinkedList instead of an array so that you can remove from an index without having to copy over to a new list/array.

  • Use KMP for String matching which is O(N) instead of String.contains() which is a naive implementation with O(MN) worst-case complexity.

Upvotes: 0

user2670200
user2670200

Reputation:

String array is immutable (or to be precise: fixed). The only way to do that is to convert the string array to a List (using Arrays.asList()), remove the "forbidden" word and finally convert the list back to String array (using the method toArray())

Upvotes: 0

Related Questions