Java Nerd

Reputation: 968

Removing Stop Words from List of Strings

I have a list of strings and I want to remove some stop words from this list:

for (int i = 0; i < simple_title.getItemCount(); i++) {
    // split the phrase into the words
    String str = simple_title.getItem(i);
    String[] title_parts = str.split(" ");
    ArrayList<String> list = new ArrayList<>(Arrays.asList(title_parts));
    for (int k = 0; k < list.size(); k++) {
        for (int l = 0; l < StopWords.stopwordslist.length; l++) {
            // stopwordslist is a Static Variable in class StopWords
            list.remove(StopWords.stopwordslist[l]);
        }
    }

    title_parts = list.toArray(new String[0]);
    for (String title_part : title_parts) {
        // and here I want to print the string
        System.out.println(title_part);
    }
    Arrays.fill(title_parts, null);
}

The problem is that after removing the stop words I get only the first word of each title_parts array. For example, if I have a list of strings such as:

 list of strings
 i am a list
 is remove stop there list...

then after removing the stop words I only get:

 list
 list
 remove

But what I should get is:

  list strings
  list
  remove stop list

I have been working on this for a while and I'm now confused. Can somebody please tell me what I am doing wrong?

Upvotes: 0

Views: 2498

Answers (1)

Mena

Reputation: 48434

You are removing items from your List at an index defined by the iteration of your StopWords array!

So the removal is arbitrary, to say the least, and ultimately depends on the size of your stop-word array.
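For reference, List has two remove overloads, and which one gets called depends on the argument type: remove(int) deletes by position, while remove(Object) deletes only the first element equal to the argument. The question does not show the element type of stopwordslist, so the following is only a sketch of the difference; the class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class RemoveOverloads {
    // Removes by position: drops whatever element sits at the given index.
    static List<String> removeAt(List<String> words, int index) {
        List<String> copy = new ArrayList<>(words);
        copy.remove(index); // int argument selects remove(int index)
        return copy;
    }

    // Removes by value: drops only the first element equal to the given word.
    static List<String> removeFirst(List<String> words, String word) {
        List<String> copy = new ArrayList<>(words);
        copy.remove(word); // String argument selects remove(Object o)
        return copy;
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("a", "list", "a", "word");
        System.out.println(removeAt(words, 1));      // prints [a, a, word]
        System.out.println(removeFirst(words, "a")); // prints [list, a, word]
    }
}
```

Neither overload removes every occurrence in a single call, which is why a bulk operation tends to be the simpler choice here.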

Here's a self-contained example of what you might want to do instead:

import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class StopWordDemo {
    public static void main(String[] args) {
        // the list of words (in your case, the result of split)
        List<String> listOfWords = new ArrayList<>(Arrays.asList(
                "list", "of", "strings", "i", "am", "a", "list", "is", "remove", "stop", "there", "list"));
        // the stop words (you probably want these as a constant somewhere else)
        final String[] stopWords = {"of", "i", "am", "a", "is"};
        // print the unprocessed list
        System.out.printf("Dirty: %s%n", listOfWords);
        // removeAll drops every element that also appears in the given collection
        listOfWords.removeAll(Arrays.asList(stopWords));
        // print the "clean" list
        System.out.printf("Clean: %s%n", listOfWords);
    }
}

Output

Dirty: [list, of, strings, i, am, a, list, is, remove, stop, there, list]
Clean: [list, strings, list, remove, stop, there, list]
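Applied to the question's loop, the same idea can be run once per title instead of on one flat list. The following is a minimal sketch: TitleCleaner, clean and STOP_WORDS are hypothetical names, and the stop-word set is an assumption chosen to match the expected output in the question (note that it includes "there").

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class TitleCleaner {
    // Hypothetical stop-word set, picked to reproduce the question's expected output.
    static final Set<String> STOP_WORDS =
            new HashSet<>(Arrays.asList("of", "i", "am", "a", "is", "there"));

    // Splits a title on whitespace and drops every stop word, keeping word order.
    static String clean(String title) {
        List<String> words = new ArrayList<>(Arrays.asList(title.trim().split("\\s+")));
        words.removeAll(STOP_WORDS); // removes all occurrences in one call
        return String.join(" ", words);
    }

    public static void main(String[] args) {
        String[] titles = {"list of strings", "i am a list", "is remove stop there list"};
        for (String title : titles) {
            System.out.println(clean(title));
        }
    }
}
```

With the assumed stop words this prints "list strings", "list" and "remove stop list", one per line. The comparison is case-sensitive; lower-casing each word before the lookup would be one way to relax that.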

Upvotes: 1
