Orest Dymarchuk

Reputation: 51

How to remove multiple words from a string Java

I'm new to Java and am currently learning about strings.

How to remove multiple words from a string?

I would be glad for any hint.

class WordDeleterTest {
    public static void main(String[] args) {
        WordDeleter wordDeleter = new WordDeleter();

        // Hello
        System.out.println(wordDeleter.remove("Hello Java", new String[] { "Java" }));

        // The Athens in
        System.out.println(wordDeleter.remove("The Athens is in Greece", new String[] { "is", "Greece" }));
    }
}

class WordDeleter {
    public String remove(String phrase, String[] words) {
        String[] array = phrase.split(" ");
        String word = "";
        String result = "";

        for (int i = 0; i < words.length; i++) {
            word += words[i];
        }
        for (String newWords : array) {
            if (!newWords.equals(word)) {
                result += newWords + " ";
            }
        }
        return result.trim();
    }
}

Output:

Hello
The Athens is in Greece

I've already tried to use replace here, but it didn't work.

Upvotes: 0

Views: 943

Answers (2)

queeg

Reputation: 9473

Programmers often call replace and ignore its return value:

String sentence = "Hello Java World!";
sentence.replace("Java", "");
System.out.println(sentence);

=> Hello Java World!

Strings are immutable, so replace returns a new String object and leaves the original untouched. Assign the result instead:

String sentence = "Hello Java World!";
sentence = sentence.replace("Java", "");
System.out.println(sentence);

=> Hello  World!

(the whitespace on both sides of the removed word still exists, so there are now two spaces)

With that, your remove function could look like

public String remove(String phrase, String[] words) {
    String result = phrase;
    for (String word: words) {
        result = result.replace(word, "").replace("  ", " ");
    }
    return result.trim();
}

=> Hello World!

(the doubled whitespace is collapsed)

Note that this solution removes every occurrence of the word within the phrase, whether it is a whole word or part of another word. As the OP commented, removing "is" from "This is Sparta" will result in "Th Sparta". To get around that, make sure the word to be replaced is embedded between whitespace characters. This is a perfect situation to switch to regular expressions.

public String remove(String phrase, String[] words) {
    String result = phrase;
    for (String word: words) {
        String regexp = "\\s" + word + "\\s";
        result = result.replaceAll(regexp, " ");
    }
    return result.trim();
}

For explanation:

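The pattern sequence \s matches a whitespace character (space, tab, linefeed, ...). The double backslash is necessary so that the Java compiler does not interpret the single backslash as the start of a string escape sequence. The regular expression therefore matches the word together with the whitespace before and after it, and replaceAll replaces that match with a single space, which also means the second call to remove double blanks is no longer necessary. Note that because the pattern requires whitespace on both sides, a word at the very start or end of the phrase (such as "Java" in "Hello Java") is not matched, and any regex metacharacters inside the word are interpreted as pattern syntax.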

Here is a nice tutorial: https://docs.oracle.com/javase/tutorial/essential/regex/
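
As a possible variation on the regular expression approach, here is a minimal sketch that uses the word boundary \b instead of \s, so that words at the start or end of the phrase are also matched. The class name WordBoundaryDeleter is made up for illustration, and the use of Pattern.quote and the final whitespace cleanup are my own assumptions, not part of the code above.

```java
import java.util.regex.Pattern;

// Hypothetical helper class, named for illustration only.
class WordBoundaryDeleter {
    // Removes each whole-word occurrence of the given words from the phrase.
    public static String remove(String phrase, String[] words) {
        String result = phrase;
        for (String word : words) {
            // Pattern.quote makes the word literal; \b matches at a word boundary.
            String regexp = "\\b" + Pattern.quote(word) + "\\b";
            result = result.replaceAll(regexp, "");
        }
        // Collapse the gaps left behind and strip leading/trailing whitespace.
        return result.replaceAll("\\s+", " ").trim();
    }

    public static void main(String[] args) {
        System.out.println(remove("Hello Java", new String[] { "Java" }));
        System.out.println(remove("The Athens is in Greece", new String[] { "is", "Greece" }));
        System.out.println(remove("This is Sparta", new String[] { "is" }));
    }
}
```

One caveat: \b is defined relative to word characters (letters, digits, underscore), so a word that begins or ends with punctuation, such as "C++", would probably need a different boundary check.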

Upvotes: 3

Michail Alexakis

Reputation: 1595

You can do it using streams:

String phrase = ...;
List<String> wordsToRemove = ...;
        
String result = Arrays.stream(phrase.split("\\s+"))
     .filter(w -> !wordsToRemove.contains(w))
     .collect(Collectors.joining(" "));
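
For completeness, here is one way the stream approach might be wrapped into the remove method from the question, with the imports it needs. The class name StreamWordDeleter, the HashSet, and the trim() call are assumptions added for this sketch rather than part of the snippet above.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical helper class, named for illustration only.
class StreamWordDeleter {
    public static String remove(String phrase, String[] words) {
        // A HashSet gives constant-time lookups for the words to drop.
        Set<String> wordsToRemove = new HashSet<>(Arrays.asList(words));
        // trim() avoids an empty first token when the phrase starts with whitespace.
        return Arrays.stream(phrase.trim().split("\\s+"))
                .filter(w -> !wordsToRemove.contains(w))
                .collect(Collectors.joining(" "));
    }

    public static void main(String[] args) {
        System.out.println(remove("Hello Java", new String[] { "Java" }));
        System.out.println(remove("The Athens is in Greece", new String[] { "is", "Greece" }));
    }
}
```

Because the phrase is split on whitespace only, a token with attached punctuation such as "Greece." would not equal "Greece" and would be kept.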

Upvotes: 4
