Kyril Khaltesky

Reputation: 85

Java string replaceAll regex

Hi, I want to remove certain words from a long string. The problem is that some words end with "s" and some start with a capital letter. Basically, I want to turn:

"Hello cat Cats cats Dog dogs dog fox foxs Foxs"

into:

"Hello"

At the moment I have this code, but I want to improve on it. Thanks in advance:

                    .replace("foxs", "")
                    .replace("Fox", "")
                    .replace("Dogs", "")
                    .replace("Cats", "")
                    .replace("dog", "")
                    .replace("cat", "")

Upvotes: 2

Views: 19508

Answers (4)

King Midas

Reputation: 1699

Maybe you can try to match everything except the word Hello. Something like:

string.replaceAll("(?!Hello)\\b\\S+", "");


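The idea is to use a negative lookahead for the word Hello and match every other word. Note that the spaces between the removed words stay in the result, so you may want to call trim() on it.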
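As a minimal sketch of the lookahead approach above: the class and method names (KeepOnlyHello, keepHello) are made up for illustration, and the trim() call is an addition to tidy the leftover spaces.

```java
public class KeepOnlyHello {
    // Removes every whitespace-delimited word that does not start with "Hello".
    // The lookahead only rejects matches that begin at "Hello", and \b prevents
    // a match from starting in the middle of a word.
    static String keepHello(String input) {
        String stripped = input.replaceAll("(?!Hello)\\b\\S+", "");
        // The separating spaces are not part of the match, so trim them here.
        return stripped.trim();
    }

    public static void main(String[] args) {
        System.out.println(keepHello("Hello cat Cats cats Dog dogs dog fox foxs Foxs"));
    }
}
```

Keep in mind that this is a whitelist: any word beginning with Hello (for example Hellos) is kept as well, and every other word is removed, not just the animal names.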

Upvotes: 3

Wes

Reputation: 11

You could pre-compile a pattern for the words you want to remove and make it case-insensitive, something like:

    // requires: import java.util.regex.Matcher; import java.util.regex.Pattern;
    String str = "Hello cat Cats cats Dog dogs dog fox foxs Foxs";
    Pattern p = Pattern.compile("fox[s]?|dog[s]?|cat[s]?", Pattern.CASE_INSENSITIVE);
    Matcher m = p.matcher(str);
    String result = m.replaceAll("");
    System.out.println(result);

[s]? handles the plural form: the ? quantifier matches the preceding element zero or one time, so [s]? is equivalent to a plain s?.
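A sketch of how the pre-compiled pattern might be reused across several strings. The class and method names (PrecompiledRemoval, removeAnimals) are hypothetical, and the whitespace cleanup is an extra step added here because replacing the words with "" leaves the spaces between them behind.

```java
import java.util.regex.Pattern;

public class PrecompiledRemoval {
    // Compiled once and reused; s? is equivalent to [s]? (an optional trailing "s").
    private static final Pattern ANIMALS =
            Pattern.compile("(?:fox|dog|cat)s?", Pattern.CASE_INSENSITIVE);

    static String removeAnimals(String input) {
        String stripped = ANIMALS.matcher(input).replaceAll("");
        // Collapse the gaps left by the removed words and drop edge spaces.
        return stripped.replaceAll("\\s+", " ").trim();
    }

    public static void main(String[] args) {
        System.out.println(removeAnimals("Hello cat Cats cats Dog dogs dog fox foxs Foxs"));
        System.out.println(removeAnimals("A Dog met two FOXS today"));
    }
}
```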

Upvotes: 0

Tamas Rev

Reputation: 7166

You can generate a pattern that matches all combinations for a word. For example, for dog you need the pattern [Dd]ogs?:

  • [Dd] is a character class that matches both cases
  • s? matches zero or one s
  • the rest of the word will be case sensitive. I.e. dOGS will not be a match.
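The three points above can be checked with a small sketch. The class and method names (DogPatternDemo, matchesDog) are invented for this example; String.matches requires the whole string to match the pattern.

```java
public class DogPatternDemo {
    // True only if the entire string is dog, Dog, dogs or Dogs.
    static boolean matchesDog(String s) {
        return s.matches("[Dd]ogs?");
    }

    public static void main(String[] args) {
        for (String candidate : new String[] {"dog", "Dogs", "dOGS", "dogss"}) {
            System.out.println(candidate + " -> " + matchesDog(candidate));
        }
    }
}
```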

This is how you can put it together:

public static void main(String[] args) {
    // it's easy to add any other word
    String original = "Hello cat Cats cats Dog dogs dog fox foxs Foxs";
    String[] words = {"fox", "dog", "cat"};
    String tmp = original;
    for (String word : words) {
        String firstChar = word.substring(0, 1);
        String firstCharClass = "[" + firstChar.toUpperCase() + firstChar.toLowerCase() + "]";
        String patternSrc = firstCharClass + word.substring(1) + "s?"; // [Ww]ords?
        tmp = tmp.replaceAll(patternSrc, "");
    }
    tmp = tmp.trim(); // to remove unnecessary spaces 
    System.out.println(tmp);
}

Upvotes: 0

Tim Biegeleisen

Reputation: 521239

Try this:

String input = "Hello cat Cats cats Dog dogs dog fox foxs Foxs";
input = input.replaceAll("(?i)\\s*(?:fox|dog|cat)s?", "");
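To spell out what the pattern does: (?i) turns on case-insensitive matching inline, \\s* consumes the whitespace in front of each word so no gaps are left, and (?:...) groups the alternatives without capturing. The sketch below wraps it in a hypothetical class (InlineFlagDemo) and adds a variant with \\b word boundaries, which is an assumption about what you may want if the text can contain longer words such as "category".

```java
public class InlineFlagDemo {
    // The pattern from the answer: also matches inside longer words.
    static String removeLoose(String s) {
        return s.replaceAll("(?i)\\s*(?:fox|dog|cat)s?", "");
    }

    // Variant with word boundaries: only whole words are removed.
    static String removeWholeWords(String s) {
        return s.replaceAll("(?i)\\s*\\b(?:fox|dog|cat)s?\\b", "");
    }

    public static void main(String[] args) {
        String input = "Hello cat Cats cats Dog dogs dog fox foxs Foxs";
        System.out.println(removeLoose(input));
        System.out.println(removeWholeWords(input));
        // Shows the difference on a word that merely contains "cat".
        System.out.println(removeLoose("The category"));
        System.out.println(removeWholeWords("The category"));
    }
}
```

Both versions reduce the sample input to Hello; they differ only on words that contain one of the listed words as a substring.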


Upvotes: 7
