Calgar99

Reputation: 1688

Remove Punctuation issue

I'm trying to find a word in a string. However, because of a trailing period, one occurrence of the word is not recognized. I'm trying to remove the punctuation, but it seems to have no effect. Am I missing something here? This is the line of code I am using: s.replaceAll("([a-z] +) [?:!.,;]*","$1");

    String test = "This is a line about testing tests. Tests are used to examine stuff";
    String key = "tests";
    int counter = 0;


    String[] testArray = test.toLowerCase().split(" ");

    for(String s : testArray)
    {
        s.replaceAll("([a-z] +) [?:!.,;]*","$1");
        System.out.println(s);
        if(s.equals(key))
        {
            System.out.println(key + " FOUND");
            counter++;
        }
    }

    System.out.println(key + " has been found " + counter + " times.");
}

I managed to find a solution (though it may not be ideal) by using s = s.replaceAll("\\W",""); Thanks for everyone's guidance on how to solve this problem.

Upvotes: 0

Views: 148

Answers (3)

Charles Forsythe

Reputation: 1861

You could also take advantage of the regex in the split operation. Try this:

String[] testArray = test.toLowerCase().split("\\W+");

Note that this also splits on apostrophes (so "don't" becomes "don" and "t"), so you may need to tweak it to use a specific list of characters instead.
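As a rough illustration of the idea above, here is a minimal sketch that counts a key word after splitting on runs of non-word characters. The class name SplitDemo and the helper methods countWord and splitKeepingApostrophes are hypothetical names chosen for this example, and the character class in the second helper is only one possible tweak, assuming lowercase ASCII input.

```java
import java.util.Arrays;

public class SplitDemo {

    // Lowercases the text, splits on runs of non-word characters,
    // and counts the tokens that equal the key.
    static int countWord(String text, String key) {
        int counter = 0;
        for (String s : text.toLowerCase().split("\\W+")) {
            if (s.equals(key)) {
                counter++;
            }
        }
        return counter;
    }

    // Hypothetical tweak: treat apostrophes as part of a word by splitting
    // only on characters that are not a-z, 0-9, or an apostrophe.
    static String[] splitKeepingApostrophes(String text) {
        return text.toLowerCase().split("[^a-z0-9']+");
    }

    public static void main(String[] args) {
        String test = "This is a line about testing tests. Tests are used to examine stuff";
        System.out.println("tests has been found " + countWord(test, "tests") + " times.");
        System.out.println(Arrays.toString(splitKeepingApostrophes("Don't stop. Tests!")));
    }
}
```

Because the punctuation is consumed by the split itself, no separate replaceAll call is needed inside the loop. One caveat: if the text begins with a delimiter, split returns an empty first token.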

Upvotes: 1

Reimeus

Reputation: 159854

Strings are immutable, so replaceAll returns a new String and leaves the original unchanged. You need to assign the result back to s:

s = s.replaceAll("([a-z] +)*[?:!.,;]*", "$1");
                           ^

Also, your regex requires that a space exist between the word and the punctuation. In the case of "tests." there is none. You can adjust your regex with an optional (zero or more) quantifier to account for this.
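To make the immutability point concrete, here is a small sketch. It deliberately swaps in a simpler pattern, [?:!.,;]+, which just deletes the listed punctuation characters, rather than the capture-group regex above. The class name ReplaceDemo and the helper stripPunctuation are hypothetical.

```java
public class ReplaceDemo {

    // Returns a copy of s with the listed punctuation characters removed.
    // The pattern is a simplified stand-in, not the regex from the question.
    static String stripPunctuation(String s) {
        return s.replaceAll("[?:!.,;]+", "");
    }

    public static void main(String[] args) {
        String s = "tests.";

        // The return value is discarded, so s is unchanged.
        s.replaceAll("[?:!.,;]+", "");
        System.out.println(s); // prints "tests."

        // Assigning the result makes the change visible.
        s = stripPunctuation(s);
        System.out.println(s); // prints "tests"
    }
}
```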

Upvotes: 1

Aditya Peshave

Reputation: 103

Your regex doesn't seem to work the way you want. If you want to match a word that is followed by a period, this will work:

([a-z]*) [?(:!.,;)*]

It returns "tests." when it's run on your given string.

Also

[?(:!.,;)*]

just matches the punctuation, which can then be replaced.

However, I am not sure why you are not using the substring() function.
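The answer does not spell out how substring() would be used, so the following is only one possible reading: trim punctuation from the end of each token without any regex. The class name SubstringDemo, the helper stripTrailingPunctuation, and the chosen set of punctuation characters are assumptions made for this sketch.

```java
public class SubstringDemo {

    // Characters treated as trailing punctuation (an assumed list).
    private static final String PUNCTUATION = "?:!.,;";

    // Walks back from the end of s while the last character is punctuation,
    // then returns the prefix before that point.
    static String stripTrailingPunctuation(String s) {
        int end = s.length();
        while (end > 0 && PUNCTUATION.indexOf(s.charAt(end - 1)) >= 0) {
            end--;
        }
        return s.substring(0, end);
    }

    public static void main(String[] args) {
        System.out.println(stripTrailingPunctuation("tests.")); // prints "tests"
    }
}
```

This only handles punctuation at the end of a token; leading or embedded punctuation would still need one of the regex approaches.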

Upvotes: 0
