Reputation: 23
I'm using a HashMap to count all word instances in an article, and I'm trying to remove all non-word characters except spaces (those are already removed by .split()). Is there a way to avoid repeating "pWord = pWord.replace(...);" every time and instead loop through and pass different arguments inside the parentheses?
pWord = pWord.replace('"', '\"');
pWord = pWord.replace("–", "");
pWord = pWord.replace("\"", "");
pWord = pWord.replace(".", "");
pWord = pWord.replace("-", "");
Upvotes: 0
Views: 131
Reputation: 1289
One way to achieve this is to use replaceAll with a regex. Here is sample code with a regex for the characters that you are replacing in your code:
String pWord = "-asdf\\\\adf.asdf\"";
// Character class: double quote, backslash, dot, en dash, hyphen (placed last, so it is literal)
System.out.println(pWord.replaceAll("[\"\\\\.–-]", ""));
Output:
asdfadfasdf
Also, note that String#replaceAll() interprets its first argument as a regular expression. The backslash (\) is an escape character in both Java string literals and regex, so to match a literal backslash you need to escape it twice and write "\\\\" in Java source.
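To make the escaping rule concrete, here is a small sketch that contrasts replace (literal text) with replaceAll (regex). The class name EscapeDemo and its helper methods are hypothetical, made up only for this illustration:

```java
class EscapeDemo {
    // Literal replacement: no regex involved, so "." means a plain dot.
    static String dropDotsLiteral(String s) {
        return s.replace(".", "");
    }

    // Regex replacement: the dot must be escaped, written "\\." in Java source.
    static String dropDotsRegex(String s) {
        return s.replaceAll("\\.", "");
    }

    // A literal backslash is \\ in regex, which becomes "\\\\" in Java source.
    static String dropBackslashes(String s) {
        return s.replaceAll("\\\\", "");
    }

    public static void main(String[] args) {
        System.out.println(dropDotsLiteral("a.b"));   // ab
        System.out.println(dropDotsRegex("a.b"));     // ab
        System.out.println(dropBackslashes("a\\b"));  // ab
        // An unescaped dot matches every character, so nothing is left.
        System.out.println("a.b".replaceAll(".", "").isEmpty()); // true
    }
}
```

The last line shows why passing "." straight to replaceAll is a common trap: it wipes the whole string instead of removing only the dots.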
P.S. Useful resource to test your regexes: https://regex101.com/
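If you would rather keep the literal replace calls and just loop, as the question asks, one option is to iterate over an array of targets. This is only a minimal sketch: the class name LiteralStripper, the method strip, and the TARGETS array are hypothetical names, and the target list is copied from the question's code:

```java
class LiteralStripper {
    // Targets taken from the question's replace(...) calls; "\u2013" is the en dash.
    private static final String[] TARGETS = {"\"", "\u2013", ".", "-"};

    // Applies a literal (non-regex) replace for each target in turn.
    static String strip(String word) {
        for (String target : TARGETS) {
            word = word.replace(target, "");
        }
        return word;
    }

    public static void main(String[] args) {
        System.out.println(strip("\"well-known.\"")); // wellknown
    }
}
```

Because replace works on literal text, nothing here needs regex escaping; the trade-off is one pass over the string per target instead of a single pass.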
Upvotes: 2
Reputation: 6808
Another way, if you want to remove ALL characters other than letters, digits, and whitespace, is to rebuild the string while skipping every other symbol.
String s = "hello world _!@#";
StringBuilder sb = new StringBuilder();
for (char c : s.toCharArray()) {
    // Keep only digits, letters, and whitespace
    if (Character.isDigit(c) || Character.isLetter(c) || Character.isWhitespace(c)) {
        sb.append(c);
    }
}
s = sb.toString();
System.out.println(s);
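For the word-counting use case in the question, the same filter could be applied to each token before it goes into the HashMap. The following is a rough sketch under a few assumptions: the class name WordCounter and its methods are hypothetical, tokens are split on whitespace, and words are lower-cased so that "Hello" and "hello" count together (drop that call if case matters):

```java
import java.util.HashMap;
import java.util.Map;

class WordCounter {
    // Keeps only letters and digits; whitespace is already gone after split().
    static String clean(String word) {
        StringBuilder sb = new StringBuilder();
        for (char c : word.toCharArray()) {
            if (Character.isLetterOrDigit(c)) {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    // Counts cleaned, lower-cased words (lower-casing is an assumption).
    static Map<String, Integer> count(String text) {
        Map<String, Integer> counts = new HashMap<>();
        for (String raw : text.split("\\s+")) {
            String word = clean(raw).toLowerCase();
            if (!word.isEmpty()) {
                counts.merge(word, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(count("Hello, world! \"Hello\" again."));
    }
}
```

Map.merge adds the key with a count of 1 on first sight and sums on later hits, which avoids a separate containsKey check.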
Upvotes: 1