Jaxox

Reputation: 968

Regex + Java: split text into words, removing punctuation only when it stands alone or ends a word

I am trying to split a string into words, but I want to keep "a.b.c" as a single word and remove punctuation only when it stands alone or comes at the end of a word, e.g.

"a.b.c" --> "a.b.c"
"a.b."  --> "a.b"

For example, given

String str1 = "abc a.b a. .  b, , test";

the result should be "abc", "a.b", "a", "b", "test".

Upvotes: 1

Views: 325

Answers (1)

anubhava

Reputation: 785471

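You can split on each run of whitespace, together with any punctuation directly before it and any whitespace or punctuation directly after it:

String str1 = "abc a.b a. .  b, , test";
// Delimiter: optional punctuation, then whitespace, then any mix of whitespace and punctuation.
// Punctuation inside a word (as in "a.b") is not next to whitespace, so it is kept.
String[] toks = str1.split("\\p{Punct}*\\s+[\\s\\p{Punct}]*");
for (String tok: toks)
    System.out.printf(">>> [%s]%n", tok);

Output:
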
>>> [abc]
>>> [a.b]
>>> [a]
>>> [b]
>>> [test]
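
Note that the split pattern above needs whitespace in every delimiter, so punctuation at the very end of the input (for example the string "a.b." on its own, as in the question) is left attached. As a hedged alternative, here is a minimal sketch that splits on whitespace first and then strips trailing punctuation from each piece. The class name WordSplitter and the method splitWords are made up for illustration; they are not from the question or the answer. Unlike the split above, this version keeps leading punctuation (".a" stays ".a"), since the question only asks about punctuation that is alone or at the end.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, named here only for illustration.
class WordSplitter {

    // Splits on whitespace, strips trailing punctuation from each piece,
    // and drops pieces that were punctuation only.
    static List<String> splitWords(String text) {
        List<String> words = new ArrayList<>();
        for (String chunk : text.trim().split("\\s+")) {
            // \p{Punct}+$ matches a run of ASCII punctuation at the end of the chunk.
            String word = chunk.replaceAll("\\p{Punct}+$", "");
            if (!word.isEmpty()) {
                words.add(word);
            }
        }
        return words;
    }

    public static void main(String[] args) {
        System.out.println(splitWords("abc a.b a. .  b, , test")); // [abc, a.b, a, b, test]
        System.out.println(splitWords("a.b."));                    // [a.b]
    }
}
```

Doing the work in two steps keeps each regex small: the first only knows about whitespace and the second only about trailing punctuation.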

Upvotes: 1
