Reputation: 33
I have a String that begins with a word, and I want to take the substring that starts at index 0 and ends at the index of the next special character (space, ., !, ?, etc.). How would I go about doing that with a regex? Can I get the index of the first regex match? And what would the pattern look like?
Thanks in advance!
Upvotes: 0
Views: 4191
Reputation: 70750
You could use the following.
^\w+(?=\W)
Explanation:
^ # the beginning of the string
\w+ # word characters (a-z, A-Z, 0-9, _) (1 or more times)
(?= # look ahead to see if there is:
\W # non-word characters (all but a-z, A-Z, 0-9, _)
) # end of look-ahead
Example:
String s = "foobar!";
Pattern p = Pattern.compile("^\\w+(?=\\W)");
Matcher m = p.matcher(s);
if (m.find()) {
System.out.println("Start:" + m.start() + " End:" + m.end());
System.out.println(m.group());
}
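One caveat: because the look-ahead requires a non-word character to follow, ^\w+(?=\W) finds nothing when the string is a single word with no trailing special character (for example "foobar"). If that case matters, a minimal sketch without the look-ahead is below; \w+ is greedy and already stops at the first non-word character or at the end of the input. The class name LeadingWord and the method leadingWord are hypothetical, chosen only for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class LeadingWord {
    // ^\w+ matches the run of word characters at the start of the input.
    // Being greedy, it ends at the first non-word character or at end of input.
    private static final Pattern WORD = Pattern.compile("^\\w+");

    // Returns the leading word, or "" if the string does not start with a word character.
    static String leadingWord(String s) {
        Matcher m = WORD.matcher(s);
        return m.find() ? m.group() : "";
    }

    public static void main(String[] args) {
        System.out.println(leadingWord("foobar!")); // word followed by a special character
        System.out.println(leadingWord("foobar"));  // word with nothing after it
    }
}
```

Both calls in main print foobar; the second one is the case the look-ahead version would miss.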
Upvotes: 2
Reputation: 72884
The following prints the substring that contains the word part of your string (\w denotes a word character, which includes digits and the underscore, while \W denotes a non-word character):
Pattern p = Pattern.compile("(\\w+)[\\W\\s]*");
Matcher matcher = p.matcher("word!,(. [&]");
if(matcher.find()) {
System.out.println(matcher.group(1));
}
Output: word
Upvotes: 1
Reputation: 129572
How would I go about doing that with a regex?
You can try something like this:
^.*?\p{Punct}
^ matches the start of the string
.*? matches anything, reluctantly
\p{Punct} matches one of !"#$%&'()*+,-./:;<=>?@[\]^_`{|}~
Can I get the index of the first regex match?
In general, you can obtain the indices of regex matches with Matcher#start.
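As a rough sketch of how Matcher#start could answer the index question directly: search for the first special character, read its index, and cut the string there. This assumes "special" means whitespace or ASCII punctuation, which may not be exactly what the question intends (note that \p{Punct} includes the underscore). The class and method names here are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class FirstSpecialIndex {
    // Assumption: a "special character" is whitespace or ASCII punctuation.
    private static final Pattern SPECIAL = Pattern.compile("[\\s\\p{Punct}]");

    // Index of the first special character, or -1 if there is none.
    static int indexOfFirstSpecial(String s) {
        Matcher m = SPECIAL.matcher(s);
        return m.find() ? m.start() : -1;
    }

    // Substring from index 0 up to (not including) the first special character.
    // Returns the whole string when no special character is present.
    static String prefixBeforeSpecial(String s) {
        int idx = indexOfFirstSpecial(s);
        return idx < 0 ? s : s.substring(0, idx);
    }

    public static void main(String[] args) {
        String s = "Hello, world";
        System.out.println(indexOfFirstSpecial(s)); // index of the comma
        System.out.println(prefixBeforeSpecial(s)); // text before the comma
    }
}
```

Unlike ^.*?\p{Punct}, this leaves the special character out of the result and also stops at spaces.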
Upvotes: 1