Reputation: 21
My program currently uses the pattern .*[A-Z].*
to match every word that contains an uppercase letter. My problem is that I need a pattern that identifies a word only when the uppercase letter is at the beginning of the word.
Example input for my program now:
My name is Johan and I am from langKawi.
Output Matched: My Johan I langKawi.
But with my pattern, a word like langKawi, where the uppercase letter is not at the beginning of the word, still matches.
Can anyone help me with a pattern that matches a word that has an uppercase letter as its first letter only? My text/input consists only of alphabetic characters, with no numbers or symbols. Thank you.
Upvotes: 2
Views: 7896
Reputation: 476659
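That's why they invented \b:
\b[A-Z][A-Za-z]*\b
\b acts as a word boundary: it is a zero-width assertion, so it consumes no characters itself. It matches the position between a word character and a non-word character (a space or other delimiter), and also the beginning or end of the string when a word character sits there.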
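Example to capture all parts (note that this version uses [a-z]* after the first letter, so a word with a later uppercase letter is not matched):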
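import java.util.regex.*;

public class HelloWorld {
    public static void main(String[] args) {
        // One uppercase letter, then only lowercase letters, delimited by word boundaries.
        Pattern p = Pattern.compile("\\b([A-Z][a-z]*)\\b");
        Matcher m = p.matcher("My name is Johan and I am from langKawi.");
        // find() advances to the next match; group(1) is the captured word.
        while (m.find()) {
            System.out.println(m.group(1));
        }
    }
}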
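To see that \b marks a position rather than a character, here is a small sketch that prints each match together with its start and end offsets. The class name WordBoundaryDemo and the helper matchesWithOffsets are made up for illustration; the pattern is the one from the example above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original answer.
class WordBoundaryDemo {
    // Same pattern as the example: one uppercase letter, then lowercase letters only.
    private static final Pattern CAPITALIZED = Pattern.compile("\\b[A-Z][a-z]*\\b");

    // Returns each match as "text@start-end" (end offset is exclusive).
    static List<String> matchesWithOffsets(String input) {
        List<String> out = new ArrayList<>();
        Matcher m = CAPITALIZED.matcher(input);
        while (m.find()) {
            out.add(m.group() + "@" + m.start() + "-" + m.end());
        }
        return out;
    }

    public static void main(String[] args) {
        // Prints: [My@0-2, Johan@11-16, I@21-22]
        System.out.println(matchesWithOffsets("My name is Johan and I am from langKawi."));
    }
}
```

The offsets cover only the letters of each word: the surrounding spaces are never part of the match, because \b only checks the position. The K in langKawi is skipped because there is no boundary between g and K.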
Upvotes: 2
Reputation: 59699
Use a word boundary to match just before a word starts, then the word, then another word boundary:
\b[A-Z]\w*\b
That, in Java, looks like this:
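// Requires java.util.regex.Pattern and java.util.regex.Matcher.
Pattern p = Pattern.compile("\\b([A-Z]\\w*)\\b");
String s = "My name is Johan and I am from langKawi.";
Matcher matcher = p.matcher(s);
while (matcher.find()) {
    // group(1) is the captured word, starting with an uppercase letter.
    System.out.println(matcher.group(1));
}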
This outputs:
My
Johan
I
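One thing worth checking is how the choice of tail affects mixed-case words. \w* accepts any word character after the first letter (including uppercase letters, digits and underscores), while [a-z]* accepts lowercase letters only. The sketch below compares the two on a made-up input; the class name FirstLetterOnlyDemo, the helper findAll and the sample string are assumptions for illustration, not from the question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class comparing two tails after the leading [A-Z].
class FirstLetterOnlyDemo {
    // Collects every match of regex in input, in order.
    static List<String> findAll(String regex, String input) {
        List<String> out = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            out.add(m.group());
        }
        return out;
    }

    public static void main(String[] args) {
        String input = "My JoHan I langKawi USA"; // made-up sample input

        // Starts with uppercase, anything word-like afterwards.
        System.out.println(findAll("\\b[A-Z]\\w*\\b", input));    // [My, JoHan, I, USA]

        // Uppercase on the first letter only.
        System.out.println(findAll("\\b[A-Z][a-z]*\\b", input));  // [My, I]
    }
}
```

If "uppercase for the first letter only" means that words like JoHan or USA should be rejected, the [a-z]* tail is probably the one to use; if any word that merely starts with an uppercase letter is acceptable, \w* (or [A-Za-z]*) does the job.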
Upvotes: 4