Reputation: 131
I am looking for sentences of the form "....X is educated at Y..." in the third field of each line of a text document. X is known and Y is unknown. On a successful match, how can I get the value of Y? Here is my code:
Pattern p1 = Pattern.compile(".* educated at .*");
int count = 0;
while((line = br.readLine()) != null){
    String datavalue[] = line.split("\t");
    String text = datavalue[2];
    Matcher m = p1.matcher(text);
    if(m.matches()){
        count++;
        //System.out.println(text);
        //How do I get Y?
    }
}
I'm new to regex. Any help is appreciated.
Upvotes: 0
Views: 47
Reputation: 425033
You can do it in one line:
while((line = br.readLine()) != null){
    // (\\w+) captures the first word after "educated at"
    String y = line.replaceAll(".*?\t.*?\t[^\t]*educated at (\\w+).*|.*", "$1");
}
The variable y will be blank if there's no match.
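As a quick check of the one-liner, here is a minimal sketch that runs it on two made-up lines. The class name, method name, and sample data are assumptions for illustration only. Note that (\\w+) stops at the first non-word character, so only the first word of Y is returned.

```java
class ReplaceAllDemo {
    // Hypothetical helper: returns the first word after "educated at",
    // or an empty string when the line does not match.
    static String extractY(String line) {
        return line.replaceAll(".*?\t.*?\t[^\t]*educated at (\\w+).*|.*", "$1");
    }

    public static void main(String[] args) {
        // Sample lines are invented; fields are separated by tabs.
        String hit = "1\tAlan Turing\tAlan Turing is educated at Princeton University.";
        String miss = "2\tnobody\tno match here";
        System.out.println(extractY(hit));              // Princeton
        System.out.println("[" + extractY(miss) + "]"); // []
    }
}
```

If Y can span several words, the group would need a wider pattern than \\w+.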
Upvotes: 0
Reputation: 8348
Capture the found text as a group:
Pattern p1 = Pattern.compile(".* educated at (.*)"); // note the parentheses
int count = 0;
while((line = br.readLine()) != null){
    String datavalue[] = line.split("\t");
    String text = datavalue[2];
    Matcher m = p1.matcher(text);
    if(m.matches()){
        count++;
        System.out.println(m.group(1)); // group 1 holds Y
    }
}
Please see https://docs.oracle.com/javase/tutorial/essential/regex/groups.html for more information
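Since X is known in the question, a variation on the same idea is to put X into the pattern and use find() instead of matches(), so no leading .* is needed. This is only a sketch: the class name, the findY helper, and the sample lines are assumptions, and Pattern.quote is used so that any regex metacharacters in X are treated literally.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class EducatedAtDemo {
    // Hypothetical helper: returns the text after "<x> is educated at ",
    // or null when the phrase does not occur in the given text.
    static String findY(String text, String x) {
        Pattern p = Pattern.compile(Pattern.quote(x) + " is educated at (.*)");
        Matcher m = p.matcher(text);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // Invented sample data; the third tab-separated field holds the sentence.
        String[] lines = {
            "1\tperson\tAs far as we know, Alan Turing is educated at Princeton University",
            "2\tperson\tThis field mentions nobody",
            "3\ttoo short"
        };
        for (String line : lines) {
            String[] fields = line.split("\t");
            if (fields.length < 3) {
                continue; // skip lines that have no third field
            }
            String y = findY(fields[2], "Alan Turing");
            if (y != null) {
                System.out.println(y);
            }
        }
    }
}
```

The group (.*) takes everything up to the end of the field, including any trailing punctuation, so it may need trimming depending on the data.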
Upvotes: 4