Reputation: 607
I need to locate a person's name in the following string:
TI35635: 71-3463463409 wa36ued i56tle Ro356 IL
Involved Subject
Name: PETER SMITH
Address: 1 MAIN AVE
So the rule I should follow is this: the substring is whatever comes right after "Subject \n+ Name:" and before "hits \n". I must follow this rule because some words in the original string (too long to post here) might not be unique.
I tried the following:
Pattern patternName = Pattern.compile("(?:Subject.?)(\\n)(Name:.*?)\\n", Pattern.DOTALL);
Matcher matcherName = patternName.matcher(text);
matcherName.find();
Please help me correct it.
Upvotes: 1
Views: 197
Reputation: 159874
Just skip the whitespace before attempting to match the group containing the name. You can use \s, which matches not only spaces but also newline characters:
Pattern patternName =
Pattern.compile("(?:Subject.?)\\s+(Name:.*?)\\n", Pattern.DOTALL);
Group 1 contains:
Name: PETER SMITH
Read the Pattern javadoc for a full list of characters matched by \s
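As a minimal sketch of the same idea in a self-contained form: the class and method names below (SubjectNameFinder, findName) are hypothetical, and the pattern is a slight variation that captures only the text after "Name:" up to the end of that line. It also checks the result of find() before reading the group, since group(1) throws IllegalStateException when no match was found.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SubjectNameFinder {
    // "Subject", then any whitespace (including line breaks), then "Name:",
    // then the rest of that line captured as group 1.
    private static final Pattern NAME_AFTER_SUBJECT =
            Pattern.compile("Subject\\s+Name:[ \\t]*([^\\r\\n]*)");

    // Returns the name if the "Subject / Name:" sequence is present, otherwise empty.
    static Optional<String> findName(String text) {
        Matcher m = NAME_AFTER_SUBJECT.matcher(text);
        return m.find() ? Optional.of(m.group(1).trim()) : Optional.empty();
    }

    public static void main(String[] args) {
        String text = "TI35635: 71-3463463409 wa36ued i56tle Ro356 IL\n"
                + " Involved Subject\n"
                + " Name: PETER SMITH\n"
                + " Address: 1 MAIN AVE";
        System.out.println(findName(text).orElse("(not found)")); // prints "PETER SMITH"
    }
}
```

Because \s+ also matches \r, the same pattern should cope with Windows line endings.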
Upvotes: 1
Reputation: 425418
You can do it in just one line:
String name = str.replaceAll("(?sm).*Subject\\s+Name:(.*?)?$.*", "$1");
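Note that if the pattern does not match at all, replaceAll returns the input string unchanged, so check for that case if the "Subject"/"Name:" lines may be missing. The captured name keeps the whitespace that follows "Name:", so call trim() on it if needed.
Because $ in multiline mode also matches before \r\n, this works on Windows files too.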
Here's some test code:
String str = " TI35635: 71-3463463409 wa36ued i56tle Ro356 IL\n Involved Subject\n Name: PETER SMITH\n Address: 1 MAIN AVE";
String name = str.replaceAll("(?sm).*Subject\\s+Name:(.*?)?$.*", "$1");
System.out.println("Name = " + name);
Output:
Name = PETER SMITH
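One thing to keep in mind with this approach is that String.replaceAll leaves the input untouched when the pattern matches nowhere. A possible guard, sketched under the assumption that an empty string is an acceptable "not found" value (ReplaceAllNameDemo and nameOrEmpty are made-up names, and the regex is a simplified variant of the one above):

```java
class ReplaceAllNameDemo {
    // Whole-input pattern: everything up to "Subject", whitespace, "Name:",
    // then the rest of that line as group 1, then everything after it.
    private static final String REGEX =
            "(?s).*Subject\\s+Name:[ \\t]*([^\\r\\n]*).*";

    // Returns the trimmed name, or "" when the input does not contain the sequence.
    static String nameOrEmpty(String str) {
        // matches() tests the whole input, which fits a pattern wrapped in .* on both sides.
        return str.matches(REGEX) ? str.replaceAll(REGEX, "$1").trim() : "";
    }

    public static void main(String[] args) {
        String str = " TI35635: 71-3463463409 wa36ued i56tle Ro356 IL\n"
                + " Involved Subject\n Name: PETER SMITH\n Address: 1 MAIN AVE";
        System.out.println("Name = " + nameOrEmpty(str));
        System.out.println("Name = " + nameOrEmpty("no such section"));
    }
}
```

This evaluates the regex twice per call; if that matters, a precompiled Pattern with Matcher.matches() and group(1) avoids the repeat.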
Upvotes: 1
Reputation: 5755
You could represent the Regex for the name as:
([ \\t\\x0B\\f\\r]*[a-zA-Z]+)*
This represents a sequence of zero or more of the following: zero or more non-newline whitespace characters, followed by one or more letters. It should handle the names within your larger regex.
Alternatively, \s matches any whitespace character (including newlines) and \w matches any letter, digit, or underscore.
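To illustrate how that sub-pattern might sit inside a larger regex, here is a rough sketch. The surrounding "Subject\s+Name:" part and the names LetterNameMatcher and extract are assumptions for the example, not part of the answer above. Note that the letters-only class stops at the first character that is not a letter or non-newline whitespace, so names such as O'BRIEN or SMITH-JONES would be cut short.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LetterNameMatcher {
    // Group 1 is the name sub-pattern: repeated runs of optional non-newline
    // whitespace followed by letters.
    private static final Pattern PATTERN = Pattern.compile(
            "Subject\\s+Name:((?:[ \\t\\x0B\\f\\r]*[a-zA-Z]+)*)");

    // Returns the trimmed name, or null if the sequence is not present.
    static String extract(String text) {
        Matcher m = PATTERN.matcher(text);
        return m.find() ? m.group(1).trim() : null;
    }

    public static void main(String[] args) {
        String text = "Involved Subject\n Name: PETER SMITH\n Address: 1 MAIN AVE";
        System.out.println(extract(text)); // prints "PETER SMITH"
    }
}
```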
Upvotes: 1
Reputation: 77930
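Your example has 3 groups, which by my rough estimate means O(n^3), where n is the number of characters; the real cost depends on how much the engine has to backtrack.
Generally, a regex is a good fit if we want to replace multiple times.
In your case a regex is too expensive (in my view). I would use the following example: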
String str = "TI35635: 71-3463463409 wa36ued i56tle Ro356 IL\r\n" +
" Involved Subject\r\n" +
" Name: PETER SMITH\r\n" +
" Address: 1 MAIN AVE";
StringBuilder buff = new StringBuilder();
// Split on \n or \r\n explicitly; the platform line separator may not match the input.
for (String line : str.split("\\r?\\n")) {
    if (line.contains("Name: ")) {
        // Keep the label (everything before ": ") and substitute the new value.
        String temp = line.split(": ")[0];
        temp = temp + ": " + "New Name";
        buff.append(temp).append("\n");
    } else {
        buff.append(line).append("\n");
    }
}
System.out.println(buff.toString());
Output:
TI35635: 71-3463463409 wa36ued i56tle Ro356 IL
Involved Subject
Name: New Name
Address: 1 MAIN AVE
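The snippet above replaces the name. If the goal is to read it instead, as the question asks, the same line-by-line idea could be sketched as follows. This assumes the "Name:" line comes directly after the line ending in "Subject", with no blank lines in between, and LineScanNameFinder and findName are illustrative names only.

```java
import java.util.Optional;

class LineScanNameFinder {
    // Scans adjacent line pairs for "...Subject" followed by "Name: ...".
    static Optional<String> findName(String text) {
        String[] lines = text.split("\\r?\\n"); // handles both \n and \r\n
        for (int i = 0; i + 1 < lines.length; i++) {
            String current = lines[i].trim();
            String next = lines[i + 1].trim();
            if (current.endsWith("Subject") && next.startsWith("Name:")) {
                return Optional.of(next.substring("Name:".length()).trim());
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        String str = "TI35635: 71-3463463409 wa36ued i56tle Ro356 IL\r\n"
                + " Involved Subject\r\n"
                + " Name: PETER SMITH\r\n"
                + " Address: 1 MAIN AVE";
        System.out.println(findName(str).orElse("(not found)")); // prints "PETER SMITH"
    }
}
```

Requiring the "Subject" line immediately before "Name:" keeps the rule from the question, so a stray "Name:" elsewhere in the text is not picked up.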
Upvotes: 1
Reputation: 42060
You can try the regular expression:
Subject[ ]*\r?\n[ ]*(Name:.*)
e.g.:
private static final Pattern REGEX_PATTERN =
        Pattern.compile("Subject[ ]*\\r?\\n[ ]*(Name:.*)");

public static void main(String[] args) {
    String input = "TI35635: 71-3463463409 wa36ued i56tle Ro356 IL\n Involved Subject\n Name: PETER SMITH\n Address: 1 MAIN AVE";
    Matcher matcher = REGEX_PATTERN.matcher(input);
    // find() advances to the next match on each call.
    while (matcher.find()) {
        // Group 1 holds the "Name: ..." part without the preceding "Subject" line.
        System.out.println(matcher.group(1));
    }
}
Output:
Name: PETER SMITH
Upvotes: 1