Reputation: 27
In Java, I want to create a function that extracts parts of a string using regular expressions:
public HashMap<Integer,String> extract(String sentence, String expression){
}
// I need to pass a sentence like this, for example:
HashMap<Integer,String> parts = extract("hello Jhon how are you", "(hello|hi) @1 how are @2");
// The expression validates that the sentence starts with "hello" or "hi",
// followed by a word or group of words, then the words "how are", and then more words.
// And I want to get this:
parts.get(1) --> "Jhon"
parts.get(2) --> "you"
// But the function should return null if I give it this:
extract("any other words","hello @1 how are @2");
I was doing it without regular expressions, but the code became rather long. I'm not sure whether regular expressions would make it better or faster, or how I would do this with them.
Upvotes: 0
Views: 140
Reputation: 11075
Thanks to @ajb for the comment. I've modified my answer to meet Omar's requirement. It's more complicated than I thought, lol.
I assume Omar wants to use the expression he provided to capture specific words. He uses @1, @2, ... @n to mark what he wants to capture, and the integer is also the key used to retrieve the captured text from the map.
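To make the placeholder idea concrete, here is a minimal sketch of how the @n markers could be read out of an expression. The class name PlaceholderDemo and the helper placeholderKeys are made up for illustration. The sketch assumes every placeholder is an @ followed by at least one digit.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PlaceholderDemo {

    // Hypothetical helper: returns the numbers of all @n placeholders,
    // in the order in which they appear in the expression.
    public static List<Integer> placeholderKeys(String expression) {
        List<Integer> keys = new ArrayList<>();
        Matcher m = Pattern.compile("@(\\d+)").matcher(expression);
        while (m.find()) {
            keys.add(Integer.valueOf(m.group(1)));
        }
        return keys;
    }

    public static void main(String[] args) {
        // prints [1, 2]
        System.out.println(placeholderKeys("(hello|hi) @1 how are @2"));
        // prints [1, 1] because @1 appears in both alternatives
        System.out.println(placeholderKeys("whats the (number of @1|@1s number)"));
    }
}
```

Keeping duplicates in the list means the i-th entry lines up with the i-th capture group once each placeholder is turned into a group.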
Edit: the OP wants to be able to put @n inside parentheses, so I preprocess the expression to replace "(" with "(?:". That way the group still works for grouping and alternation, but it no longer captures.
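A small sketch of why this preprocessing matters: a plain group takes up a capture index, while a non-capturing one does not. The class NonCapturingDemo and the helper toNonCapturing are hypothetical names used only for this illustration. The blanket replacement would also rewrite an escaped \( or a group that is already written as (?:, so it assumes the expression contains only plain groups.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class NonCapturingDemo {

    // Hypothetical helper: rewrites every "(" as "(?:" so the group no longer captures.
    public static String toNonCapturing(String expression) {
        return expression.replace("(", "(?:");
    }

    public static void main(String[] args) {
        String sentence = "hi Jhon";

        // Capturing group: (hello|hi) occupies group 1, the name lands in group 2.
        Matcher capturing = Pattern.compile("(hello|hi) (\\w+)").matcher(sentence);
        if (capturing.matches()) {
            // prints "2 hi"
            System.out.println(capturing.groupCount() + " " + capturing.group(1));
        }

        // Non-capturing group: the alternation still works, and the name is group 1.
        String prefix = toNonCapturing("(hello|hi)"); // "(?:hello|hi)"
        Matcher nonCapturing = Pattern.compile(prefix + " (\\w+)").matcher(sentence);
        if (nonCapturing.matches()) {
            // prints "1 Jhon"
            System.out.println(nonCapturing.groupCount() + " " + nonCapturing.group(1));
        }
    }
}
```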
import java.util.ArrayList;
import java.util.HashMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Test {

    public static void main(String[] args) {
        Test test = new Test();

        String sentence1 = "whats the number of apple";
        String expression1 = "whats the (number of @1|@1s number)";
        HashMap<Integer, String> map1 = test.extract(sentence1, expression1);
        System.out.println(map1);

        String sentence2 = "whats the bananas number";
        HashMap<Integer, String> map2 = test.extract(sentence2, expression1);
        System.out.println(map2);

        String sentence3 = "hello Jhon how are you";
        String expression3 = "(hello|hi) @1 how are @2";
        HashMap<Integer, String> map3 = test.extract(sentence3, expression3);
        System.out.println(map3);
    }

    public HashMap<Integer, String> extract(String sentence, String expression) {
        // Turn the expression's own parentheses into non-capturing groups,
        // so that only the @n placeholders produce capture groups.
        expression = expression.replaceAll("\\(", "\\(?:");

        // Collect the placeholder numbers in order of appearance.
        // The i-th placeholder becomes capture group i, so duplicates are kept.
        ArrayList<Integer> keys = new ArrayList<Integer>();
        Matcher placeholderMatcher = Pattern.compile("@(\\d+)").matcher(expression);
        while (placeholderMatcher.find()) {
            keys.add(Integer.valueOf(placeholderMatcher.group(1)));
        }

        // Replace each placeholder with a group that captures a single word.
        String regex = expression.replaceAll("@\\d+", "(\\\\w+)");

        // The whole sentence has to fit the expression; otherwise return null,
        // as the question asks.
        Matcher matcher = Pattern.compile(regex).matcher(sentence);
        if (!matcher.matches()) {
            return null;
        }

        // Groups in alternatives that did not take part in the match are null; skip them.
        HashMap<Integer, String> map = new HashMap<Integer, String>();
        for (int i = 1; i <= matcher.groupCount(); i++) {
            if (matcher.group(i) != null) {
                map.put(keys.get(i - 1), matcher.group(i));
            }
        }
        return map;
    }
}
The output is:
{1=apple}
{1=banana}
{1=Jhon, 2=you}
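The question also asks for placeholders that can cover a group of words. A rough sketch of that variant follows, under these assumptions: each @n becomes a reluctant (.+?) group, the whole sentence must match, and a sentence that doesn't fit yields null. The class name MultiWordExtract is hypothetical, and the sketch is only meant to show the idea.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MultiWordExtract {

    public static Map<Integer, String> extract(String sentence, String expression) {
        // Make the expression's own groups non-capturing.
        String prepared = expression.replace("(", "(?:");

        // Placeholder numbers in order of appearance; entry i belongs to group i + 1.
        List<Integer> keys = new ArrayList<>();
        Matcher placeholders = Pattern.compile("@(\\d+)").matcher(prepared);
        while (placeholders.find()) {
            keys.add(Integer.valueOf(placeholders.group(1)));
        }

        // Assumption: a reluctant (.+?) lets a placeholder span several words.
        String regex = prepared.replaceAll("@\\d+", "(.+?)");

        Matcher m = Pattern.compile(regex).matcher(sentence);
        if (!m.matches()) {
            return null; // the sentence does not fit the expression
        }

        Map<Integer, String> parts = new HashMap<>();
        for (int i = 1; i <= m.groupCount(); i++) {
            if (m.group(i) != null) { // skip groups from unused alternatives
                parts.put(keys.get(i - 1), m.group(i));
            }
        }
        return parts;
    }

    public static void main(String[] args) {
        // prints {1=Jhon Smith, 2=you today}
        System.out.println(extract("hello Jhon Smith how are you today", "(hello|hi) @1 how are @2"));
        // prints null
        System.out.println(extract("any other words", "hello @1 how are @2"));
    }
}
```

Because matches() has to consume the whole sentence, the last placeholder stretches to the end of the input even though its quantifier is reluctant.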
Upvotes: 1