Reputation: 13
have excel data and formulars like:
and so on.
I want to get all the cell references out of the strings. Like in this Example: "E10+E11+SUM(E10;E14:E17)*E18-IF(E19<1,E20, E21)"
I want the output substring like "E10 E11 ... E21" or separated with ",".
I tested a lot with regex but can't get a valid result. I am using this code:
String formulaString = "E10+E11+SUM(E10;E14:E17)*E18-IF(E19<1,E20, E21)";
Pattern pattern = Pattern.compile("REGEX");
Matcher matcher = pattern.matcher(formulaString);
I have tried the following regex:
REGEX: (\w+|)?\$?(?:\bXF[A-D]|X[A-E][A-Z]|[A-W][A-Z]{2}|[A-Z]{2}|[A-Z])\$?(?:104857[0-6]|10485[0-6]\d|1048[0-4]\d{2}|104[0-7]\d{3}|10[0-3]\d{4}|[1-9]\d{1,5}|[1-9])d?\b(:\s?\$?(?:\bXF[A-D]|X[A-E][A-Z]|[A-W][A-Z]{2}|[A-Z]{2}|[A-Z])\$?(?:104857[0-6]|10485[0-6]\d|1048[0-4]\d{2}|104[0-7]\d{3}|10[0-3]\d{4}|[1-9]\d{1,5}|[1-9])d?\b)?
REGEX: \$?\b([A-Z]|[A-H][A-Z]|I[A-V])\$?([1-9]\d{0,3}|[1-5]\d{4}|6[0-4]\d{3}|65[0-4]\d{2}|655[0-2]\d|6553[0-6])\b([:\s]\$?\b([A-Z]|[A-H][A-Z]|I[A-V])\$?([1-9]\d{0,3}|[1-5]\d{4}|6[0-4]\d{3}|65[0-4]\d{2}|655[0-2]\d|6553[0-6])\b)?
For some of the formular they are working, but not for all.
I hope anyone can help me or give me a tip :)
Upvotes: 0
Views: 1245
Reputation: 2189
public static void main(String[]args){
String formula = "E10+E11+SUM(E10;E14:E17)*E18-IF(E19<1,E20, E21)";
String output="";
for(String c: formula.split("[^A-z0-9]+")){
if(isCell(c)){
output+=c+" ";
}
}
}
private static boolean isCell(String current){
boolean hasLetter = false;
boolean hasNumber = false;
for(int i=0; i<current.length() && (!hasLetter || !hasNumber); i++){
if(current.charAt(i)>=65 && current.charAt(i)<=90){
hasLetter=true;
}
else if(current.charAt(i)>='0' && current.charAt(i)<='9'){
hasNumber=true;
}
}
return hasLetter && hasNumber;
}
Upvotes: 0
Reputation: 353
On the first thread you linked to, the regex you put below the link is nowhere to be found on the page. Were you really using a regex from that page? The regex that was suggested was:
(\w+|)?\$?(?:\bXF[A-D]|X[A-E][A-Z]|[A-W][A-Z]{2}|[A-Z]{2}|[A-Z])\$?(?:104857[0-6]|10485[0-6]\d|1048[0-4]\d{2}|104[0-7]\d{3}|10[0-3]\d{4}|[1-9]\d{1,5}|[1-9])d?\b(:\s?\$?(?:\bXF[A-D]|X[A-E][A-Z]|[A-W][A-Z]{2}|[A-Z]{2}|[A-Z])\$?(?:104857[0-6]|10485[0-6]\d|1048[0-4]\d{2}|104[0-7]\d{3}|10[0-3]\d{4}|[1-9]\d{1,5}|[1-9])d?\b)?
Try that.
Also, it would help to know what specific strings don't get matched correctly, since you mentioned that some of them work.
Upvotes: 0