Reputation: 323
I am trying to extract the text between a semicolon (;) and WORD. I am using the code below, but it fails to extract "TVS A3003" from every string.
Matcher matcher = Pattern.compile("(?<=;).*?(?=WORD)").matcher(string);
Sample strings:
1. (XYZTRR: KTTT 4.0.1; TVS A3003 WORD/LLLLL ; pj ;)
2. (XcdcdRR; dTff 5.4.1; TVS A3003 WORD/UJH;KKKHH fpp)
3. LLLhf22; 776332 8.7.1; TVS A3003 WORD/UHHGFVV phhp
4. (;LLLhf22; 776332 8.7.1; TVS A3003 WORD/UHHGFVV phhp ;)
I want to extract TVS A3003 in all of these cases.
Upvotes: 1
Views: 72
Reputation: 627101
You need to find a ; and then match any 0+ chars other than ; as few as possible up to the first occurrence of WORD. You may do that using
;([^;]*?)WORD
See the regex demo. Note that the leading/trailing whitespace can be easily trimmed off with .trim() after a match is found.
See the Java demo below:
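import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

List<String> strs = Arrays.asList("(XYZTRR: KTTT 4.0.1; TVS A3003 WORD/LLLLL ; pj ;)",
        "(XcdcdRR: dTff 5.4.1; TVS A3003 WORD/UJHKKKHH fpp)",
        "(LLLhf22; 776332 8.7.1; TVS A3003 WORD/UHHGFVV phhp) );");
// [^;]*? cannot cross a ';', so the match starts at the last ';' before WORD
Pattern pattern = Pattern.compile(";([^;]*?)WORD");
for (String s : strs) {
    Matcher matcher = pattern.matcher(s);
    if (matcher.find()) {
        // Group 1 holds the text between ';' and WORD; trim the surrounding spaces
        System.out.println(matcher.group(1).trim());
    }
}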
Output:
TVS A3003
TVS A3003
TVS A3003
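To illustrate why the pattern from the question falls short, here is a small sketch that runs both patterns on the second sample string from the question. The class and method names (LazyVsNegatedClass, withLazyDot, withNegatedClass) are made up for this example. The lazy .*? starts right after the first ; it finds and is allowed to run across later semicolons, whereas [^;]*? is not.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LazyVsNegatedClass {
    // Pattern from the question: a lazy dot after a lookbehind for ';'
    static final Pattern LAZY = Pattern.compile("(?<=;).*?(?=WORD)");
    // Pattern from this answer: the negated class cannot span a ';'
    static final Pattern NEGATED = Pattern.compile(";([^;]*?)WORD");

    // Trimmed match of the question's pattern, or null if nothing matches
    static String withLazyDot(String s) {
        Matcher m = LAZY.matcher(s);
        return m.find() ? m.group().trim() : null;
    }

    // Trimmed group 1 of the answer's pattern, or null if nothing matches
    static String withNegatedClass(String s) {
        Matcher m = NEGATED.matcher(s);
        return m.find() ? m.group(1).trim() : null;
    }

    public static void main(String[] args) {
        String s = "(XcdcdRR; dTff 5.4.1; TVS A3003 WORD/UJH;KKKHH fpp)";
        System.out.println(withLazyDot(s));      // dTff 5.4.1; TVS A3003
        System.out.println(withNegatedClass(s)); // TVS A3003
    }
}
```

The question's pattern only gives the wanted result when the first ; in the string happens to be the one directly before the target text, as in the first sample.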
Upvotes: 1
Reputation: 2028
The regex is (?<=KTTT 4\.0\.1; )(.*)(?= WORD/U)
Matcher matcher = Pattern.compile("(?<=KTTT 4\\.0\\.1; )(.*)(?= WORD/U)").matcher(string);
if(matcher.find()){
System.out.println(matcher.group());
}
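The pattern above is tied to the literal text KTTT 4.0.1; and WORD/U, so it does not carry over to the other sample strings. A possible lookaround variant that avoids the hardcoded text is sketched below; it is an alternative, not what this answer proposes, and the class and method names (LookaroundExtract, extract) are hypothetical. It keeps the lookbehind for ; and the lookahead for WORD, but replaces .* with [^;]*? so the match cannot contain a semicolon.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LookaroundExtract {
    // Lookbehind for ';', then as few non-';' chars as possible, then lookahead for WORD
    static final Pattern PATTERN = Pattern.compile("(?<=;)[^;]*?(?=WORD)");

    // Returns the trimmed text between the last ';' before WORD and WORD, or null if no match
    static String extract(String s) {
        Matcher m = PATTERN.matcher(s);
        return m.find() ? m.group().trim() : null;
    }

    public static void main(String[] args) {
        String[] samples = {
            "(XYZTRR: KTTT 4.0.1; TVS A3003 WORD/LLLLL ; pj ;)",
            "(XcdcdRR; dTff 5.4.1; TVS A3003 WORD/UJH;KKKHH fpp)",
            "LLLhf22; 776332 8.7.1; TVS A3003 WORD/UHHGFVV phhp",
            "(;LLLhf22; 776332 8.7.1; TVS A3003 WORD/UHHGFVV phhp ;)"
        };
        for (String s : samples) {
            System.out.println(extract(s)); // TVS A3003 for each sample
        }
    }
}
```

Because the lookarounds consume nothing, matcher.group() already excludes the ; and WORD, so no capturing group is needed; only the surrounding spaces have to be trimmed.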
Upvotes: 0