Mohit Agrawal

Reputation: 323

Extract a substring between ; and WORD in Java using regex

I am trying to extract the text between a semicolon (;) and WORD. I am using the code below, but it fails to extract "TVS A3003" reliably.

Matcher matcher = Pattern.compile("(?<=;).*?(?=WORD)").matcher(string);

Four sample strings:

1. (XYZTRR: KTTT 4.0.1; TVS A3003 WORD/LLLLL ; pj ;) 

2. (XcdcdRR; dTff 5.4.1; TVS A3003 WORD/UJH;KKKHH fpp) 

3. LLLhf22; 776332 8.7.1; TVS A3003 WORD/UHHGFVV phhp

4. (;LLLhf22; 776332 8.7.1; TVS A3003 WORD/UHHGFVV phhp ;)

I want to extract "TVS A3003" in all of these cases.

Upvotes: 1

Views: 72

Answers (2)

Wiktor Stribiżew

Reputation: 627101

You need to find a ; and then match zero or more characters other than ;, as few as possible, up to the first occurrence of WORD. You can do that using

;([^;]*?)WORD

See the regex demo. Note that the leading and trailing whitespace is easy to trim off with .trim() once a match is found.

See the Java demo below:

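import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

List<String> strs = Arrays.asList("(XYZTRR: KTTT 4.0.1; TVS A3003 WORD/LLLLL ; pj ;)",
        "(XcdcdRR: dTff 5.4.1; TVS A3003 WORD/UJHKKKHH fpp)",
        "(LLLhf22; 776332 8.7.1; TVS A3003 WORD/UHHGFVV phhp) );");
// ';' followed by the shortest run of non-';' chars (group 1) that ends at WORD
Pattern pattern = Pattern.compile(";([^;]*?)WORD");
for (String s : strs) {
    Matcher matcher = pattern.matcher(s);
    if (matcher.find()) {
        // Group 1 still carries the surrounding spaces, so trim it
        System.out.println(matcher.group(1).trim());
    }
}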

Output:

TVS A3003
TVS A3003
TVS A3003
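
To make the difference from the pattern in the question easier to see, here is a minimal sketch that runs both patterns over the four sample strings from the question. The class name ExtractDemo and the helper firstMatch are made up for illustration. The point it tries to show is that .*? is allowed to step over a ;, so the match starts after the first semicolon in the input, while [^;]*? cannot cross one.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; the names are illustrative only.
public class ExtractDemo {
    // Pattern from the question: the lazy dot may run across ';'
    static final Pattern LAZY_DOT = Pattern.compile("(?<=;).*?(?=WORD)");
    // Pattern from this answer: the negated class cannot cross ';'
    static final Pattern NEGATED = Pattern.compile(";([^;]*?)WORD");

    // Returns the trimmed text of the given group for the first match, or null
    static String firstMatch(Pattern p, int group, String input) {
        Matcher m = p.matcher(input);
        return m.find() ? m.group(group).trim() : null;
    }

    public static void main(String[] args) {
        String[] samples = {
            "(XYZTRR: KTTT 4.0.1; TVS A3003 WORD/LLLLL ; pj ;)",
            "(XcdcdRR; dTff 5.4.1; TVS A3003 WORD/UJH;KKKHH fpp)",
            "LLLhf22; 776332 8.7.1; TVS A3003 WORD/UHHGFVV phhp",
            "(;LLLhf22; 776332 8.7.1; TVS A3003 WORD/UHHGFVV phhp ;)"
        };
        for (String s : samples) {
            // Group 0 is the whole match for the lookaround pattern
            System.out.println(firstMatch(LAZY_DOT, 0, s)
                    + " | " + firstMatch(NEGATED, 1, s));
        }
    }
}
```

With these inputs the left column is only correct for the first sample (for the second it is "dTff 5.4.1; TVS A3003"), while the right column is "TVS A3003" every time.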

Upvotes: 1

sriam980980

Reputation: 2028

The regex is (?<=KTTT 4\.0\.1; )(.*)(?= WORD/U)

Matcher matcher = Pattern.compile("(?<=KTTT 4\\.0\\.1; )(.*)(?= WORD/U)").matcher(string);

if (matcher.find()) {
    System.out.println(matcher.group());
}
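
The lookarounds above are tied to literal text (KTTT 4.0.1 and WORD/U), so they only apply to inputs that contain exactly those fragments. A possible lookaround-only variant that is not tied to one input is sketched below. It assumes the goal stated in the question (text between a ; and WORD, with no ; in between); the class name LookaroundExtract and the method extract are made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; the names are illustrative only.
public class LookaroundExtract {
    // Assumed generalization: ';' right before, no ';' inside, WORD right after
    static final Pattern BETWEEN = Pattern.compile("(?<=;)[^;]*?(?=WORD)");

    // Returns the trimmed whole match, or null when nothing matches
    static String extract(String input) {
        Matcher m = BETWEEN.matcher(input);
        return m.find() ? m.group().trim() : null;
    }

    public static void main(String[] args) {
        // prints TVS A3003
        System.out.println(extract("(XcdcdRR; dTff 5.4.1; TVS A3003 WORD/UJH;KKKHH fpp)"));
        // prints null, since there is no ';' in the input
        System.out.println(extract("no delimiter here"));
    }
}
```

Because both delimiters sit in lookarounds, matcher.group() returns only the text in between and no capture group is needed.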

Upvotes: 0
