Dominik

Reputation: 4768

Regex string modifications

I have the following String and want to extract the value MBRB1045T4G from it with a regular expression in Java. How would I achieve that?

String:

<p class="ref">
<b>Mfr Part#:</b>
MBRB1045T4G<br>


<b>Technologie:</b>&nbsp;
    Tab Mount<br>



<b>Bauform:</b>&nbsp;
    D2PAK-3<br>



<b>Verpackungsart:</b>&nbsp;
    REEL<br>



<b>Standard Verpackungseinheit:</b>&nbsp;
    800<br>

Upvotes: 2

Views: 92

Answers (1)

alexg

Reputation: 3045

As Wrikken correctly says, regex can't parse HTML correctly in the general case. However, it seems you're looking at an actual website and want to scrape some of its contents. In that case, assuming the whitespace and formatting of the HTML don't change, you can use a regex like this:

 Mfr Part#:</b>\s*([^<]+)<br>

And collect the first capture group like so (where string is your HTML):

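import java.util.regex.Matcher;
import java.util.regex.Pattern;

// The backslash must be doubled inside a Java string literal; \\s* skips the line break after </b>.
Pattern pt = Pattern.compile("Mfr Part#:</b>\\s*([^<]+)<br>");
Matcher m = pt.matcher(string);
// find() searches within the input; matches() would only succeed if the whole string matched.
if (m.find())
    System.out.println(m.group(1).trim());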
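
If you need this in more than one place, the regex approach can be wrapped in a small helper. This is only a sketch: the class name PartNumberExtractor, the method extractPartNumber, and the shortened HTML literal in main are made up for illustration, and it still assumes the markup around "Mfr Part#:" stays as shown in the question.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of any library.
final class PartNumberExtractor {
    // Compiled once and reused. \\s* tolerates the line break between </b> and the value;
    // [^<]+ captures everything up to the next tag.
    private static final Pattern PART =
            Pattern.compile("Mfr Part#:</b>\\s*([^<]+)<br>");

    // Returns the part number, or an empty Optional if the label is not present.
    static Optional<String> extractPartNumber(String html) {
        Matcher m = PART.matcher(html);
        // find() looks for the pattern anywhere in the input.
        return m.find() ? Optional.of(m.group(1).trim()) : Optional.empty();
    }

    public static void main(String[] args) {
        // Shortened sample modeled on the HTML from the question (assumed layout).
        String html = "<p class=\"ref\">\n"
                + "<b>Mfr Part#:</b>\n"
                + "MBRB1045T4G<br>\n"
                + "\n"
                + "<b>Technologie:</b>&nbsp;\n"
                + "    Tab Mount<br>\n";
        System.out.println(extractPartNumber(html).orElse("not found"));
    }
}
```

Returning an Optional makes the "label not found" case explicit instead of silently printing nothing. If the site's markup changes often, an HTML parser is likely to be more robust than a regex.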

Upvotes: 3
