Davide Bonuzzi

Reputation: 37

Getting a substring with a regex in Java

I have this kind of line: <a href="/verona/4mktg-for-marketing.8526695" title="4MKTG FOR MARKETING SRL">4MKTG FOR MARKETING <strong>SRL</strong> </a>

I need the value of the title attribute. I split the string on 'title="' and then checked whether the result matches this regex: "[0-9A-Z /.]{3,}". But it doesn't work.

The title contains only digits, capital letters, spaces and dots.

Thank you

Davide

Upvotes: 1

Views: 79

Answers (3)

Bram Vanroy

Reputation: 28554

If you need to do it with a regex (using java.util.regex; see this answer concerning Perl-like regexes in Java):

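// Java string literals use double quotes, so inner quotes and backslashes must be escaped
String str = "<a href=\"/verona/4mktg-for-marketing.8526695\" title=\"4MKTG FOR MARKETING SRL\">4MKTG FOR MARKETING <strong>SRL</strong> </a>";
// replace the whole line with the captured title value
str = str.replaceAll(".* title=\"([\\s\\.A-Z0-9]+)\".*", "$1");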

Upvotes: 2

hwnd

Reputation: 70750

Instead of using a regular expression, you should use JSoup when dealing with HTML.

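import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

Document doc = Jsoup.parse(html);
// select() returns an Elements collection, not a single Element
Elements links = doc.select("a");
for (Element l : links) {
    // grab the title attribute value
    System.out.println(l.attr("title"));
}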

Upvotes: 3

abc123

Reputation: 18853

title="([\dA-Z\. ]+)"
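
A minimal sketch of how this pattern could be applied with java.util.regex. The class name TitleExtractor and the method extractTitle are hypothetical names chosen for illustration. It uses Matcher.find() rather than String.matches(), because matches() only succeeds when the entire input matches the pattern; that may be why matching the piece left after the split failed, since that piece still contains the rest of the tag.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original answer.
public class TitleExtractor {
    // Same pattern as above: digits, capital letters, dots and spaces inside title="...".
    private static final Pattern TITLE = Pattern.compile("title=\"([\\dA-Z. ]+)\"");

    // Returns the captured title value, or null if no matching title attribute is found.
    public static String extractTitle(String line) {
        Matcher m = TITLE.matcher(line);
        // find() searches for the pattern anywhere in the line
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String line = "<a href=\"/verona/4mktg-for-marketing.8526695\" "
                + "title=\"4MKTG FOR MARKETING SRL\">4MKTG FOR MARKETING <strong>SRL</strong> </a>";
        System.out.println(extractTitle(line)); // prints 4MKTG FOR MARKETING SRL
    }
}
```

Returning null for a line without a matching title is just one possible convention; an Optional<String> would work equally well.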


Upvotes: 2
