Extracting data with regex

Question

I have a solution that almost works, but the regex split returns an empty string ("") in addition to the two pieces I need.

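String Result = "Securities regulation in the United States - Securities regulation in the United States is the field of U.S. law that covers transactions and other dealings with securities.";

// Split on anything that looks like a tag: "<", any run of non-">" characters, then ">".
String[] Arr = Result.split("<[^>]*>");
for (String elem : Arr) {
    // println rather than printf: elem is data, not a format string.
    System.out.println(elem);
}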

The result is:

Arr[0] = ""
Arr[1] = Securities regulation in the United States
Arr[2] = Securities regulation in the United States is the field of U.S. law that covers transactions and other dealings with securities.

The Arr[1] and Arr[2] pieces are fine; I just can't get rid of Arr[0].

Federico Piazza · Accepted Answer

Instead of splitting on the tags, you can invert the approach and capture the text between them with a regex like this:

(?s)(?:^|>)(.*?)(?:<|$)

Code:

import java.util.regex.Matcher;
import java.util.regex.Pattern;

String line = "ahref=https://blabla.com/Securities_regulation_in_the_United_States>Securities regulation in the United States - Securities regulation in the United States is the field of U.S. law that covers transactions and other dealings with securities.";

// (?s) lets "." match line breaks; group 1 is the text between a ">" (or the start) and the next "<" (or the end).
Pattern pattern = Pattern.compile("(?s)(?:^|>)(.*?)(?:<|$)");
Matcher matcher = pattern.matcher(line);
while (matcher.find()) {
    System.out.println("group 1: " + matcher.group(1));
}
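
For anyone who wants something self-contained to experiment with, here is a minimal sketch of both approaches side by side. Some background first: String.split is documented to include an empty leading substring when a positive-width match occurs at the very start of the input, which is where the "" in Arr[0] comes from. The capturing pattern above can also produce an empty group 1 when the input begins with "<", so this sketch skips empty pieces in both cases. The class name TagTextExtractor, the method names, and the sample string are made up for illustration and are not the asker's real data.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

public class TagTextExtractor {
    // Same delimiter pattern as in the question: one tag-like run "<...>".
    private static final Pattern TAG = Pattern.compile("<[^>]*>");
    // Same capturing pattern as in the answer above.
    private static final Pattern BETWEEN_TAGS =
            Pattern.compile("(?s)(?:^|>)(.*?)(?:<|$)");

    // Split on tags, then drop empty pieces such as the leading ""
    // that appears when the input starts with a tag.
    static List<String> splitOnTags(String input) {
        return Arrays.stream(TAG.split(input))
                .filter(part -> !part.isEmpty())
                .collect(Collectors.toList());
    }

    // Collect group 1 of every match, skipping empty captures.
    static List<String> captureBetweenTags(String input) {
        List<String> parts = new ArrayList<>();
        Matcher matcher = BETWEEN_TAGS.matcher(input);
        while (matcher.find()) {
            String text = matcher.group(1);
            if (!text.isEmpty()) {
                parts.add(text);
            }
        }
        return parts;
    }

    public static void main(String[] args) {
        // Hypothetical input that starts with a tag, like the case in the question.
        String sample = "<a href=x>Title</a> - Description";
        System.out.println(splitOnTags(sample));
        System.out.println(captureBetweenTags(sample));
    }
}
```

For this sample both methods return the same two pieces, "Title" and " - Description". Neither is a real HTML parser, so treat this as a quick fix for simple, well-formed snippets only.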
