Reputation: 37
I have a String containing HTML without any kind of closing tag (</.*?>) and without any newline (\n):
<tr><td align=center>01/01/2001<td align=center>500,01<td align=center>0,99<td align=center>15
This pattern repeats indefinitely, and each row may have one or more td's for values.
At the moment I am using String.split("<tr><td align=center>") to separate the String, and then one regex to find the date and another to find the value I want.
Something like this:
String[] stringArray = text.split("<tr><td align=center>");
String[] array1 = Arrays.copyOfRange(stringArray, stringArray.length - /*0<n<21*/,
        stringArray.length);
for (int i = 0; i < array1.length; i++) {
    System.out.println(array1[i]);
    // getting date
    m1 = Pattern.compile("(\\d{2}\\/\\d{2}\\/\\d{4})").matcher(array1[i]);
    m1.find();
    System.out.println(m1.group(1));
    // getting values
    m1 = Pattern.compile("<td align=center>(\\d+,*\\d*)").matcher(array1[i]);
    while (m1.find()) {
        System.out.println(m1.group(/*0<n*/));
    }
}
I want a way to get a String that is equivalent to array1 (the last n positions of a string), but using regex.
I know I can use a bigger regex with $ at the end to get the last <tr>, but I want to get all 19 <tr> before it too.
I don't know if I am being clear here. Let me know if I can provide more details.
PS: yes, the values are written with ',' instead of '.'... I use a replace later on.
Upvotes: 2
Views: 225
Reputation: 88707
With Java regular expressions you can't collect an arbitrary number of matches into a single group, so unless you know the exact/maximum number of groups you'd have to apply the regex multiple times and collect the matches yourself.
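To illustrate "apply the regex multiple times and collect the matches yourself", here is a minimal sketch. The class name LastRows, the method lastRows and the row pattern are my own assumptions, not something from the question; the pattern relies on the input having no newlines, as described there.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: collects every <tr> row with repeated find() calls,
// then keeps only the last n of them.
class LastRows {
    // One row: everything after "<tr>" up to the next "<tr>" or the end of input.
    private static final Pattern ROW = Pattern.compile("<tr>(.*?)(?=<tr>|$)");

    // Returns the last n rows (or all rows if there are fewer than n).
    static List<String> lastRows(String text, int n) {
        List<String> rows = new ArrayList<>();
        Matcher m = ROW.matcher(text);
        while (m.find()) {
            rows.add(m.group(1));
        }
        int from = Math.max(0, rows.size() - n);
        return new ArrayList<>(rows.subList(from, rows.size()));
    }

    public static void main(String[] args) {
        String text = "<tr><td align=center>01/01/2001<td align=center>500,01"
                + "<tr><td align=center>02/01/2001<td align=center>0,99"
                + "<tr><td align=center>03/01/2001<td align=center>15";
        // Prints the last two rows, one per line.
        for (String row : lastRows(text, 2)) {
            System.out.println(row);
        }
    }
}
```

Each returned row still starts with <td align=center>, so the date and value expressions would then be applied to each row separately.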
Btw, you should check whether m1.find() returns true before calling m1.group(1); otherwise you'd get an IllegalStateException if the expression doesn't match.
As another note, I'd compile the date pattern outside the loop, probably in some initialization code.
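A rough sketch of both notes combined, assuming each row has the shape produced by the split in the question (date first, then the value cells). The class and method names (RowParser, parseDate, parseValues) are made up for illustration:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: patterns are compiled once, and group(1) is only
// read after find() has returned true.
class RowParser {
    private static final Pattern DATE = Pattern.compile("(\\d{2}/\\d{2}/\\d{4})");
    private static final Pattern VALUE = Pattern.compile("<td align=center>(\\d+,*\\d*)");

    // Returns the first date in the row, or an empty Optional if there is none.
    static Optional<String> parseDate(String row) {
        Matcher m = DATE.matcher(row);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    // Returns every value cell in the row, in order of appearance.
    static List<String> parseValues(String row) {
        List<String> values = new ArrayList<>();
        Matcher m = VALUE.matcher(row);
        while (m.find()) {
            values.add(m.group(1));
        }
        return values;
    }

    public static void main(String[] args) {
        String row = "01/01/2001<td align=center>500,01<td align=center>0,99<td align=center>15";
        System.out.println(parseDate(row).orElse("no date found"));
        System.out.println(parseValues(row));
    }
}
```

A row without a date then yields an empty Optional instead of throwing an IllegalStateException.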
Upvotes: 1