Reputation: 14849

RegEx: Extract Number out of Source Code

I am no RegEx expert. I need to extract a certain number out of an HTML table.
An example:

<td>13</td><td>
  </td><td align="right">29.543</td>
  <td align="right">1.777</td>
  <td align="right">2.588</td>
</tr><tr><td><a href="player.php?p=84668" >Caterdamus</a></td>
  <td>7</td><td>
  Meister</td><td align="right">9.874</td>
  <td align="right">1.716</td>
  <td align="right">5.791</td>
</tr><tr><td><a href="player.php?p=87216" >grappa</a></td>
  <td>2</td><td>
  </td><td align="right">1.044</td>
  <td align="right">21</td>
  <td align="right">146</td>
</tr></table>

The pattern looks like this:

<td>13</td><td>
<td>7</td><td>
<td>2</td><td>

How do I extract the numbers out of the text and store them in a variable? Hint: the numbers are positive integers.

Thanks:)

Upvotes: 1

Answers (3)

Thomas Owens

Reputation: 116187

I wouldn't use regular expressions to parse HTML or XML. Instead, I would load the document into an HTML DOM parser; several open source ones are available. I can't vouch for any of them, since I've never worked with anything other than XML in Java.
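For comparison, if pulling in a parser is not an option and the markup is known to be as regular as the sample in the question, the pattern from the question can be matched directly. The following is only a minimal fallback sketch, not a substitute for a real parser: the class name `TdNumberExtractor` and the method `extract` are hypothetical, and it assumes the wanted cell is always a bare `<td>` containing only digits, immediately followed by another bare `<td>`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; assumes the markup is exactly as regular as the sample.
final class TdNumberExtractor {

    // A bare <td>, one or more digits, </td>, optional whitespace, then a bare <td>.
    // Cells such as <td align="right">21</td> do not match because of the attribute.
    private static final Pattern CELL = Pattern.compile("<td>(\\d+)</td>\\s*<td>");

    static List<Integer> extract(String html) {
        List<Integer> numbers = new ArrayList<>();
        Matcher m = CELL.matcher(html);
        while (m.find()) {
            // group(1) holds only the digits; parseInt throws if the value exceeds int range.
            numbers.add(Integer.parseInt(m.group(1)));
        }
        return numbers;
    }

    public static void main(String[] args) {
        String html = "<td>13</td><td>\n  </td><td align=\"right\">29.543</td>\n"
                + "</tr><tr><td><a href=\"player.php?p=84668\" >Caterdamus</a></td>\n"
                + "  <td>7</td><td>\n  Meister</td><td align=\"right\">9.874</td>\n";
        System.out.println(extract(html)); // prints [13, 7]
    }
}
```

Any change to the markup (an added attribute, different nesting, a reordered column) silently breaks a pattern like this, which is the main reason to prefer a parser.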

Upvotes: 8