thewhitetie

Reputation: 313

How to make a regex expression for this content?

I am scraping a website whose markup looks like the snippet below, and I want to extract 4 / 5 and 3 / 10. That is, I want to match a number, a space, a slash, 3 spaces, and another number.

I tried the regex ^[0-9]+(\/[0-9]+)" *"*$ but it did not work.

<td>Monday</td>
<td class="text-center text-danger font-weight-bold">4 /  5</td>
</td>
<td>Tuesday</td>
<td class="text-center text-danger font-weight-bold">3 /  10</td>
</td>

Upvotes: 0

Views: 52

Answers (2)

Peter Thoeny

Reputation: 7616

You were close. Use word boundary \b instead of ^ and $, because the text you are looking for is somewhere in the middle of your text. This regex should work:

/\b[0-9]+ +\/ +[0-9]+\b/

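The + after each space means "one or more spaces", which makes the regex more forgiving: it matches however many spaces surround the slash, as long as there is at least one on each side.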

If you want to capture the two numbers separately, you can add capture groups and then reference them as $1 and $2, respectively:

/\b([0-9]+) +\/ +([0-9]+)\b/
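
As a minimal sketch of the capture-group version, assuming the scraping is done in Python (the question does not say which language is used), the same pattern can be applied with `re.findall`. The `html` string is copied from the question with the stray closing tags left out, and the variable names are only illustrative. In Python the slash needs no escaping, and the groups are read from the returned tuples rather than from $1 and $2.

```python
import re

# Sample markup copied from the question.
html = """<td>Monday</td>
<td class="text-center text-danger font-weight-bold">4 /  5</td>
<td>Tuesday</td>
<td class="text-center text-danger font-weight-bold">3 /  10</td>"""

# Digits, one or more spaces, a slash, one or more spaces, digits.
# Each number sits in its own capture group.
pattern = re.compile(r"\b([0-9]+) +/ +([0-9]+)\b")

# With two groups, findall returns one (first, second) tuple per match.
pairs = pattern.findall(html)
print(pairs)  # [('4', '5'), ('3', '10')]
```

The captured values are strings, so convert them with `int()` if you need to do arithmetic on them.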

Upvotes: 1

unxnut

Reputation: 8839

$ grep "[[:digit:]]\{1,\} /  [[:digit:]]\{1,\}" filename
<td class="text-center text-danger font-weight-bold">4 /  5</td>
<td class="text-center text-danger font-weight-bold">3 /  10</td>
$

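You started your regex with ^, which anchors the match to the beginning of the line. The text you want sits in the middle of a line, inside the <td> element, so an anchored pattern cannot match it.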
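
The effect of the anchors can be checked with a small sketch, here in Python as an assumption about the scraping language. The anchored pattern below is a simplified stand-in, not the exact expression from the question, and `line` is one row copied from the sample markup.

```python
import re

line = '<td class="text-center text-danger font-weight-bold">4 /  5</td>'

# Anchored with ^ and $: the entire line would have to be "digits / digits".
anchored = re.search(r"^[0-9]+ +/ +[0-9]+$", line)

# Unanchored: the match may start anywhere in the line.
unanchored = re.search(r"[0-9]+ +/ +[0-9]+", line)

print(anchored)             # None
print(unanchored.group(0))  # 4 /  5
```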

Upvotes: 0
