Matt Bokey

Reputation: 11

What am I missing in my RegEx expression?

So, regex has been the bane of my existence for some time. I feel that I'm on the cusp of understanding it, but I'm just getting very frustrated. In short:

I'm attempting to scrape data from the following website via PHP:

http://magicseaweed.com/Asbury-Park-Surf-Report/857/

I want to extract the bold wave height at the top of the page (at the moment, it reads 3-5). I understand why this works:

preg_match('/<div class="msw-fct-ccd msw-sr-details span3"> <h3> <span>(.*)    
<small>ft<\/small>   <\/span> <div class="msw-fct-ccr msw-sr-rating">/', $pageMagic,
$height);

But I don't understand why this will not:

preg_match('/<div class="msw-fct-ccd msw-sr-details span3"> <h3> <span>(/d-/d)|(/d)    
<small>ft<\/small>   <\/span> <div class="msw-fct-ccr msw-sr-rating">/', $pageMagic,
$height);

In my mind, logically speaking, it should be looking for a digit, a dash, then another digit OR just one digit. I tested the regex at http://gskinner.com/RegExr/ and it picked up 3-5. Thank you in advance!

Upvotes: 1

Views: 62

Answers (1)

Xophmeister

Reputation: 9211

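Your slashes are the wrong way around: the digit class is written with a backslash (\d), not a forward slash (/d). A bare / is also your pattern delimiter, so it ends the pattern early. It should be: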

(\d-\d)|(\d)
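
A minimal sketch of the corrected alternation, run against a made-up one-line snippet rather than the real page (the `$html` string and the shortened `$pattern` are assumptions for illustration, not the site's actual markup). The alternation is wrapped in a non-capturing group `(?:...)` here; without that, `|` would split the entire pattern into two halves instead of just the height part.

```php
<?php
// Hypothetical stand-in for the relevant part of the fetched page.
$html = '<span>3-5 <small>ft</small></span>';

// Backslashes for \d; the literal "/" in </small> is escaped because
// "/" is also the pattern delimiter.
$pattern = '/<span>(?:(\d-\d)|(\d))\s*<small>ft<\/small>/';

$matched = preg_match($pattern, $html, $height);

// Group 1 is filled by the "digit-dash-digit" branch, group 2 by the
// single-digit branch, so check both.
$value = $matched ? ($height[1] !== '' ? $height[1] : $height[2]) : null;
var_dump($value);
```

With the two-branch form, which group holds the result depends on which branch matched, which is why the sketch checks both.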

Incidentally, you can simplify this to:

\d(-\d)?

...but note that this would change the capture groups. I leave the fix for that as an exercise for you :)
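
One possible way to keep the whole height in a single capture group, sketched against made-up sample strings (the helper name `extractHeight` and the shortened pattern are hypothetical): make the optional part non-capturing and wrap the full expression in one capturing group.

```php
<?php
// Hypothetical helper: returns the height text (e.g. "3-5" or "4"),
// or null when the pattern does not match.
function extractHeight(string $html): ?string
{
    // (?:-\d)? is non-capturing, so group 1 is always the entire height.
    $pattern = '/<span>(\d(?:-\d)?)\s*<small>ft<\/small>/';
    return preg_match($pattern, $html, $m) === 1 ? $m[1] : null;
}

var_dump(extractHeight('<span>3-5 <small>ft</small></span>')); // range form
var_dump(extractHeight('<span>4 <small>ft</small></span>'));   // single digit
```

Like the original, this only handles single digits on each side; a two-digit height would need `\d+` in place of `\d`.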

Upvotes: 2
