Reputation: 11
So, regex has been the bane of my existence for some time. I feel that I'm on the cusp of understanding it, but I'm just getting very frustrated. In short:
I'm attempting to scrape data from the following website via PHP:
http://magicseaweed.com/Asbury-Park-Surf-Report/857/
I want to extract the bold wave height at the top of the page (at the moment, it reads 3-5). I understand why this works:
preg_match('/<div class="msw-fct-ccd msw-sr-details span3"> <h3> <span>(.*)
<small>ft<\/small> <\/span> <div class="msw-fct-ccr msw-sr-rating">/', $pageMagic,
$height);
But I don't understand why this will not:
preg_match('/<div class="msw-fct-ccd msw-sr-details span3"> <h3> <span>(/d-/d)|(/d)
<small>ft<\/small> <\/span> <div class="msw-fct-ccr msw-sr-rating">/', $pageMagic,
$height);
In my mind, logically speaking, it should be looking for a digit, a dash, then another digit OR just one digit. I tested the regex at http://gskinner.com/RegExr/ and it picked up 3-5. Thank you in advance!
Upvotes: 1
Views: 62
Reputation: 9211
Your slashes are the wrong way around: the digit class is written with a backslash (\d), not a forward slash (/d). It should be:
(\d-\d)|(\d)
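For illustration, here is a minimal sketch of that correction inside a preg_match call. The $pageMagic value is a hypothetical snippet shaped like the markup in the question, and the shortened pattern is an assumption, so the real page may need a different one. Two details matter here. With / as the delimiter, an unescaped /d ends the pattern early and PHP warns about an unknown modifier. And a bare | splits the entire pattern in two, so the alternation is wrapped in an outer group.

```php
<?php
// Hypothetical sample standing in for the fetched page; the real markup may differ.
$pageMagic = '<div class="msw-fct-ccd msw-sr-details span3"> <h3> <span>3-5 <small>ft</small> </span> <div class="msw-fct-ccr msw-sr-rating">';

// \d is the digit class. The outer group confines the alternation to the
// height itself instead of splitting the whole pattern at the |.
$pattern = '/<span>((\d-\d)|(\d))\s*<small>ft<\/small>/';

if (preg_match($pattern, $pageMagic, $height)) {
    // $height[1] holds the full height, whether a range or a single digit.
    echo $height[1], "\n";
}
```

With this grouping, $height[1] is the value to read regardless of which branch matched.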
Incidentally, you can simplify this to:
\d(-\d)?
...but note that this would change the capture groups. I leave the fix for that as an exercise for you :)
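One possible way to handle the changed capture groups, offered as a sketch rather than the only solution: make the optional part non-capturing and wrap the whole height in a single group. The sample strings and the shortened pattern below are hypothetical stand-ins for the real page markup.

```php
<?php
// Hypothetical inputs; the real page markup may differ.
$samples = ['<span>3-5 <small>ft</small>', '<span>4 <small>ft</small>'];

// (?:...) groups without capturing, so group 1 is always the whole height.
$pattern = '/<span>(\d(?:-\d)?)\s*<small>ft<\/small>/';

$heights = [];
foreach ($samples as $html) {
    if (preg_match($pattern, $html, $m)) {
        $heights[] = $m[1];
    }
}
print_r($heights);
```

Without the non-capturing group, the first capture of \d(-\d)? would be only the "-5" part, or missing entirely for a single digit.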
Upvotes: 2