Reputation: 320

Why is this regular expression not working?

Content of 1.txt:

Image" href="images/product_images/original_images/9961_1.jpg" rel="disable-zoom:false; disable-expand: false"><img src="im

Code that does not work:

<?php
$pattern = '/(images\/product_images\/original_images\/)(.*)(\.jpg)/i';
$result = file_get_contents("1.txt");
preg_match($pattern,$result,$match);

echo "<h3>Preg_match Pattern test:</h3><br><br><pre>";
print_r($match);
echo "</pre>";
?>

I expect this result:

Array
(
    [0] => images/product_images/original_images/9961_1.jpg
    [1] => images/product_images/original_images/
    [2] => 9961_1
    [3] => .jpg
)

But I get something like this:

Array
(
    [0] => images/product_images/original_images/9961_1.jpg" rel="disable-zoom:false; disable-expand: false"> 
    [1] => images/product_images/original_images/
    [2] => 9961_1.jpg" rel="disable-zoom:false; disable-expand: false"> 
)

I'm tired of trying a million combinations of this regexp. I dunno what's wrong. Please help, and thanks a lot!

Upvotes: 0

Answers (4)

Randal Schwartz

Reputation: 44056

Do not parse HTML with regex.

Upvotes: -1

StackOverflowNewbie

Reputation: 40633

Here's the basic regex:

href="((.*/)(.*?)(\.jpg))"

Upvotes: 0

Jason McCreary

Reputation: 72971

Remember that regular expression quantifiers are greedy by default. Your second capture group, (.*), matches any character except a newline (unless the s, or dotall, modifier is set), and it matches as many of them as it can. So it is probably capturing the rest of the line.
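To illustrate the newline behaviour mentioned above: in PHP's PCRE, the modifier that lets the dot match a newline is s (PCRE_DOTALL). The sketch below uses a made-up two-line string, not the asker's file, and the variable names are only for illustration.

```php
<?php
// Hypothetical two-line input, invented for this example.
$text = "first.jpg\nsecond.jpg";

// Without the s modifier, the dot cannot cross the newline,
// so the greedy (.*) is confined to the first line.
preg_match('/(.*)\.jpg/', $text, $m1);

// With the s modifier, the dot also matches "\n", so the greedy (.*)
// runs to the end of the string and backs off to the last ".jpg".
preg_match('/(.*)\.jpg/s', $text, $m2);

var_dump($m1[1]); // string(5) "first"
var_dump($m2[1]); // string(16) "first.jpg\nsecond" (with a real newline inside)
```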

You can make it ungreedy as suggested by Wrikken. But I like to ensure I am capturing exactly what I want. In your case, that looks like the value of the href attribute. So what I really want is at least one character that is not a quote, followed by the .jpg extension:

$pattern = '/(images\/product_images\/original_images\/)([^\'"]+)(\.jpg)/i';
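As a quick check of the negated character class approach, here is a minimal sketch run against the line quoted in the question. The string is inlined instead of read from 1.txt, which is an assumption made to keep the example self-contained. Note that the single quote inside the character class has to be escaped as \' because the pattern lives in a single-quoted PHP string.

```php
<?php
// The line from the question, inlined instead of using file_get_contents("1.txt").
$line = 'Image" href="images/product_images/original_images/9961_1.jpg" '
      . 'rel="disable-zoom:false; disable-expand: false"><img src="im';

// [^\'"]+ matches one or more characters that are not a quote, so the
// second group can never run past the closing quote of the href value.
$pattern = '/(images\/product_images\/original_images\/)([^\'"]+)(\.jpg)/i';

if (preg_match($pattern, $line, $match)) {
    print_r($match);
    // [0] => images/product_images/original_images/9961_1.jpg
    // [1] => images/product_images/original_images/
    // [2] => 9961_1
    // [3] => .jpg
}
```

The group first swallows 9961_1.jpg, stops at the quote, and then backtracks four characters so that (\.jpg) can match.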

Upvotes: 2

Wrikken

Reputation: 70460

Make it ungreedy:

$pattern = '/(images\/product_images\/original_images\/)(.*?)(\.jpg)/i';
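To see the difference the ? makes, here is a small side-by-side sketch. The input is a hypothetical variation of the line in the question: the second image path (images/thumb.jpg) is invented so that the string contains two .jpg occurrences, and the variable names are only for illustration.

```php
<?php
// Hypothetical input: the thumbnail path is made up for this example.
$html = 'href="images/product_images/original_images/9961_1.jpg" '
      . 'rel="disable-zoom:false"><img src="images/thumb.jpg">';

$greedy = '/(images\/product_images\/original_images\/)(.*)(\.jpg)/i';
$lazy   = '/(images\/product_images\/original_images\/)(.*?)(\.jpg)/i';

preg_match($greedy, $html, $g);
preg_match($lazy, $html, $l);

// Greedy: (.*) grabs everything up to the LAST ".jpg" on the line.
echo $g[2], "\n"; // 9961_1.jpg" rel="disable-zoom:false"><img src="images/thumb

// Lazy: (.*?) stops at the FIRST ".jpg" it can reach.
echo $l[2], "\n"; // 9961_1
```

The lazy version still allows quotes inside the second group, so it relies on the first .jpg being the one you want.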

Upvotes: 4
