Reputation: 320
Content of 1.txt:
Image" href="images/product_images/original_images/9961_1.jpg" rel="disable-zoom:false; disable-expand: false"><img src="im
Code that does not work:
<?php
$pattern = '/(images\/product_images\/original_images\/)(.*)(\.jpg)/i';
$result = file_get_contents("1.txt");
preg_match($pattern,$result,$match);
echo "<h3>Preg_match Pattern test:</h3><br><br><pre>";
print_r($match);
echo "</pre>";
?>
I expect this result:
Array
(
[0] => images/product_images/original_images/9961_1.jpg
[1] => images/product_images/original_images/
[2] => 9961_1
[3] => .jpg
)
But I get something like this instead:
Array
(
[0] => images/product_images/original_images/9961_1.jpg" rel="disable-zoom:false; disable-expand: false">
[1] => images/product_images/original_images/
[2] => 9961_1.jpg" rel="disable-zoom:false; disable-expand: false">
)
I'm tired of trying a million combinations of this regexp, and I don't know what's wrong. Please help, and thanks a lot!
Upvotes: 0
Views: 232
Reputation: 44056
Do not parse HTML with regex.
Do not parse HTML with regex.
Do not parse HTML with regex.
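For what it's worth, here is a minimal sketch of the non-regex route using PHP's DOM extension. The sample in 1.txt is cut off on both ends, so the complete anchor tag below (including the link text) is a hypothetical stand-in, not the asker's real markup:

```php
<?php
// Hypothetical, complete anchor tag; the real 1.txt content is truncated.
$html = '<a href="images/product_images/original_images/9961_1.jpg" rel="disable-zoom:false; disable-expand: false">x</a>';

libxml_use_internal_errors(true); // keep warnings about imperfect markup quiet
$doc = new DOMDocument();
$doc->loadHTML($html);

foreach ($doc->getElementsByTagName('a') as $link) {
    $href = $link->getAttribute('href');
    $info = pathinfo($href); // splits the path into dirname, filename, extension
    print_r([$href, $info['dirname'] . '/', $info['filename'], '.' . $info['extension']]);
}
```

This yields the same four pieces the question asks for, without depending on how the rest of the line happens to look.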
Upvotes: -1
Reputation: 72971
Remember that regular expression quantifiers are greedy by default. Your second capture, (.*), says to match any character except a newline (unless the s, or DOTALL, modifier is set) as many times as possible. So it is probably capturing the rest of the line.
You can make it ungreedy as suggested by Wrikken. But I like to ensure I am capturing what I want. In your case, it looks like the value of the href attribute. So what I really want is at least one character that is not a quote, followed by the .jpg extension:
$pattern = '/(images\/product_images\/original_images\/)([^\'"]+)(\.jpg)/i';
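A quick way to check this is to run the pattern against the line from the question. This is only a sketch: the input is inlined as a string rather than read from 1.txt, and the single quote inside the character class is escaped because the pattern sits in a single-quoted PHP string:

```php
<?php
// Sample input copied from 1.txt in the question (it is truncated there, too).
$html = 'Image" href="images/product_images/original_images/9961_1.jpg" rel="disable-zoom:false; disable-expand: false"><img src="im';

// [^\'"]+ cannot cross a quote, so the match has to end inside the href value.
$pattern = '/(images\/product_images\/original_images\/)([^\'"]+)(\.jpg)/i';

if (preg_match($pattern, $html, $match)) {
    print_r($match);
}
```

Because the second group can never run past the closing quote of the attribute, whatever else follows on the line cannot leak into the match.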
Upvotes: 2
Reputation: 70460
Make it ungreedy:
$pattern = '/(images\/product_images\/original_images\/)(.*?)(\.jpg)/i';
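To see the difference, here is a small comparison of the greedy and lazy versions. The input line is made up for illustration (the second path, with its thumbs directory, is hypothetical); it just needs more than one ".jpg" on the same line:

```php
<?php
// Hypothetical line with two .jpg paths, to make the greedy/lazy difference visible.
$line = '<a href="images/product_images/original_images/9961_1.jpg"><img src="images/product_images/thumbs/9961_1.jpg"></a>';

$greedy = '/(images\/product_images\/original_images\/)(.*)(\.jpg)/i';
$lazy   = '/(images\/product_images\/original_images\/)(.*?)(\.jpg)/i';

preg_match($greedy, $line, $g);
preg_match($lazy, $line, $l);

echo $g[2], "\n"; // runs on to the last ".jpg" on the line
echo $l[2], "\n"; // 9961_1
```

The lazy (.*?) stops at the first ".jpg" it can reach, while the greedy (.*) takes as much as it can and only backs off to the last one.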
Upvotes: 4