Reputation: 1442
I'm trying to scrape an image URL from Twitter, e.g. 'https://pbs.twimg.com/media/BGZHCHwCEAACJ19.jpg:large', using PHP. I found the following PHP code. file_get_contents is working, but I don't think the regular expression is matching the URL. Can you help debug this code? Thanks in advance.
Here is a snippet from the Twitter page which contains the image:
<div class="media-gallery-image-wrapper">
<img class="large media-slideshow-image" alt="" src="https://pbs.twimg.com/media/BGZHCHwCEAACJ19.jpg:large" height="480" width="358">
</div>
Here is the PHP code:
<?php
$url = 'http://t.co/s54fJgrzrG';
$twitter_page = file_get_contents($url);
preg_match('/(http:\/\/p.twimg.com\/[^:]+):/i', $twitter_page, $matches);
$imgURL = array_pop($matches);
echo $imgURL;
?>
Upvotes: 2
Views: 1317
Reputation: 7880
Something like this should provide a URL.
<?php
$url = 'http://t.co/s54fJgrzrG';
$twitter_page = file_get_contents($url);
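// Match http or https image URLs on pbs.twimg.com, stopping before any ":large" suffix.
// '!' is the delimiter, so the slashes do not need escaping.
preg_match_all('!https?://pbs\.twimg\.com/[^:]+\.(jpg|png|gif)!i', $twitter_page, $matches);
// $matches[0] holds the full matches; fall back to an empty string if nothing was found.
$img_url = isset($matches[0][0]) ? $matches[0][0] : '';
echo $img_url;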
?>
The output is:
https://pbs.twimg.com/media/BGZHCHwCEAACJ19.jpg
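To check a pattern like this without fetching the page each time, it can be run against the markup quoted in the question. The following is only a sketch: the helper name extract_twimg_urls is made up for illustration, the character class also excludes quotes and whitespace as an extra precaution that is not in the code above, and the syntax assumes PHP 7 or later.

```php
<?php
// Hypothetical helper (not from the original answer): returns every
// pbs.twimg.com image URL found in $html, without any ":large" suffix.
function extract_twimg_urls(string $html): array
{
    // '!' is the delimiter, so '/' needs no escaping; dots are escaped
    // so they only match a literal '.'.
    $pattern = '!https?://pbs\.twimg\.com/[^:"\'\s]+\.(?:jpg|png|gif)!i';
    preg_match_all($pattern, $html, $matches);
    return $matches[0]; // empty array when nothing matched
}

// The markup from the question, used in place of a live request.
$html = '<div class="media-gallery-image-wrapper">'
      . '<img class="large media-slideshow-image" alt="" src="https://pbs.twimg.com/media/BGZHCHwCEAACJ19.jpg:large" height="480" width="358">'
      . '</div>';

$urls = extract_twimg_urls($html);
echo $urls[0] ?? 'no image URL found', PHP_EOL;
```

Returning the whole list rather than only the first match leaves it to the caller to decide what to do when a page contains several images or none.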
Upvotes: 1
Reputation: 10104
Your regular expression does not match the beginning of the URL. It looks for the host 'p.twimg.com' instead of 'pbs.twimg.com', and it only accepts http, while the image is served over https. The pattern below accepts both schemes:
preg_match('/(https?:\/\/pbs\.twimg\.com\/[^:]+):/i', $twitter_page, $matches);
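One thing to watch when combining a pattern like this with array_pop($matches) from the question: array_pop returns the last capture group, so if the scheme gets its own group, such as (http|https), the result is the scheme rather than the URL. The sketch below illustrates this on a shortened version of the question's markup; the variable names are illustrative, the single-group pattern additionally excludes quotes and whitespace as a precaution, and the ?? operator assumes PHP 7 or later.

```php
<?php
// Shortened version of the <img> tag from the question.
$html = '<img src="https://pbs.twimg.com/media/BGZHCHwCEAACJ19.jpg:large">';

// Nested groups: index 1 is the URL, index 2 is only the scheme.
preg_match('/((http|https):\/\/pbs\.twimg\.com\/[^:]+):/i', $html, $nested);
$popped = array_pop($nested); // last group, i.e. the scheme

// Single group: index 1 is the URL, and it is also the last element.
preg_match('/(https?:\/\/pbs\.twimg\.com\/[^:"\s]+):/i', $html, $single);
$imgURL = $single[1] ?? null; // null when the pattern did not match

echo $popped, PHP_EOL;
echo $imgURL, PHP_EOL;
```

Reading $single[1] directly avoids depending on how many groups the pattern happens to contain.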
Upvotes: 1