Reputation: 1058
I was trying to write a regular expression to extract all MP3/OGG links from an example word, but I couldn't. This is the example word I'm trying to extract the MP3/OGG links from:
this is a example word http://domain.com/sample.mp3 and second file is https://www.mydomain.com/sample2.ogg. then this is a link for third file <a href="http://seconddomain.com/files/music.mp3" target="_blank">Download</a>
And the PHP part:
$Word = "this is a example word http://domain.com/sample.mp3 and second file is https://www.mydomain.com/sample2.ogg. then this is a link for third file <a href="http://seconddomain.com/files/music.mp3" target="_blank">Download</a>";
$Pattern = '/href=\"(.*?)\".mp3/';
preg_match_all($Pattern,$Word,$Matches);
print_r($Matches);
I tried these too:
$Pattern = '/href="([^"]\.mp3|ogg)"/';
$Pattern = '/([-a-z0-9_\/:.]+\.(mp3|ogg))/i';
So I need your help to fix this code and extract all MP3/OGG links from that example word.
Thank you guys.
Upvotes: 1
Views: 1400
Reputation: 800
..extract all MP3/OGG links from that example word.
e.g.:
(?<=https?://(.+)?)\.(mp3|ogg)
Updated:
:( Yes, in PHP (tested on v5.5) a search with:
(?<=https?://(.+)?)\.(mp3|ogg)
hits a restriction: PCRE requires a lookbehind to have a fixed length, so this pattern does not compile.
Compare the two similar variants:
(?<=p1(.+)?)p2 - match p2 if p1 matched before it
p2(?=(.+)p3) - match p2 if p3 matches after it
Only the second, the lookahead, works in PHP with a non-fixed length such as .+?
For your sample:
//p2(?=.*p3)
preg_match_all("#https?://(?=(.+?)\.(mp3|ogg))#im", $Word, $Matches);
/*
    [0] => Array
        (
            [0] => http://
            [1] => https://
            [2] => http://
        )
    [1] => Array
        (
            [0] => domain.com/sample
            [1] => www.mydomain.com/sample2
            [2] => seconddomain.com/files/music
        )
    [2] => Array
        (
            [0] => mp3
            [1] => ogg
            [2] => mp3
        )
*/
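Since the lookahead leaves only the scheme in the full match, the complete links have to be put back together from the capture groups. The following is a minimal sketch of one way to do that, assuming the sample string from the question with its inner quotes escaped; the variable name $Links is hypothetical and used only for illustration.

```php
<?php
// Sample text from the question, with the inner quotes escaped.
$Word = "this is a example word http://domain.com/sample.mp3 and second file is https://www.mydomain.com/sample2.ogg. then this is a link for third file <a href=\"http://seconddomain.com/files/music.mp3\" target=\"_blank\">Download</a>";

// Same lookahead pattern as above: the match itself is only the scheme,
// while groups 1 and 2 capture the rest of the URL and the extension.
// PREG_SET_ORDER groups the results per match instead of per capture group.
preg_match_all('#https?://(?=(.+?)\.(mp3|ogg))#im', $Word, $Matches, PREG_SET_ORDER);

// $Links is a hypothetical name for this illustration.
$Links = array();
foreach ($Matches as $Match) {
    // $Match[0] = scheme, $Match[1] = host and path, $Match[2] = extension
    $Links[] = $Match[0] . $Match[1] . '.' . $Match[2];
}

print_r($Links);
```

PREG_SET_ORDER is a standard preg_match_all flag; it is used here only so that each loop iteration sees the three pieces of one link together.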
Upvotes: 1
Reputation: 3780
To retrieve all links, you can use:
((https?:\/\/)?(\w+?\.)+?(\w+?\/)+\w+?.(mp3|ogg))
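((https?:\/\/)? - optional http:// or https://
(\w+?\.)+? - matches the domain groups
(\w+?\/)+ - matches the final domain group and forward slash
\w+?.(mp3|ogg)) - matches a filename ending in .mp3 or .ogg (the dot here is unescaped, so it matches any character; \. would match only a literal dot)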
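The same breakdown can be kept next to the pattern itself by using PCRE's extended mode (the x modifier), which ignores whitespace and # comments inside the pattern. This is only a sketch of that idea; the $Sample string is a hypothetical shortened input, and the comments avoid the / character because it is the pattern delimiter.

```php
<?php
// The same pattern in extended mode (x flag): whitespace and # comments
// inside the pattern are ignored, so each piece can be annotated in place.
$Pattern = '/
    (
        (https?:\/\/)?      # optional scheme, http or https
        (\w+?\.)+?          # domain groups, each ending in a dot
        (\w+?\/)+           # final domain group and folders, each ending in a slash
        \w+?.(mp3|ogg)      # filename, any one character, then mp3 or ogg
    )
/imx';

// Hypothetical shortened input, used only for this illustration.
$Sample = 'get http://domain.com/sample.mp3 now';

preg_match_all($Pattern, $Sample, $Matches);
print_r($Matches[0]);
```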
The string you provided contains several unescaped quotation marks. With those corrected and my regex added in:
$Word = "this is a example word http://domain.com/sample.mp3 and second file is https://www.mydomain.com/sample2.ogg. then this is a link for third file <a href=\"http://seconddomain.com/files/music.mp3\" target=\"_blank\">Download</a>";
$Pattern = '/((https?:\/\/)?(\w+?\.)+?(\w+?\/)+\w+?.(mp3|ogg))/im';
preg_match_all($Pattern,$Word,$Matches);
var_dump($Matches[0]);
Produces the following output:
array (size=3)
0 => string 'http://domain.com/sample.mp3' (length=28)
1 => string 'https://www.mydomain.com/sample2.ogg' (length=36)
2 => string 'http://seconddomain.com/files/music.mp3' (length=39)
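One thing to watch in the pattern above is the unescaped dot before (mp3|ogg), which matches any character. The sketch below compares it with a stricter variant that escapes the dot; the ~ delimiter, the variable names, and the $Text input are assumptions made for this illustration and are not part of the original answer.

```php
<?php
// $Loose is the pattern from the answer; $Strict escapes the dot before the
// extension and uses ~ as delimiter so the slashes need no escaping.
$Loose  = '/((https?:\/\/)?(\w+?\.)+?(\w+?\/)+\w+?.(mp3|ogg))/im';
$Strict = '~((https?://)?(\w+?\.)+?(\w+?/)+\w+?\.(mp3|ogg))~im';

// Hypothetical input: there is no real audio link here, but a path
// segment happens to end in "Xmp3".
$Text = 'see example.com/audio/trackXmp3 here';

$LooseCount  = preg_match_all($Loose, $Text, $LooseMatches);
$StrictCount = preg_match_all($Strict, $Text, $StrictMatches);

var_dump($LooseMatches[0]);   // the unescaped dot lets "X" stand in for "."
var_dump($StrictMatches[0]);  // empty: a literal dot is required

// The stricter variant still finds a genuine link.
preg_match_all($Strict, 'http://domain.com/sample.mp3', $RealMatches);
var_dump($RealMatches[0]);
```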
Upvotes: 1