Reputation: 6305
I got the following URL
http://www.amazon.com/LEGO-Ultimate-Building-Set-Pieces/dp/B000NO9GT4/ref=sr_1_1?m=ATVPDKIKX0DER&s=toys-and-games&ie=UTF8&qid=1350518571&sr=1-1&keywords=lego
and I want to extract
B000NO9GT4
that is the asin...to now, I can get search between the string, but not in this way I require. I saw the split functin, I saw the explode. but cant find a way out...also, the urls will be different in length so I cant hardcode the length two..the only thing which make some sense in my mind is to split the string so that
http://www.amazon.com/LEGO-Ultimate-Building-Set-Pieces/dp/
become first part
and
B000NO9GT4/ref=sr_1_1?m=ATVPDKIKX0DER&s=toys-and-games&ie=UTF8&qid=1350518571&sr=1-1&keywords=lego
becomes the 2nd part , from the second part , I should extract B000NO9GT4
in the same way, i would want to get product name LEGO-Ultimate-Building-Set-Pieces
from the first part
I am very bad at regex and cant find a way out..
can somebody guide me how I can do it in php?
thanks
Upvotes: 1
Views: 1497
Reputation: 69937
This grabs both pieces of information that you are looking to capture:
$url = 'http://www.amazon.com/LEGO-Ultimate-Building-Set-Pieces/dp/B000NO9GT4/ref=sr_1_1?m=ATVPDKIKX0DER&s=toys-and-games&ie=UTF8&qid=1350518571&sr=1-1&keywords=lego';
$path = parse_url($url, PHP_URL_PATH);
if (preg_match('#^/([^/]+)/dp/([^/]+)/#i', $path, $matches)) {
echo "Description = {$matches[1]}<br />"
."ASIN = {$matches[2]}<br />";
}
Output:
Description = LEGO-Ultimate-Building-Set-Pieces
ASIN = B000NO9GT4
Short Explanation:
( )
will be saved as a capture group. This is how we get at the data in $matches[1]
and $matches[2]
.([^/]+)
says to match all characters EXCEPT /
so in effect it captures everything in the URL between the two /
separators. I use this pattern twice. The [ ]
actually defines the character class which was /
, the ^
in this case negates it so instead of matching /
it matches everything BUT /
. Another example is [a-f0-9]
which would say to match the characters a,b,c,d,e,f
and the numbers 0,1,2,3,4,5,6,7,8,9
. [^a-f0-9]
would be the opposite.#
is used as the delimiter for the expression^
following the delimiter means match from the beginning of the string.See www.regular-expressions.info and PCRE Pattern Syntax for more info on how regexps work.
Upvotes: 2
Reputation: 95101
You can try
$str = "http://www.amazon.com/LEGO-Ultimate-Building-Set-Pieces/dp/B000NO9GT4/ref=sr_1_1?m=ATVPDKIKX0DER&s=toys-and-games&ie=UTF8&qid=1350518571&sr=1-1&keywords=lego" ;
list(,$desc,,$num,) = explode("/",parse_url($str,PHP_URL_PATH));
var_dump($desc,$num);
Output
string 'LEGO-Ultimate-Building-Set-Pieces' (length=33)
string 'B000NO9GT4' (length=10)
Upvotes: 2