Reputation: 1175
I am testing how to use data-scraping scripts to grab pages from Best Buy's site, and I generated a link like this:
http://www.bestbuy.com/site/searchpage.jsp?_dyncharset=ISO-8859-1&_dynSessConf=1803033044744184095&id=pcat17071&type=page&st=DOTD_2012126b&sc=Global&cp=1&nrp=15&sp=&qp=&list=n&iht=y&usc=All+Categories&ks=960&p=[promotion%2C+synonymns]&pu=defaultusr&pt=1354255201
The above link didn't work; I got a "sorry, page not accessible" error.
However, after manually replacing each encoded ampersand (&amp;) with a plain &, it worked.
Another link, which also includes encoded &amp;, worked:
http://www.bestbuy.com/site/PNY+-+32GB+Secure+Digital+High+Capacity+(SDHC)+Class+10+Memory+Card/2300602.p?id=1218318851702&skuId=2300602&st=2300602&cp=1&lp=1
Why does it work in the second case?
Upvotes: 0
Views: 229
Reputation: 10582
If the second one works, it's by accident; whatever happens with these links is completely up to the site.
The links should be encoded with &amp; in the HTML page text, but that's only to allow the & to actually be on the page. The actual URLs should have the literal & only.
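As a rough illustration of that distinction, here is a minimal Python sketch that decodes the HTML entities in a scraped href before using it as a URL. The example.com address and the `href_to_url` helper are made up for this example; only `html.unescape`, `urlsplit`, and `parse_qs` come from the standard library.

```python
from html import unescape
from urllib.parse import urlsplit, parse_qs

# Hypothetical href value, copied as it might appear in a page's HTML source.
raw_href = "http://www.example.com/search?id=pcat17071&amp;type=page&amp;cp=1"


def href_to_url(href):
    """Decode HTML entities so each &amp; becomes the literal & separator."""
    return unescape(href)


url = href_to_url(raw_href)

# With literal ampersands, the query string splits into the intended parameters.
params = parse_qs(urlsplit(url).query)

print(url)
print(params)
```

If the script extracts links with an HTML parser rather than a regex or string search, the parser normally does this decoding already, so the extra step mainly matters when the href is cut out of the raw page text.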
There is an appendix to the HTML 4.01 specification (B.2.2) suggesting that servers accept ; for separating URL parameters rather than &, because of this encoding problem. The suggestion was pretty much universally ignored (except by CGI.pm, where it annoyed everyone who had to suffer with it).
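For what the semicolon convention would look like in practice, here is a small sketch of a parser for such query strings. The `parse_semicolon_query` function and the sample query are hypothetical, and nothing here implies that Best Buy's servers accept this form.

```python
from urllib.parse import unquote_plus


def parse_semicolon_query(query):
    """Parse an 'a=1;b=2' style query string into a dict.

    Later duplicates of a key overwrite earlier ones.
    """
    result = {}
    for pair in query.split(";"):
        if not pair:
            continue  # skip empty segments such as a trailing ';'
        key, _, value = pair.partition("=")
        result[unquote_plus(key)] = unquote_plus(value)
    return result


# Hypothetical query using ';' where a conventional URL would use '&'.
params = parse_semicolon_query("id=2300602;skuId=2300602;cp=1")
print(params)
```

Because ; has no special meaning in HTML, a URL written this way could be pasted into a page's markup without any entity encoding, which is the problem the suggestion was meant to avoid.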
Upvotes: 1