J.E.Y

Reputation: 1175

Intermittent behavior with encoded ampersand in URLs

I am testing data-scraping scripts that grab pages from Best Buy's site, and I generated a link like this:

http://www.bestbuy.com/site/searchpage.jsp?_dyncharset=ISO-8859-1&_dynSessConf=1803033044744184095&id=pcat17071&type=page&st=DOTD_2012126b&sc=Global&cp=1&nrp=15&sp=&qp=&list=n&iht=y&usc=All+Categories&ks=960&p=[promotion%2C+synonymns]&pu=defaultusr&pt=1354255201

The above link didn't work; I got a "sorry, page not accessible" error.

However, after manually replacing each encoded ampersand (&amp;) with a literal "&", it worked.

Another link, which also includes encoded ampersands (&amp;), worked.

http://www.bestbuy.com/site/PNY+-+32GB+Secure+Digital+High+Capacity+(SDHC)+Class+10+Memory+Card/2300602.p?id=1218318851702&skuId=2300602&st=2300602&cp=1&lp=1

Why does it work in the second case?

Upvotes: 0

Views: 229

Answers (1)

evil otto

Reputation: 10582

If the second one works, it's by accident; whatever happens with these links is completely up to the site.

In the HTML page text, the links should be written with &amp; in place of each &, but that's only to allow the & to actually be on the page. The actual URLs you request should have the literal & only.
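As a rough sketch of what that means for a scraper: decode the HTML entities in the attribute value before requesting the URL. The shortened href below is a made-up example based on the link in the question, not something taken from the real page, and `html.unescape` is just one way to do the decoding in Python.

```python
from html import unescape
from urllib.parse import urlsplit, parse_qs

# Hypothetical href value as it might appear in the HTML source,
# with each ampersand written as an entity.
href_in_html = "http://www.bestbuy.com/site/searchpage.jsp?id=pcat17071&amp;type=page&amp;cp=1"

# Splitting the raw text on "&" shows why the undecoded link misbehaves:
# the parameter names end up prefixed with "amp;".
raw_pairs = href_in_html.split("?", 1)[1].split("&")

# Decode the entities to recover the URL that should actually be requested.
url = unescape(href_in_html)
params = parse_qs(urlsplit(url).query)

print(raw_pairs)
print(url)
print(params)
```

If the scraper uses an HTML parser to pull out the href, the parser normally does this decoding already; the manual step matters mostly when links are extracted with regexes or string slicing.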

There is an appendix to the HTML 4.01 standard suggesting that servers accept ; for separating URL parameters rather than &, because of this encoding problem. The suggestion was pretty much universally ignored (except by CGI.pm, where it annoyed everyone who had to suffer with it).
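For illustration only, here is what a semicolon-separated query string would look like and how it could be parsed. The query is a hypothetical rewrite of the parameters in the second link from the question, and the `separator` argument of `urllib.parse.parse_qsl` is assumed to be available, which is the case only in reasonably recent Python 3 releases.

```python
from urllib.parse import parse_qsl

# Hypothetical query string using ";" instead of "&" between parameters.
# Nothing here needs entity-encoding when embedded in HTML.
query = "id=1218318851702;skuId=2300602;cp=1"

# Tell the parser to split on ";" rather than the default "&".
pairs = parse_qsl(query, separator=";")

print(pairs)
```

This only helps if the server on the other end also accepts ";", which, as noted, most do not.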

Upvotes: 1
