vr22

Reputation: 57

Extract prices from different e-commerce sites using Python

I need to develop a web app that extracts book prices from different e-commerce sites such as Amazon and homeshop18. The user enters a book name in the interface, and the app displays all the information.

My questions are: 1) How can I pass that query to the Amazon site's search box, so that I get only the pages relevant to the query instead of crawling the whole site?

2) What should I use to develop this application: BeautifulSoup or Scrapy? APIs are not available for all e-commerce sites.

I am new to Python, so any help will be highly appreciated.

Upvotes: 0

Views: 1896

Answers (1)

Scout

Reputation: 575

I personally use BeautifulSoup to parse web pages, but beware that it is a bit slow if you have to parse a large number of pages. I know that lxml is faster, but it is a bit less coder-friendly.

To work out the right parameters (for either an HTTP GET or POST) to get the result page you want, proceed like this:

  1. Switch on the Firebug plugin for Firefox or the integrated inspector in Chrome.
  2. Go to the web page you're interested in and run the search.
  3. Look in Firebug or the inspector to see the parameters of the HTTP request that Firefox or Chrome sent to the website.
  4. Reproduce the request in your Python script, for example using urllib.
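Step 4 could look roughly like the sketch below, which uses Python 3's urllib. The URL `https://shop.example.com/search` and the parameter names `q` and `category` are placeholders I made up; substitute whatever the inspector shows for the site you are targeting. The helper names are illustrative as well.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Hypothetical endpoint: replace it with the URL that the browser's
# network inspector shows for the real search request.
SEARCH_URL = "https://shop.example.com/search"


def build_search_request(book_title):
    """Build a GET request that mimics the browser's search request."""
    # Assumed parameter names; copy the real ones from the inspector.
    params = {"q": book_title, "category": "books"}
    url = SEARCH_URL + "?" + urlencode(params)
    # Some sites reject requests that lack a browser-like User-Agent header.
    return Request(url, headers={"User-Agent": "Mozilla/5.0"})


def fetch_search_page(book_title, timeout=10):
    """Send the request and return the result page as text."""
    request = build_search_request(book_title)
    with urlopen(request, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")
```

If the inspector shows a POST instead of a GET, the same parameters can be passed as `data=urlencode(params).encode()` to `Request` rather than appended to the URL.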

There is another way to work out the right HTTP GET or POST parameters: use a network analyzer such as Wireshark. This approach is more detailed, but it feels more like finding a needle in a haystack once you have used the tools in Firefox/Chrome.
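Once the result page has been fetched, the parsing step with BeautifulSoup might look like the sketch below. The HTML snippet, the tag names and the `result`, `title` and `price` classes are invented for illustration; every real site uses its own markup, so inspect the page to find the right selectors. It assumes the `beautifulsoup4` package is installed.

```python
import re

from bs4 import BeautifulSoup

# Made-up result markup; a real site will use different tags and classes.
SAMPLE_HTML = """
<ul>
  <li class="result"><span class="title">Learning Python</span>
      <span class="price">Rs. 450</span></li>
  <li class="result"><span class="title">Python Cookbook</span>
      <span class="price">Rs. 1,299</span></li>
</ul>
"""

# Matches a number with optional thousands separators and decimals.
PRICE_PATTERN = re.compile(r"\d[\d,]*(?:\.\d+)?")


def extract_prices(html):
    """Return a list of (title, price) pairs found in the page."""
    soup = BeautifulSoup(html, "html.parser")
    results = []
    for item in soup.find_all("li", class_="result"):
        title = item.find("span", class_="title").get_text(strip=True)
        price_text = item.find("span", class_="price").get_text(strip=True)
        match = PRICE_PATTERN.search(price_text)
        if match is None:
            continue  # skip entries without a recognisable price
        # Drop thousands separators before converting to a number.
        results.append((title, float(match.group().replace(",", ""))))
    return results


for title, price in extract_prices(SAMPLE_HTML):
    print(title, price)
```

If lxml is installed, passing `"lxml"` instead of `"html.parser"` as the second argument makes BeautifulSoup use lxml as its underlying parser, which is usually faster.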

Upvotes: 1
