mushfiq

Reputation: 1600

Python scraping: collecting URLs dynamically

I am new to data scraping; previously I used Python for web and desktop app development. Is there a way to collect the URLs from a page and then visit each one to look for specific information, such as a phone number or an address?

Currently I am using BeautifulSoup, and I have written methods that take each URL as a parameter.

The site I am scraping is large, and it is really tough to pass in the specific URL for every page.

Any suggestions to make this faster and self-driven?

Thanks in advance.

Upvotes: 2

Views: 561

Answers (2)

hoju

Reputation: 29452

Use a more efficient HTML parser, like lxml. See here for performance comparisons of various Python parsers.

Upvotes: 0

wRAR

Reputation: 25693

You can use Scrapy. It simplifies both crawling and parsing (it uses libxml2 for parsing by default).
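
To give a feel for the crawling loop that a framework like Scrapy takes care of, here is a minimal sketch using only the standard library (it is not Scrapy, and not necessarily how Scrapy does it internally). It pulls the links out of each page, follows the ones on the same site, and records anything that looks like a phone number. The names `LinkParser`, `PHONE_RE`, `fetch`, and `crawl` are made up for illustration, and the phone pattern is a loose assumption that would need tuning for a real site.

```python
import re
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse
from urllib.request import urlopen


class LinkParser(HTMLParser):
    """Collect the href of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


# Hypothetical, deliberately loose pattern; real phone formats vary a lot.
# It runs over the raw HTML, so it can also match digits inside attributes.
PHONE_RE = re.compile(r"\+?\d[\d\s().-]{7,}\d")


def fetch(url):
    """Download a page and return its text (assumes UTF-8 content)."""
    with urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def crawl(start_url, fetch_page=fetch, max_pages=50):
    """Breadth-first crawl of one site; returns {url: [phone-like strings]}."""
    site = urlparse(start_url).netloc
    queue = deque([start_url])
    seen = {start_url}
    results = {}
    while queue and len(results) < max_pages:
        url = queue.popleft()
        html = fetch_page(url)
        results[url] = PHONE_RE.findall(html)

        # Discover new URLs instead of passing each one in by hand.
        parser = LinkParser()
        parser.feed(html)
        for href in parser.links:
            link = urljoin(url, href).split("#")[0]  # absolute, no fragment
            if urlparse(link).netloc == site and link not in seen:
                seen.add(link)
                queue.append(link)
    return results
```

Calling `crawl("http://example.com/")` would start from that page and stay on the same host. The `fetch_page` parameter exists so that the download step can be swapped out, for example for a cached or throttled fetcher. Error handling, politeness delays, and robots.txt checks are left out here; those are among the things a crawling framework provides.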

Upvotes: 3
