Reputation: 129
I am trying to use Python to scrape dynamic stock information from a Chinese website:
http://vip.stock.finance.sina.com.cn/mkt/#cyb_root
However, I am new to Python. Can someone give me a hand with this? Thanks so much.
All I want to do is scrape the dynamically generated HTML data from the above website.
Upvotes: 0
Views: 1171
Reputation: 1252
I tried your website. Since the data is loaded dynamically, in your case it's simpler to build the crawler with a browser automation tool like Selenium. Here is a working example for sina.com.cn:
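from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Firefox()
driver.get("http://vip.stock.finance.sina.com.cn/mkt/#cyb_root")

# Each <tr> in the table body holds one stock.
rows = driver.find_elements(By.XPATH, "//div[@id='tbl_wrap']//tbody/tr")
for row in rows:
    # The stock code is the link text in the sorted header cell.
    name = row.find_element(By.XPATH, "./th[@class='sort_down']/a").text
    # The price columns are the cells marked with the 'colorize' class.
    values = [v.text for v in row.find_elements(By.XPATH, "./td[@class='colorize']")]
    print("%s : %s" % (name, values))

# Close the browser and end the WebDriver session.
driver.quit()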
If you run this script, you'll get output like the following (printed by Python 2, hence the u'' prefixes):
$ python sina_com.py
sz300001 : [u'16.51', u'+0.64', u'+4.033%', u'16.51', u'16.52', u'15.87', u'15.86', u'16.58', u'15.80']
sz300002 : [u'--', u'0.00', u'0.000%', u'0.00', u'0.00', u'19.34', u'0.00', u'0.00', u'0.00']
sz300003 : [u'10.86', u'-0.05', u'-0.458%', u'10.85', u'10.86', u'10.91', u'10.98', u'10.98', u'10.59']
sz300004 : [u'22.86', u'+1.21', u'+5.589%', u'22.86', u'22.87', u'21.65', u'21.74', u'22.98', u'21.10']
sz300005 : [u'10.91', u'+0.35', u'+3.314%', u'10.91', u'10.94', u'10.56', u'10.51', u'10.99', u'10.51']
.....
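The values come back as strings, including signs, percent signs, and "--" for missing prices. If you need numbers, a minimal sketch like the one below could convert them. The helper names parse_quote_value and parse_row are hypothetical, and treating "--" as a missing value is an assumption based on the output above, not something the site documents.

```python
def parse_quote_value(text):
    """Convert one scraped cell such as '+4.033%' or '--' to a float, or None."""
    # Drop surrounding whitespace, a trailing percent sign, and thousands separators.
    cleaned = text.strip().rstrip('%').replace(',', '')
    # Assumption: '--' (or an empty cell) means no value is available.
    if cleaned in ('', '--'):
        return None
    return float(cleaned)


def parse_row(values):
    """Convert a whole list of scraped cell strings."""
    return [parse_quote_value(v) for v in values]


row = ['16.51', '+0.64', '+4.033%', '--']
print(parse_row(row))  # prints [16.51, 0.64, 4.033, None]
```

Note that percentages stay in percentage points (4.033, not 0.04033); divide by 100 if you need a fraction.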
:) good luck
Upvotes: 1
Reputation: 1252
You can use Requests (http://docs.python-requests.org/en/latest/) or Scrapy (http://doc.scrapy.org/en/0.16/).
If you want to follow lots of links or need to crawl different types of objects, I highly recommend Scrapy. If you need an example for your website, let me know; both libraries are very simple to use.
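As a rough illustration of the Requests route, here is a minimal Python 3 sketch that downloads a page and pulls the cell text out of any HTML table using only the standard-library parser. The names TableTextParser, extract_rows, and fetch_html are hypothetical. It only sees HTML that is present in the raw response, so a table that is filled in by JavaScript after the page loads will probably not show up this way.

```python
from html.parser import HTMLParser


class TableTextParser(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows, each a list of cell strings
        self._row = None    # cells of the row currently being read
        self._cell = None   # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def extract_rows(html):
    """Return the table rows found in an HTML string as lists of cell text."""
    parser = TableTextParser()
    parser.feed(html)
    parser.close()
    return parser.rows


def fetch_html(url):
    """Download a page with Requests and return its text."""
    # Third-party dependency, imported here so extract_rows works without it.
    import requests
    response = requests.get(url, timeout=10)
    response.raise_for_status()
    return response.text


sample = "<table><tr><th>sz300001</th><td>16.51</td><td>+0.64</td></tr></table>"
print(extract_rows(sample))  # prints [['sz300001', '16.51', '+0.64']]
```

For a live page you would call extract_rows(fetch_html(url)); if that returns no rows for the page in question, the data is being loaded dynamically and a browser-driven approach is the safer choice.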
Upvotes: 1