Reputation: 6778
I don't even know if this is possible, but I'm hoping there's a way to automate gathering data that are held in a JavaScript object via Python. As an example, I'm trying to access the chart data from http://cryptocurrencychart.com/top/10.
I thought the easiest way to do this would be via the requests module, and just look for the SVG elements that hold the data, such as dom.select('.c3-chart-lines .c3-chart-line .c3-shapes-Bitcoin circle'), where dom is the resulting object from a call to BeautifulSoup, and then use .get('cy') to get the values. However, if you compare the values of the cy attributes to the actual values on the chart, they don't line up.
I realized, however, that I could just open the developer console and access the data via console.log(CryptoCurrencyChart.chart.data());. In order to save these data to a text file, I had to create a link on the webpage, with the base-64 encoded data as the href, and then manually click the link.
My question is whether or not this can be done programmatically via something like Python so that I can automate it for future grabs of the data.
Upvotes: 1
Views: 530
Reputation: 3348
You can use Selenium to get the CryptoCurrencyChart.chart.data() object:
    #!/usr/bin/env python
    from selenium import webdriver

    link = 'http://cryptocurrencychart.com/top/10'


    class Scraper(object):
        def __init__(self):
            options = webdriver.ChromeOptions()
            options.add_argument('--headless')
            # Path to the Chrome binary; adjust this for your system.
            options.binary_location = '/usr/bin/google-chrome-unstable'
            options.add_argument('--window-size=1200,600')
            # Newer Selenium releases take `options=`; `chrome_options=` is deprecated.
            self.driver = webdriver.Chrome(options=options)

        def scrape(self):
            try:
                self.driver.get(link)
                # Run JavaScript in the page and hand the chart's data back to Python.
                return self.driver.execute_script('return CryptoCurrencyChart.chart.data()')
            finally:
                # Close the browser even if loading the page or the script fails.
                self.driver.quit()


    if __name__ == '__main__':
        scraper = Scraper()
        print(scraper.scrape())
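If the chart is built after the page's load event, the CryptoCurrencyChart global may not exist yet when execute_script runs. I have not checked how this particular page loads, so treat the following as a sketch under that assumption: it polls until the global is defined and then returns the data. The names wait_for_chart and CHART_READY_JS are mine, not part of Selenium; the only thing the helper needs from the driver is its execute_script method.

```python
import time

# JavaScript that evaluates to true once the chart object exists on the page.
CHART_READY_JS = (
    "return typeof CryptoCurrencyChart !== 'undefined' "
    "&& !!CryptoCurrencyChart.chart"
)


def wait_for_chart(driver, timeout=10.0, interval=0.5):
    """Poll the page until the chart object exists, then return its data.

    `driver` is anything with an `execute_script(script)` method, such as a
    Selenium WebDriver. Raises TimeoutError if the chart never appears.
    """
    deadline = time.monotonic() + timeout
    while True:
        if driver.execute_script(CHART_READY_JS):
            return driver.execute_script('return CryptoCurrencyChart.chart.data()')
        if time.monotonic() >= deadline:
            raise TimeoutError('chart was not ready within %.1f s' % timeout)
        time.sleep(interval)
```

In the scrape method you would call wait_for_chart(self.driver) after self.driver.get(link) instead of calling execute_script directly. Selenium's own WebDriverWait can do the same job if you prefer it.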
Running self.driver.execute_script('return CryptoCurrencyChart.chart.data()') will give you 3 arrays with 360 elements each.
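To replace the manual "create a link and click it" step from the question, you can write the result to a file from Python. The sketch below assumes you have the browser serialize the data itself, e.g. with execute_script('return JSON.stringify(CryptoCurrencyChart.chart.data())'), so that any Date objects arrive as ISO strings. It also assumes each series is an object with an id and a values list, which is my understanding of what c3's data() returns; check the real output before relying on it. The helper names save_chart_json and summarize are made up for this example.

```python
import json
from pathlib import Path


def save_chart_json(raw_json, path):
    """Parse a JSON string of chart series and write it to `path`, pretty-printed.

    Returns the parsed series so the caller can keep working with them.
    """
    series = json.loads(raw_json)
    Path(path).write_text(json.dumps(series, indent=2), encoding='utf-8')
    return series


def summarize(series):
    """Map each series id to its number of data points.

    Assumes every series is a dict with 'id' and 'values' keys.
    """
    return {s['id']: len(s['values']) for s in series}
```

With the scraper above, that would look like save_chart_json(raw, 'chart.json'), where raw is the string returned by the JSON.stringify call; summarize then gives a quick check that each currency has the number of points you expect.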
Upvotes: 2