How to programmatically access JavaScript variables in a website via Python

Question

I don't even know if this is possible, but I'm hoping there's a way to use Python to automate gathering data held in a JavaScript object on a web page. As an example, I'm trying to access the chart data from http://cryptocurrencychart.com/top/10.

I thought the easiest way to do this would be via the requests module, and just look for the SVG elements that hold the data, such as dom.select('.c3-chart-lines .c3-chart-line .c3-shapes-Bitcoin circle'), where dom is the resulting object from a call to BeautifulSoup, and then use .get('cy') to get the values. However, if you compare the values of the cy attributes to the actual values on the chart, they don't line up.

I realized, however, that I could just open the developer console and access the data via console.log(CryptoCurrencyChart.chart.data());. In order to save these data to a text file, I had to create a link on the webpage, with the base-64 encoded data as the href, and then manually click the link.

My question is whether or not this can be done programmatically via something like Python so that I can automate it for future grabs of the data.

Jedi · Accepted Answer

You can use Selenium to run JavaScript in the page and hand the CryptoCurrencyChart.chart.data() object back to Python:

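#!/usr/bin/env python

from selenium import webdriver

link = 'http://cryptocurrencychart.com/top/10'

class Scraper(object):
    def __init__(self):
        options = webdriver.ChromeOptions()
        options.add_argument('--headless')  # run Chrome without a visible window
        # Adjust this path to your Chrome install, or drop the line to use the default.
        options.binary_location = '/usr/bin/google-chrome-unstable'
        options.add_argument('--window-size=1200,600')
        # options= replaces the deprecated chrome_options= keyword.
        self.driver = webdriver.Chrome(options=options)

    def scrape(self):
        try:
            self.driver.get(link)
            # The JavaScript return value is converted to Python lists/dicts.
            return self.driver.execute_script('return CryptoCurrencyChart.chart.data()')
        finally:
            # Close the browser even if loading the page or running the script fails.
            self.driver.quit()

if __name__ == '__main__':
    scraper = Scraper()
    print(scraper.scrape())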
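
One caveat with the scraper above: if the chart object is filled in after the page's load event, execute_script may run too early and return nothing or raise a JavaScript error. Whether this page actually behaves that way is an assumption I have not verified. The sketch below polls a JavaScript expression until it yields a value; wait_for_js_value and its parameters are hypothetical names made up for this example, and the only Selenium call it relies on is execute_script.

```python
import time


def wait_for_js_value(driver, expression, timeout=10.0, interval=0.5):
    """Poll a JavaScript expression until it returns a truthy value.

    `driver` only needs an execute_script(script) method, which Selenium
    WebDriver objects provide. Raises TimeoutError if nothing truthy comes
    back within `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            value = driver.execute_script('return ' + expression)
        except Exception:
            # For example, a ReferenceError while the page is still loading.
            value = None
        if value:
            return value
        if time.monotonic() >= deadline:
            raise TimeoutError(
                'no value for %r within %s seconds' % (expression, timeout))
        time.sleep(interval)
```

Inside scrape() this would stand in for the direct call, e.g. wait_for_js_value(self.driver, 'CryptoCurrencyChart.chart.data()'). Selenium also ships its own WebDriverWait class in selenium.webdriver.support.ui, which is probably the more idiomatic choice; the hand-rolled loop is used here only to keep the example free of extra imports.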

Calling self.driver.execute_script('return CryptoCurrencyChart.chart.data()') returns the chart data to Python as ordinary lists; for this page it gave 3 arrays with 360 elements each.
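
Since the goal in the question was to save the data to a file, here is a minimal sketch of writing the returned value out as JSON. It assumes the value contains only JSON-compatible types (lists, dicts, numbers, strings), which I have not checked for this chart; save_chart_data and load_chart_data are hypothetical helper names, not part of Selenium.

```python
import json


def save_chart_data(data, path):
    """Write the scraped data to `path` as indented JSON."""
    with open(path, 'w', encoding='utf-8') as handle:
        json.dump(data, handle, indent=2)


def load_chart_data(path):
    """Read data previously written by save_chart_data."""
    with open(path, encoding='utf-8') as handle:
        return json.load(handle)
```

With the scraper above, this could be called as save_chart_data(scraper.scrape(), 'chart.json'), where chart.json is just a placeholder file name.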
