slothfulwave612

Reputation: 1409

Selenium: Get variable data from <script type> tag in Python

I am trying out Selenium to scrape data from a website, but as I am still new to Selenium and web scraping, I am stuck. I want to scrape some data that sits inside a <script> tag, which looks like this:

...
...

<script type="text/javascript">
  var myData_1 = {"name" : ..... };
  var myData_2 = {......};
  var myData_id = 4565843;
  var myData_mapping = {.....};
</script>

...
...

So I need to scrape the data in this script tag, i.e. the values of all the var declarations. So far I have written only this much:

from selenium import webdriver
import pandas as pd

driver = webdriver.Chrome('/home/slothfulwave612/chromedriver_linux64/chromedriver')

driver.get('https://www.example.com') ## not the actual site

html = driver.page_source

print(html)

driver.close()

This just prints the source code of the page. What should I add so that I can scrape the data from the <script> tag? Can somebody help?

Upvotes: 4

Views: 2818

Answers (2)

asantz96

Reputation: 619

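Use driver.find_element() with an XPath locator (the older .find_element_by_xpath() helper was removed in Selenium 4.3):

from selenium.webdriver.common.by import By

# Returns the first <script type="text/javascript"> element on the page
script_label = driver.find_element(By.XPATH, "//script[@type = 'text/javascript']")

You can then read the script's text from that element, for example with script_label.get_attribute("innerHTML").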
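Note that find_element returns only the first match, and a page often has several script tags of this type. As a rough sketch of one way around that, the block below works on the page source string instead and keeps the script whose text mentions a known variable name. The ScriptCollector class and the sample HTML are hypothetical stand-ins, not part of the original page; in practice the string would come from driver.page_source.

```python
from html.parser import HTMLParser


class ScriptCollector(HTMLParser):
    """Collects the text content of every <script> tag (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self._in_script = False
        self.scripts = []

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self._in_script = True
            self.scripts.append("")

    def handle_endtag(self, tag):
        if tag == "script":
            self._in_script = False

    def handle_data(self, data):
        # Script bodies are delivered here as plain text
        if self._in_script:
            self.scripts[-1] += data


# Assumed sample; replace with html = driver.page_source
html = """<html><head>
<script type="text/javascript">console.log("unrelated");</script>
<script type="text/javascript">
  var myData_id = 4565843;
</script>
</head><body></body></html>"""

collector = ScriptCollector()
collector.feed(html)
collector.close()

# Pick the first script that mentions the variable we care about (None if absent)
target = next((s for s in collector.scripts if "myData_id" in s), None)
print(target.strip() if target else "no matching script found")
```

Matching on a variable name is only a heuristic; if the site renames its variables, the filter string has to change with it.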

Upvotes: 3

rahul rai

Reputation: 2326

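If you want to print the entire content of the script tag, read its innerHTML attribute. (Selenium 4.3 removed find_element_by_xpath, so the lookup below uses find_element with By.XPATH.)

from selenium.webdriver.common.by import By

ele = driver.find_element(By.XPATH, "//script[@type = 'text/javascript']")
print(ele.get_attribute("innerHTML"))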

Output:

var myData_1 = {"name" : ..... };
var myData_2 = {......};
var myData_id = 4565843;
var myData_mapping = {.....};
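The innerHTML is still one block of JavaScript text, so the individual values have to be parsed out of it. Here is a minimal sketch, assuming each variable is assigned a JSON-compatible literal on a single line ending in a semicolon. The extract_js_vars helper and the sample values are hypothetical, since the real values are elided in the question; in practice script_text would be ele.get_attribute("innerHTML").

```python
import json
import re

# Assumed sample standing in for ele.get_attribute("innerHTML")
script_text = """
  var myData_1 = {"name": "example"};
  var myData_2 = {"a": [1, 2, 3]};
  var myData_id = 4565843;
  var myData_mapping = {"x": {"y": 1}};
"""

# Matches `var <name> = <value>;` where the semicolon closes the line
VAR_PATTERN = re.compile(r"var\s+(\w+)\s*=\s*(.*?);[ \t]*$", re.MULTILINE)


def extract_js_vars(text):
    """Return {name: value} for each single-line var assignment (hypothetical helper)."""
    result = {}
    for name, raw in VAR_PATTERN.findall(text):
        try:
            result[name] = json.loads(raw)
        except json.JSONDecodeError:
            # Not valid JSON (single quotes, unquoted keys, ...): keep the raw text
            result[name] = raw
    return result


data = extract_js_vars(script_text)
print(data["myData_id"])
print(data["myData_1"])
```

This breaks down for values that span several lines or are not valid JSON. If the variables are globals on the loaded page, asking the browser directly with driver.execute_script("return myData_id;") may be simpler.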

Upvotes: 2
