Reputation: 610
This is my first attempt at scraping with Selenium.
I collected the data I want, but I would like to load it into a pandas DataFrame to do some calculations.
The sample code below shows how I get the data
(it is financial data; td[2] and td[3] correspond to the years 2016 and 2017, respectively):
nf1 = driver.find_element_by_xpath('//*[@id="tbodyMTablo"]/tr[84]/td[2]').text
nf2 = driver.find_element_by_xpath('//*[@id="tbodyMTablo"]/tr[84]/td[3]').text
do_v1 = driver.find_element_by_xpath('//*[@id="tbodyMTablo"]/tr[2]/td[2]').text
do_v2 = driver.find_element_by_xpath('//*[@id="tbodyMTablo"]/tr[2]/td[3]').text
kvb_1 = driver.find_element_by_xpath('//*[@id="tbodyMTablo"]/tr[29]/td[2]').text
kvb_2 = driver.find_element_by_xpath('//*[@id="tbodyMTablo"]/tr[29]/td[3]').text
The data is numeric but stored as str (probably because of .text), and int(nf2) or float(nf2) didn't work.
Is there any way to store the values as numbers in the first place? (Without .text, it returns 0.)
What is the proper way to scrape numerical data and store it in a DataFrame?
Thanks in advance.
Upvotes: 1
Views: 1264
Reputation: 425
Try using .get_attribute('innerHTML') instead of .text.
Edit:
It seems that you are trying to convert a Selenium object with int(), but int() needs a string that contains only digits.
So you can try converting it like this.
This example scrapes a number from a field on a random Wikipedia page; try to adapt it to your code.
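from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Chrome()
driver.get('https://it.wikipedia.org/wiki/Internet#Nascita_del_World_Wide_Web_.281991.29')
# Locate the table-of-contents number span whose text contains "1"
scraped = driver.find_element(By.XPATH, '//span[@class="tocnumber" and contains(text(), "1")]')
# innerHTML is a string; int() converts it as long as it holds only digits
print(int(scraped.get_attribute('innerHTML')))
driver.quit()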
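If int() and float() still fail on your strings, the usual cause is formatting such as thousands separators. Here is a minimal sketch of cleaning the text and loading it into a DataFrame. Note that parse_number is a hypothetical helper, the sample strings are made-up stand-ins for your scraped values, and the '.' thousands / ',' decimal convention is only my assumption about how your site formats numbers; adjust the two arguments if it differs.

```python
import pandas as pd


def parse_number(text, thousands=".", decimal=","):
    """Convert a scraped cell string such as '1.234.567,89' to a float.

    Hypothetical helper: the separator defaults are an assumption.
    """
    cleaned = text.strip().replace(thousands, "").replace(decimal, ".")
    # Accounting-style negatives are sometimes written in parentheses.
    if cleaned.startswith("(") and cleaned.endswith(")"):
        cleaned = "-" + cleaned[1:-1]
    return float(cleaned)


# Made-up strings standing in for the .text values of td[2] and td[3]
scraped = {
    "nf": ["1.250.000", "1.480.500"],
    "do_v": ["320.000", "295.750"],
    "kvb": ["(45.200)", "12.300,50"],
}

# Build the frame with one column per item, then transpose so that
# each row is an item and each column is a year.
df = pd.DataFrame(
    {name: [parse_number(v) for v in values] for name, values in scraped.items()},
    index=[2016, 2017],
).T

# The values are floats now, so arithmetic works column-wise.
df["change"] = df[2017] - df[2016]
print(df)
```

In your script you would fill the dictionary with the strings returned by the find_element calls instead of the literals above.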
Upvotes: 1