Reputation: 89
I am trying to scrape the content of the article on this link: https://onlinelibrary.wiley.com/doi/full/10.1111/jvim.15224
I have used Selenium to load the page (with both PhantomJS and Firefox), but I can't seem to get the article tag.
This line was to wait for the page to load:
element = WebDriverWait(driver, 20).until(EC.presence_of_element_located((By.CLASS_NAME, "article-section__sub-title section1")))
Alternatively, I also tried to wait for the article tag to load.
However, the driver continues after a couple of seconds, and whenever I check the HTML I get after waiting, it contains only the 'head' and 'body' tags, with no content inside them.
Any idea what I did wrong with getting the page to load and scrape the article tag?
Upvotes: 0
Views: 129
Reputation: 193208
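The locator in the question passes two space-separated class names to By.CLASS_NAME, which accepts only a single class name, so it does not match the heading; a CSS selector can combine both classes. To scrape the article tags, wait with visibility_of_all_elements_located() instead of presence_of_element_located(). You can use the following solution: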
Code Block:
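from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

# driver is assumed to be an already-created WebDriver instance, e.g. webdriver.Firefox()
driver.get("https://onlinelibrary.wiley.com/doi/full/10.1111/jvim.15224")
tags = WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, "h3.article-section__sub-title.section1")))
for tag in tags:
    print(tag.text)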
Console Output:
Background
Objective
Animals
Methods
Results
Conclusions and Clinical Importance
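If you have other multi-class attributes to target, here is a minimal sketch of how a space-separated class attribute maps to a CSS selector. The helper name class_attr_to_css is hypothetical, not part of Selenium; it only builds the selector string.

```python
def class_attr_to_css(class_attr: str, tag: str = "") -> str:
    """Turn a space-separated class attribute into a CSS selector string.

    Hypothetical helper for illustration; it is not a Selenium API.
    """
    classes = class_attr.split()
    if not classes:
        raise ValueError("class_attr must contain at least one class name")
    # Each class becomes ".name"; chaining them requires all classes on one element.
    return tag + "".join("." + name for name in classes)


selector = class_attr_to_css("article-section__sub-title section1", tag="h3")
print(selector)  # h3.article-section__sub-title.section1
```

The resulting string can then be passed as (By.CSS_SELECTOR, selector) to the wait shown above.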
Upvotes: 1