Cannot get headlines content while scraping

Question

I am new to scraping and have tried every method I could find, but I am not getting the desired results. I want to scrape all the headlines from https://www.accesswire.com/newsroom/. The headlines show up when I inspect the page in the browser, but when I scrape it with bs4 or selenium I get neither the full page source nor the headlines.

I have tried time.sleep(10), but that did not help. I also used selenium to load the page, and that did not work either. The headlines reside in the div with the class "column-15 w-col w-col-9".

import time

import requests
from bs4 import BeautifulSoup
from fake_useragent import UserAgent

ua = UserAgent()
header = {'user-agent': ua.chrome}
url = "https://www.accesswire.com/newsroom/"
response = requests.get(url, headers=header)
time.sleep(12)
soup = BeautifulSoup(response.content, 'html.parser')
time.sleep(12)
headline_Div = soup.find("div", {"class": "column-15 w-col w-col-9"})
print(headline_Div)

I just want to get all the headlines and their links from this page, or at least the full page source so that I can work with it myself.

Dalvenjia · Accepted Answer

If fetching the page and parsing it does not work, it is because the content is dynamic: it is generated by JavaScript after the initial page load. You will need selenium to drive an actual browser, which generates the content for you.

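from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Firefox()
driver.get('https://www.accesswire.com/newsroom/')
# Selenium 4 removed find_elements_by_css_selector; use find_elements with By instead
headline_links = driver.find_elements(By.CSS_SELECTOR, 'a.headlinelink')
# textContent returns the anchor text even when the element is not visible
headlines = [link.get_attribute('textContent') for link in headline_links]
driver.quit()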
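The question also asks for the headline links, and notes that a fixed time.sleep did not help. A minimal sketch of one way to cover both, assuming Selenium 4 and that the headlines are still rendered as a.headlinelink anchors: wait explicitly until the anchors exist, then collect text and href together. The names collect_headlines and fetch_headlines and the 15-second timeout are illustrative choices, not part of the original answer.

```python
from urllib.parse import urljoin


def collect_headlines(link_elements, base_url):
    """Turn anchor elements into (text, absolute URL) pairs, skipping incomplete ones."""
    results = []
    for link in link_elements:
        text = (link.get_attribute('textContent') or '').strip()
        href = link.get_attribute('href')
        if text and href:
            # urljoin leaves absolute URLs unchanged and resolves relative ones
            results.append((text, urljoin(base_url, href)))
    return results


def fetch_headlines(url, selector='a.headlinelink', timeout=15):
    """Load url in Firefox, wait for the anchors to appear, and return the pairs."""
    # Imported here so collect_headlines can be used without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    driver = webdriver.Firefox()
    try:
        driver.get(url)
        # Wait until at least one matching anchor is in the DOM;
        # raises TimeoutException if none appears within `timeout` seconds.
        WebDriverWait(driver, timeout).until(
            EC.presence_of_all_elements_located((By.CSS_SELECTOR, selector))
        )
        links = driver.find_elements(By.CSS_SELECTOR, selector)
        return collect_headlines(links, url)
    finally:
        # Always close the browser, even if the wait times out.
        driver.quit()
```

Calling fetch_headlines('https://www.accesswire.com/newsroom/') should then return a list of (headline, link) tuples, provided the selector still matches the page. Unlike time.sleep, the explicit wait returns as soon as the anchors are present instead of always pausing for a fixed duration.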
