How to skip over child element with Scrapy

Question

I'm looking to scrape just the job description from this page: https://www.aha.io/company/careers/current-openings/customer_success_specialist_project_management_us

I'd like to get all of the text and HTML inside the div with the class "container py2 content job", EXCEPT the button. The button is an <a> tag with the class "btn btn-large btn-secondary".

I have two XPath selectors that I thought would work, but neither does. The first doesn't exclude the button, and the second strips out all of the other HTML, which I'd like to keep.

response.xpath('//div[@class="container py2 content job"][not(parent::a/@class="btn btn-large btn-secondary")]').extract()

response.xpath('//div[@class="container py2 content job"]/descendant::text()[not(parent::a/@class="btn btn-large btn-secondary")]').extract()

Neither is scraping all of the HTML in the div minus what's inside the a tag. I'm hoping there's something simple that I'm missing, but I can't find what I'm looking for in the documentation.
