rajesh bojja

Reputation: 21

How to get the text out of p tags in Python using Scrapy?

<div class="date_info">
   <p>
      <span> Start Date :</span> October 8, 2017 <br/>
      <span> End Date  :</span> October 11, 2017  <br/>
      <span>  Time  : </span> 1:00 pm   to 12:15 pm 
   </p>
   <p> 
      <span> Phone :</span> 507 266 6703  <br/> 
      <span> Email :</span> [email protected] 
   </p> 
</div> 

How do I get the text value "October 8, 2017" from the HTML above? I tried this code:

response.css('div.date_info p:nth-child(1) span:nth-child(1)::text').extract()

But this returns the label "Start Date" instead of the date.

Can anyone help?

Upvotes: 0

Views: 3036

Answers (2)

Umair Ayub

Reputation: 21201

Do this; note the * in *::text, which selects the text of the element and all its descendants:

for p in response.css("div.date_info > p"):
    for span in p.css("span"):
        # Join every text node inside the span; this gives you Start Date, End Date, etc.
        label = " ".join(span.css("*::text").extract())

Upvotes: 0

Tomáš Linhart

Reputation: 10210

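If you don't insist on using CSS, you can get it with XPath. The date is a text node of the p element itself, not of the span, so select the text nodes of the first p and take the second one (the first is just the whitespace before the span):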

date = response.xpath('//div[@class="date_info"]/p[1]/text()').extract()[1].strip()

EDIT: Alternatively, the same using CSS:

date = response.css('div.date_info p:nth-child(1)::text').extract()[1].strip()

Upvotes: 1
