With Scrapy, how can I get part of an XPath-parsed result?

Question

Here is part of my spider:

def parse(self, response):
    titles = HtmlXPathSelector(response).select('//li')
    for title in titles:
        item = EksidefeItem()
        item['favori'] = title.select("//*[@id='entry-list']/li/@data-favorite-count").extract()
        item['entry'] = ['



I am getting the date and time from item['tarih'], but the value is not just the date and time; it also contains other values. Here is an example of the parsed data:


  26.01.2017 20:04 ~ 20:07


I want to keep only the date part (the first 10 characters from the left), like this:


  26.01.2017


How can I do that?

Thanks

DBedrenko · Accepted Answer

You could use string slicing to get just the date:

item['tarih'] = title.select("//*[@id='entry-list']/li/footer/div[2]/a[1]/text()").extract()
item['tarih'][0] = item['tarih'][0][:10]

But I would also add some validation (take a look at datetime.datetime.strptime()) to make sure you got a valid date.
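As a rough sketch of that validation, something like the following could work. The helper name extract_date and the format string "%d.%m.%Y" are my own assumptions, based only on the sample value in the question; adjust them if your data looks different.

```python
from datetime import datetime


def extract_date(raw, fmt="%d.%m.%Y"):
    """Return the leading date of a string such as '26.01.2017 20:04 ~ 20:07'.

    Hypothetical helper (not part of Scrapy): returns None when the first
    10 characters do not form a valid date in the assumed format.
    """
    # Drop surrounding whitespace, then keep the first 10 characters.
    candidate = raw.strip()[:10]
    try:
        # strptime raises ValueError for malformed or impossible dates.
        datetime.strptime(candidate, fmt)
    except ValueError:
        return None
    return candidate


print(extract_date("  26.01.2017 20:04 ~ 20:07"))  # 26.01.2017
print(extract_date("not a date"))                  # None
```

In the spider you could then call extract_date(item['tarih'][0]) after checking that the extracted list is not empty, and decide what to do when it returns None.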
