Reputation: 1
I'm trying to run spyder to extract real estate advertisements informaiton.
My code:
import scrapy
from ..items import RealestateItem
class AddSpider (scrapy.Spider):
name = 'Add'
start_urls = ['https://www.exampleurl.com/2-bedroom-apartment-downtown-4154251/']
def parse(self, response):
items = RealestateItem()
whole_page = response.css('body')
for item in whole_page:
Title = response.css(".obj-header-text::text").extract()
items['Title'] = Title
yield items
After running in console:
scrapy crawl Add -o Data.csv
In .csv file I get
['\n 2-bedroom-apartment ']
Tried adding strip method to function:
Title = response.css(".obj-header-text::text").extract().strip()
But scrapy returns:
Title = response.css(".obj-header-text::text").extract().strip()
AttributeError: 'list' object has no attribute 'strip'
Is there are some easy way to make scrapy return into .csv file just:
2-bedroom-apartment
Upvotes: 0
Views: 215
Reputation: 2564
AttributeError: 'list' object has no attribute 'strip'
You get this error because .extract()
returns a list, and .strip()
is a method of string.
If that selector always returns ONE item, you could replace it with .get()
[or extract_first()
] instead of .extract()
, this will return a string of the first item, instead of a list. Read more here.
If you need it to return a list, you can loop through the list, calling strip in each item like:
title = response.css(".obj-header-text::text").extract()
title = [item.strip() for item in title]
You can also use an XPath selector, instead of a CSS selector, that way you can use normalize-space to strip whitespace.
title = response.xpath('normalize-space(.//*[@class="obj-header-text"]/text())').extract()
This XPath may need some adjustment, as you didn't post the source I couldn't check it
Upvotes: 1