Sam T

Reputation: 1053

Replace before save to CSV

I'm using Scrapy's CSV export, but sometimes the content I'm scraping contains quotes and commas, which I don't want.

How can I replace those characters with an empty string '' before the data is written to CSV?

Here's my CSV, with the unwanted characters in the strTitle column:

strTitle,strLink,strPrice,strPicture
"TOYWATCH 'Metallic Stones' Bracelet Watch, 35mm",http://shop.nordstrom.com/s/toywatch-metallic-stones-bracelet-watch-35mm/3662824?origin=category,0,http://g.nordstromimage.com/imagegallery/store/product/Medium/11/_8412991.jpg

Here's my code, which errors on the replace line:

def parse(self, response):
    hxs = Selector(response)
    titles = hxs.xpath("//div[@class='fashion-item']")
    items = []
    for titles in titles[:1]:
        item = watch2Item()
        item ["strTitle"] = titles.xpath(".//a[@class='title']/text()").extract()
        item ["strTitle"] = item ["strTitle"].replace("'", '').replace(",",'') 
        item ["strLink"] = urlparse.urljoin(response.url, titles.xpath("div[2]/a[1]/@href").extract()[0])
        item ["strPrice"] = "0"
        item ["strPicture"] = titles.xpath(".//img/@data-original").extract()
        items.append(item)
    return items

Upvotes: 1

Views: 598

Answers (2)

Sam T

Reputation: 1053

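In the end the solution was to take the first element of the list returned by extract() and call replace on that string, since a list has no replace method: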

item["strTitle"] = [titles.xpath(".//a[@class='title']/text()").extract()[0].replace("'", '').replace(",",'')]
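One caveat with indexing `[0]` is that it raises an IndexError when the XPath matches nothing. A minimal sketch of a guard follows, assuming the input is the list of strings that extract() returns; the helper name `first_cleaned` and its `unwanted` parameter are hypothetical and not part of Scrapy.

```python
def first_cleaned(extracted, unwanted=("'", '"', ",")):
    """Return the first extracted string with unwanted characters removed.

    `extracted` stands in for the list returned by .extract();
    an empty list yields an empty string instead of an IndexError.
    """
    if not extracted:
        return ""
    text = extracted[0]
    for ch in unwanted:
        text = text.replace(ch, "")
    return text


# Stand-in for titles.xpath(".//a[@class='title']/text()").extract()
sample = ["TOYWATCH 'Metallic Stones' Bracelet Watch, 35mm"]
cleaned = first_cleaned(sample)
print(cleaned)
```

In the spider this could be used as item["strTitle"] = [first_cleaned(titles.xpath(...).extract())], keeping the list wrapper from the line above.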

Upvotes: 1

chishaku

Reputation: 4643

EDIT

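extract() returns a list, not a string, so replace fails on it. Try adding this line before the replace to join the list into a single string: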

item["strTitle"] = ''.join(item["strTitle"])

Once the value is a string, the replace works as expected:

strTitle = "TOYWATCH 'Metallic Stones' Bracelet Watch, 35mm"
strTitle = strTitle.replace("'", '').replace(",", '')

# strTitle is now "TOYWATCH Metallic Stones Bracelet Watch 35mm"
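The join-then-strip idea can also be written with a translation table, which removes every unwanted character in one pass. This is only a sketch under assumptions: it targets Python 3 strings, and the names `UNWANTED` and `join_and_clean` are made up for illustration.

```python
# Map each unwanted character's code point to None so translate() deletes it.
UNWANTED = {ord(ch): None for ch in "',\""}


def join_and_clean(parts):
    """Join a list of text fragments and drop quotes and commas."""
    return "".join(parts).translate(UNWANTED)


# Stand-in for a multi-fragment result from .extract()
fragments = ["TOYWATCH 'Metallic Stones' ", "Bracelet Watch, 35mm"]
result = join_and_clean(fragments)
print(result)
```

Unlike taking element [0], joining keeps all text nodes when the title is split across several fragments.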

Upvotes: 1
