Reputation: 123
I wrote the following spider to scrape content from the web. It runs and generates a CSV file, but when I open the file in Excel the data does not appear to be comma-separated. How can I fix this so that the output is properly comma-separated?
import scrapy


class JPItem(scrapy.Item):
    question_title = scrapy.Field()
    question_content = scrapy.Field()
    question_link = scrapy.Field()
    best_answer = scrapy.Field()
    best_answer_link = scrapy.Field()


class JPSpider(scrapy.Spider):
    name = "jp"
    allowed_domains = ['detail.chiebukuro.yahoo.co.jp']
    start_urls = [
        'https://detail.chiebukuro.yahoo.co.jp/qa/question_detail/q' + str(x)
        for x in range(10000000000, 100000000000)
    ]

    def parse(self, response):
        item = JPItem()
        item['question_title'] = response.css("div.mdPstd.mdPstdQstn.sttsRslvd.clrfx div.ttl h1::text").extract_first()
        item['question_content'] = ''.join([i for i in response.css("div.mdPstdQstn div.ptsQes p::text").extract()])
        item['question_link'] = ''.join(response.css("div.mdPstdQstn p:not([class]) a::text").extract())
        item['best_answer'] = ''.join([i for i in response.css("div.mdPstdBA div.ptsQes p.queTxt::text").extract()])
        item['best_answer_link'] = ''.join(response.css("div.mdPstdBA p:not([class]) a::text").extract())
        yield item
Upvotes: 1
Views: 933
Reputation: 101
An item property populated from .extract() comes back as a list, which is why such values look comma-separated in your file. However, the last four item properties you're dealing with won't be lists, because you're calling ''.join() on them (and extract_first() already returns a single string). If you want each list element to populate its own cell in the CSV file when you open it in Excel, you'll need to iterate through your lists and yield each one separately.
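As a rough sketch of "yield each one separately", here is the idea in plain Python, without Scrapy, so it can be run on its own. The helper names rows_per_paragraph and to_csv and the sample values are made up for illustration; in the spider, the list would come from response.css(...).extract() and each dict would be a JPItem.

```python
import csv
import io


def rows_per_paragraph(title, paragraphs):
    """Yield one dict per paragraph instead of one dict holding the whole list."""
    for paragraph in paragraphs:
        text = paragraph.strip()
        if text:  # skip whitespace-only fragments
            yield {"question_title": title, "question_content": text}


def to_csv(rows, fieldnames):
    """Serialize dict rows to CSV text with the standard library writer."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical values standing in for what response.css(...).extract() returns.
sample_title = "Example question"
sample_paragraphs = ["first paragraph", "  ", "second, with a comma"]

csv_text = to_csv(
    rows_per_paragraph(sample_title, sample_paragraphs),
    fieldnames=["question_title", "question_content"],
)
print(csv_text)
```

Each non-empty paragraph ends up in its own row, and a value that itself contains a comma is wrapped in quotes by the csv writer, so it still occupies a single cell. Inside parse(), the equivalent would presumably be a for loop over the extracted list that builds and yields one item per element.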
Upvotes: 2