Tony Wang

Reputation: 1021

How to return every loaded item from a loop in a Scrapy callback

The code is below. Every time it runs, it returns only the result of the first loop iteration; the remaining 9 iterations disappear. What should I do to get the items from all of them?

I tried adding "m = []" and m.append(l), but got the error "ERROR: Spider must return Request, BaseItem, dict or None, got 'ItemLoader'".

The link is http://ajax.lianjia.com/ajax/housesell/area/district?ids=23008619&limit_offset=0&limit_count=100&sort=&&city_id=110000

def parse(self, response):
    jsonresponse = json.loads(response.body_as_unicode())
    for i in range(0,len(jsonresponse['data']['list'])):
        l = ItemLoader(item = ItjuziItem(),response=response)
        house_code = jsonresponse['data']['list'][i]['house_code']
        price_total = jsonresponse['data']['list'][i]['price_total']
        ctime = jsonresponse['data']['list'][i]['ctime']
        title = jsonresponse['data']['list'][i]['title']
        frame_hall_num = jsonresponse['data']['list'][i]['frame_hall_num']
        tags = jsonresponse['data']['list'][i]['tags']
        house_area = jsonresponse['data']['list'][i]['house_area']
        community_id = jsonresponse['data']['list'][i]['community_id']
        community_name = jsonresponse['data']['list'][i]['community_name']
        is_two_five = jsonresponse['data']['list'][i]['is_two_five']
        frame_bedroom_num = jsonresponse['data']['list'][i]['frame_bedroom_num']
        l.add_value('house_code',house_code)
        l.add_value('price_total',price_total)
        l.add_value('ctime',ctime)
        l.add_value('title',title)
        l.add_value('frame_hall_num',frame_hall_num)
        l.add_value('tags',tags)
        l.add_value('house_area',house_area)
        l.add_value('community_id',community_id)
        l.add_value('community_name',community_name)
        l.add_value('is_two_five',is_two_five)
        l.add_value('frame_bedroom_num',frame_bedroom_num)
        print l
        return l.load_item()

Upvotes: 0

Views: 572

Answers (1)

Granitosaurus

Reputation: 21436

The error:

ERROR: Spider must return Request, BaseItem, dict or None, got 'ItemLoader'

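is slightly misleading, since a callback can also return a generator (or any other iterable of items). That particular message came from your list attempt: m.append(l) stores the ItemLoader itself rather than the item produced by l.load_item(). In the code you posted, the problem is that return exits the whole function, loop included, during the first iteration. You can turn the function into a generator to avoid this.

Simply replace return with yield in your last line. Change: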

return l.load_item()

to:

yield l.load_item()
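
To see the difference outside Scrapy, here is a minimal sketch in plain Python. The function names and the sample data are made up for illustration; only the data -> list -> house_code layout is taken from the question, and plain dicts stand in for the loaded items.

```python
import json


def parse_with_return(body):
    """Mimics the original callback: return exits on the first iteration."""
    data = json.loads(body)
    for entry in data["data"]["list"]:
        return {"house_code": entry["house_code"]}


def parse_with_yield(body):
    """Generator version: the loop resumes after each yielded item."""
    data = json.loads(body)
    for entry in data["data"]["list"]:
        yield {"house_code": entry["house_code"]}


# Hypothetical sample payload shaped like the JSON in the question.
sample_body = json.dumps(
    {"data": {"list": [{"house_code": "A1"},
                       {"house_code": "B2"},
                       {"house_code": "C3"}]}}
)

first_only = parse_with_return(sample_body)      # a single dict
all_items = list(parse_with_yield(sample_body))  # one dict per entry
```

Scrapy iterates over whatever the callback returns, so the generator version hands it every item in turn. Returning a list of l.load_item() results (not the loaders themselves) should work as well.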

Upvotes: 3
