Reputation: 311
I am trying to scrape the details of every listing on this page: https://www.mrlodge.de/wohnungen/
I have done this many times with a for loop, but this time the loop only returns the first element. The problem must be in the loop: when I use getall() instead of get(), I get all the details I need, but they are not ordered.
Please help.
import scrapy


class MrlodgeSpiderSpider(scrapy.Spider):
    name = 'mrlodge_spider'
    payload = '''
{mrl_ft%5Bfd%5D%5Bdate_from%5D=&mrl_ft%5Bfd%5D%5Brent_from%5D=1000&mrl_ft%5Bfd%5D%5Brent_to%5D=8500&mrl_ft%5Bfd%5D%5Bpersons%5D=1&mrl_ft%5Bfd%5D%5Bkids%5D=0&mrl_ft%5Bfd%5D%5Brooms_from%5D=1&mrl_ft%5Bfd%5D%5Brooms_to%5D=9&mrl_ft%5Bfd%5D%5Barea_from%5D=20&mrl_ft%5Bfd%5D%5Barea_to%5D=480&mrl_ft%5Bfd%5D%5Bsterm%5D=&mrl_ft%5Bfd%5D%5Bradius%5D=50&mrl_ft%5Bfd%5D%5Bmvv%5D=&mrl_ft%5Bfd%5D%5Bobjecttype_cb%5D%5B%5D=w&mrl_ft%5Bfd%5D%5Bobjecttype_cb%5D%5B%5D=h&mrl_ft%5Bpage%5D=1}
'''

    def start_requests(self):
        yield scrapy.Request(url='https://www.mrlodge.de/wohnungen/', method='POST',
                             body=self.payload, headers={"content-type": "application/json"})

    def parse(self, response):
        for apartment in response.xpath("//div[@class='mrl-ft-results mrlobject-list']"):
            yield {
                'info': apartment.xpath(".//div[@class='obj-smallinfo']/text()").get()
            }
Upvotes: 0
Views: 766
Reputation: 382
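You need to change the first XPath query. Your current query appears to select the single container div that wraps all the results, so the loop runs only once and get() returns just the first match inside it. Select the individual listing rows instead: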
import scrapy


class MrlodgeSpiderSpider(scrapy.Spider):
    name = 'mrlodge_spider'
    # Search filters copied from the question; kept on one line so that no
    # stray newlines end up in the middle of the request body.
    payload = '''
{mrl_ft%5Bfd%5D%5Bdate_from%5D=&mrl_ft%5Bfd%5D%5Brent_from%5D=1000&mrl_ft%5Bfd%5D%5Brent_to%5D=8500&mrl_ft%5Bfd%5D%5Bpersons%5D=1&mrl_ft%5Bfd%5D%5Bkids%5D=0&mrl_ft%5Bfd%5D%5Brooms_from%5D=1&mrl_ft%5Bfd%5D%5Brooms_to%5D=9&mrl_ft%5Bfd%5D%5Barea_from%5D=20&mrl_ft%5Bfd%5D%5Barea_to%5D=480&mrl_ft%5Bfd%5D%5Bsterm%5D=&mrl_ft%5Bfd%5D%5Bradius%5D=50&mrl_ft%5Bfd%5D%5Bmvv%5D=&mrl_ft%5Bfd%5D%5Bobjecttype_cb%5D%5B%5D=w&mrl_ft%5Bfd%5D%5Bobjecttype_cb%5D%5B%5D=h&mrl_ft%5Bpage%5D=1}
'''

    def start_requests(self):
        yield scrapy.Request(
            url='https://www.mrlodge.de/wohnungen/',
            method='POST',
            body=self.payload,
            headers={"content-type": "application/json"},
        )

    def parse(self, response):
        # Iterate over each listing row, not the single container around them.
        for apartment in response.xpath('//div[@class="mrlobject-list__item mrlobject-row"]'):
            yield {
                'info': apartment.xpath(".//div[@class='obj-smallinfo']/text()").get()
            }
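To see why the outer query matters, here is a minimal sketch that runs without Scrapy. It uses the standard library's xml.etree.ElementTree as a stand-in for Scrapy selectors (find plays the role of .get(), findall the role of looping over response.xpath(...)). The markup and the listing texts are made up for illustration and only mimic the structure I assume the page has: one container div wrapping several row divs.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: one container <div> wrapping two listing rows.
# The texts are invented; only the class names come from the question.
HTML = """
<html><body>
  <div class="mrl-ft-results mrlobject-list">
    <div class="mrlobject-list__item mrlobject-row">
      <div class="obj-smallinfo">2 rooms, 60 m2</div>
    </div>
    <div class="mrlobject-list__item mrlobject-row">
      <div class="obj-smallinfo">3 rooms, 85 m2</div>
    </div>
  </div>
</body></html>
"""

root = ET.fromstring(HTML.strip())


def first_info_per_match(xpath):
    """Mimic the spider's loop: one item per matched node, first info only."""
    items = []
    for node in root.findall(xpath):
        # find() returns only the first match, like .get() in Scrapy.
        info = node.find(".//div[@class='obj-smallinfo']")
        items.append(info.text if info is not None else None)
    return items


# Looping over the container: a single match, so a single item.
container_items = first_info_per_match(
    ".//div[@class='mrl-ft-results mrlobject-list']"
)
# Looping over the rows: one match, and so one item, per listing.
row_items = first_info_per_match(
    ".//div[@class='mrlobject-list__item mrlobject-row']"
)

print(container_items)
print(row_items)
```

The container query yields a single item however many listings there are, while the row query yields one item per listing, in document order.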
Upvotes: 2
Reputation: 22440
Try using
//div[contains(@class,'mrlobject-row')]
instead of
//div[@class='mrl-ft-results mrlobject-list']
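to get the desired results. The second expression selects the single container around all listings, whereas contains() matches every div whose class attribute includes mrlobject-row, so the loop runs once per listing.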
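One caveat: contains(@class, 'mrlobject-row') is a plain substring test, so it would also match a class such as mrlobject-row-extra if the page had one. The usual XPath 1.0 idiom for matching a whole class name pads the attribute with spaces first. Below is a small sketch of a helper that builds such an expression; class_token_xpath is a name I made up, not part of Scrapy.

```python
def class_token_xpath(tag, token):
    """Build an XPath matching `tag` elements whose class list contains
    `token` as a whole class name rather than as a substring."""
    return (
        f"//{tag}[contains(concat(' ', normalize-space(@class), ' '), "
        f"' {token} ')]"
    )


# Hypothetical helper applied to the row class from this thread.
ROW_XPATH = class_token_xpath("div", "mrlobject-row")
print(ROW_XPATH)

# In the spider this would be used as:
#     for apartment in response.xpath(ROW_XPATH): ...
```

If you prefer CSS selectors, response.css("div.mrlobject-row") does the same whole-class match with less typing.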
Upvotes: 1