BlackBat

Reputation: 61

Python/Scrapy - response.replace() method doesn't work?

When I try to change response.url with response.replace() before yielding a Request, the URL stays the same. The syntax seems to be correct, though.

print(response.url)
response.replace(url='https://techcrunch.com/search/heartbleed#stq=heartbleed&stp=2')
print(response.url)  

next = self.driver.find_element(By.XPATH,"//a[@class='page-link next']")  
nextpage = next.get_attribute("href")  
yield scrapy.Request(url=nextpage, dont_filter=False)

Notes:
1. I'm assigning the URL twice (obviously not needed if replace() worked).
2. nextpage is exactly the same URL as in the second line of the code.

Output:

https://techcrunch.com/search/heartbleed
https://techcrunch.com/search/heartbleed
2017-06-15 15:09:55 [selenium.webdriver.remote.remote_connection] DEBUG: POST http://127.0.0.1:56740/wd/hub/session/e3ba0740-51cb-11e7-acb6-f1825cec3f42/element {"using": "xpath", "sessionId": "e3ba0740-51cb-11e7-acb6-f1825cec3f42", "value": "//a[@class='page-link next']"}
2017-06-15 15:09:55 [selenium.webdriver.remote.remote_connection] DEBUG: Finished Request
2017-06-15 15:09:55 [selenium.webdriver.remote.remote_connection] DEBUG: GET http://127.0.0.1:56740/wd/hub/session/e3ba0740-51cb-11e7-acb6-f1825cec3f42/element/:wdc:1497532195411/attribute/href {"sessionId": "e3ba0740-51cb-11e7-acb6-f1825cec3f42", "name": "href", "id": ":wdc:1497532195411"}
2017-06-15 15:09:55 [selenium.webdriver.remote.remote_connection] DEBUG: Finished Request  

I have the feeling that this is why I can't get to other links: the response always stays on the same page instead of following the new links.

Upvotes: 1

Views: 3124

Answers (1)

Pablo

Reputation: 217

I guess the replace method does not modify the response in place but returns the result:

replace([url, status, headers, body, request, flags, cls])
Returns a Response object with the same members, except for those members given new values by whichever keyword arguments are specified.

So I would try something like:

new_response = response.replace(whatever=value)
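To make the difference concrete, here is a minimal sketch that runs without Scrapy installed. FakeResponse is a hypothetical stand-in, not the real scrapy.http.Response; it only mimics the documented "return a new object" behaviour of replace(), and supports just two fields (url and body) for brevity. The URLs are the ones from the question.

```python
class FakeResponse:
    """Hypothetical stand-in for scrapy.http.Response (NOT the real class)."""

    def __init__(self, url, body=b""):
        self.url = url
        self.body = body

    def replace(self, **kwargs):
        # Build a new object; fields not passed in kwargs are copied from self.
        for name in ("url", "body"):
            kwargs.setdefault(name, getattr(self, name))
        return FakeResponse(**kwargs)


old_url = "https://techcrunch.com/search/heartbleed"
new_url = "https://techcrunch.com/search/heartbleed#stq=heartbleed&stp=2"

response = FakeResponse(url=old_url)

# Discarding the return value leaves the original object untouched,
# which matches the two identical lines printed in the question.
response.replace(url=new_url)
print(response.url)

# Binding the return value gives a separate object carrying the new URL.
new_response = response.replace(url=new_url)
print(new_response.url)
```

With a real Scrapy response the same pattern should apply: keep the object returned by response.replace(url=...) and read the URL from that one, since the original response is left unchanged.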

Upvotes: 3
