Reputation: 361
I wrote a Scrapy program to scrape data from a site. The program scrapes successfully when I export to a JSON or CSV file, but the issue arises when I try to write into my Postgres database. The picture below shows the error I receive. How can I fix it?
def process_item(self, item, spider):
    """Save deals in the database.

    This method is called for every item pipeline component.
    """
    self.cur.execute(
        "insert into Deals (Name,Deal_Url,Image_Url,Old_Price,Special_Price,Final_Price) "
        "values (%s,%s,%s,%s,%s,%s)",
        (item['Name'], item['Product_URL'], item['Image_URL'],
         item['Old_Price'], item['Special_Price'], item['Final_Price']),
    )
    self.connection.commit()
    return item
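For context, this method lives in an item pipeline that holds a psycopg2 connection and cursor. A minimal sketch of that surrounding class is below; the connection parameters and class name are illustrative, not taken from my actual project:

import psycopg2

class DealsPipeline:
    """Sketch of the pipeline: open the Postgres connection when the spider
    starts and close it when the spider finishes."""

    def open_spider(self, spider):
        # Placeholder credentials; replace with your own database settings.
        self.connection = psycopg2.connect(
            host="localhost", dbname="deals_db", user="scrapy", password="secret"
        )
        self.cur = self.connection.cursor()

    def close_spider(self, spider):
        self.cur.close()
        self.connection.close()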
Upvotes: 18
Views: 23000
Reputation: 18136
When an operation fails inside a transaction, you need to roll back. Besides fixing the offending operation (it could be trying to access a key in item
that doesn't exist), you can wrap your insert statement in a try/except block:
def process_item(self, item, spider):
    """Save deals in the database.

    This method is called for every item pipeline component.
    """
    try:
        self.cur.execute(
            "insert into Deals (Name,Deal_Url,Image_Url,Old_Price,Special_Price,Final_Price) "
            "values (%s,%s,%s,%s,%s,%s)",
            (item['Name'], item['Product_URL'], item['Image_URL'],
             item['Old_Price'], item['Special_Price'], item['Final_Price']),
        )
        self.connection.commit()
    except Exception:
        # Roll back the aborted transaction so the connection stays usable
        # for the next item.
        self.connection.rollback()
    return item
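As a variation (not part of the original answer), you may prefer to catch the database error explicitly, log it, and drop the item so failures are visible instead of being silently swallowed. DropItem is Scrapy's standard way to discard an item from a pipeline; the sketch below assumes the same self.cur and self.connection attributes as above:

import psycopg2
from scrapy.exceptions import DropItem

def process_item(self, item, spider):
    try:
        self.cur.execute(
            "insert into Deals (Name,Deal_Url,Image_Url,Old_Price,Special_Price,Final_Price) "
            "values (%s,%s,%s,%s,%s,%s)",
            (item['Name'], item['Product_URL'], item['Image_URL'],
             item['Old_Price'], item['Special_Price'], item['Final_Price']),
        )
        self.connection.commit()
    except psycopg2.Error as e:
        # Roll back so later inserts don't hit "current transaction is aborted".
        self.connection.rollback()
        spider.logger.error("Failed to insert deal: %s", e)
        raise DropItem(f"Database error: {e}")
    return item

Catching psycopg2.Error rather than a bare except keeps the rollback limited to database failures and lets unrelated bugs (for example a missing item key) surface normally.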
Upvotes: 34