Reputation: 312
Creating a web-scraper in Python 2.6.4 + Scrapy toolkit. Need to do data analysis, but also my first Python learning project. Having trouble creating the SQL INSERT statement in my pipeline.py. The real query has approximately 30 attributes to insert..
First, is there a better way to write this UPDATE or INSERT algorithm? Open to improvements.
Second, here's two different syntax variations and the different errors they produce. I've tried lots of variations based on examples but can't find an example using "INSERT SET" breaking across multiple lines. What is the proper syntax?
DB is empty so we're always branching to 'INSERT' block for now.
def _conditional_insert(self, tx, item):
# create record if doesn't exist.
tx.execute("SELECT username FROM profiles_flat WHERE username = %s", (item['username'][0], ))
result = tx.fetchone()
if result:
# do row UPDATE
tx.execute( \
"""UPDATE profiles_flat SET
username=`%s`,
headline=`%s`,
age=`%s`
WHERE username=`%s`""", ( \
item['username'],
item['headline'],
item['age'],)
item['username'],)
)
else:
# do row INSERT
tx.execute( \
"""INSERT INTO profiles_flat SET
username=`%s`,
headline=`%s`,
age=`%s` """, ( \
item['username'],
item['headline'],
item['age'], ) # line 222
)
Error:
[Failure instance: Traceback: <class '_mysql_exceptions.OperationalError'>: (1054, "Unknown column ''missLovely92 '' in 'field list'")
/usr/lib/python2.6/threading.py:497:__bootstrap
/usr/lib/python2.6/threading.py:525:__bootstrap_inner
/usr/lib/python2.6/threading.py:477:run
--- <exception caught here> ---
/usr/lib/python2.6/vendor-packages/twisted/python/threadpool.py:210:_worker
/usr/lib/python2.6/vendor-packages/twisted/python/context.py:59:callWithContext
/usr/lib/python2.6/vendor-packages/twisted/python/context.py:37:callWithContext
/usr/lib/python2.6/vendor-packages/twisted/enterprise/adbapi.py:429:_runInteraction
/export/home/raven/scrapy/project/project/pipelines.py:222:_conditional_insert
/usr/lib/python2.6/vendor-packages/MySQLdb/cursors.py:166:execute
/usr/lib/python2.6/vendor-packages/MySQLdb/connections.py:35:defaulterrorhandler
]
Alternative Syntax:
query = """INSERT INTO profiles_flat SET
username=`%s`,
headline=`%s`,
age=`%s` """ % \
item['username'], # line 196
item['headline'],
item['age']
tx.execute(query)
Error:
[Failure instance: Traceback: <type 'exceptions.TypeError'>: not enough arguments for format string
/usr/lib/python2.6/threading.py:497:__bootstrap
/usr/lib/python2.6/threading.py:525:__bootstrap_inner
/usr/lib/python2.6/threading.py:477:run
--- <exception caught here> ---
/usr/lib/python2.6/vendor-packages/twisted/python/threadpool.py:210:_worker
/usr/lib/python2.6/vendor-packages/twisted/python/context.py:59:callWithContext
/usr/lib/python2.6/vendor-packages/twisted/python/context.py:37:callWithContext
/usr/lib/python2.6/vendor-packages/twisted/enterprise/adbapi.py:429:_runInteraction
/export/home/raven/scrapy/project/project/pipelines.py:196:_conditional_insert
]
Upvotes: 3
Views: 1940
Reputation: 838696
You shouldn't surround values with backticks. Backticks are used to quote column names.
INSERT INTO profiles_flat (username, headline, age)
VALUES (%s, %s, %s)
Upvotes: 2