Matilda
Matilda

Reputation: 1718

python Storm ORM and avoiding duplicates

I'm new to python and I just started using Storm with python as a basic ORM.

I have a lot of data in a file and there are some duplicates, and to identify them you can see the ids of some of the rows are duplicated.

I want to insert them into my db, I did set id to be primary key so it can't have duplicates. I want my code to ignore inserting data in the table if it's a duplicate. But instead it just fails with _mysql_exceptions.IntegrityError: (1062, "Duplicate entry '75083587476530022' for key 'PRIMARY'")

this is the definition for my class

from storm.locals import *

class Board(object):
  __storm_table__ = 'boards'
  id = Int(primary=True)
  description = Unicode()
  category = Unicode()

  def __init__(self, val): 
    self.id = val['id']
    self.description = val['description']
    self.category = val['category']

and to create a row I do:

database = create_database('mysql://root@/mydb') 
store = Store(database)
data = {u'description': u'', u'id': 165366686256470180, u'category': u'Children'}
store.add(Board(data))
store.commit()
store.flush()

Also I know in MYSQL which I can do

ON DUPLICATE KEY UPDATE o

any idea how to make Storm use this?

Upvotes: 0

Views: 386

Answers (2)

Seif
Seif

Reputation: 1097

If you want to let the DB handle that, I would suggest:

from MySQLdb import IntegrityError
# You connected to DB and have your store
# You loaded your data inside data_list
for data_row in data_list:
    try:
        store.add(Board(data_row))
        store.flush() # This is enough to raise any DB error
    except IntegrityError:
        continue # You can probably log something here
store.commit()

Upvotes: 0

RParadox
RParadox

Reputation: 6881

I would write a python script which removes duplications before SQL insertions. Otherwise you have to commit each row, catch the exception and ignore the row. Which is not nice. It's way better to have clean data and commit all of it in one go. SQL is really only concerned with table operations, not with data integrity.

Upvotes: 2

Related Questions