Reputation: 1855
I have a table where the primary key (id) isn't the key by which i distinguish between records. For this, I have a unique constraint for 3 columns. In order to be able to merge records, I've added a classmethod that retrieves the relevant record if exists, otherwise, it returns with a new record.
class Foo(Base):
__table_args__ = (sa.UniqueConstraint('bar', 'baz', 'qux'),)
id = sa.Column(Identifier, sa.Sequence('%s_id_seq' % __tablename__), nullable=False, primary_key=True)
bar = sa.Column(sa.BigInteger)
baz = sa.Column(sa.BigInteger)
qux = sa.Column(sa.BigInteger)
a1 = sa.Column(sa.BigInteger)
a2 = sa.Column(sa.BigInteger)
@classmethod
def get(cls, bar=None, baz=None, qux=None, **kwargs):
item = session.query(cls).\
filter(cls.bar== bar).\
filter(cls.baz == baz).\
filter(cls.qux == qux).\
first()
if item:
for k, v in kwargs.iteritems():
if getattr(item, k) != v:
setattr(item, k, v)
else:
item = cls(bar=bar, baz=baz, qux=qux, **kwargs)
return item
This works well most of the time, but every once in a while, I get an Integrity error when trying to merge an item:
foo = Foo.get(**item)
session.merge(foo)
As I understand, this happens since merge tries to insert a record where a record having the unique fields already exists.
Is there something wrong with the get
function? What am I missing here?
(BTW: I realize this might look awkward, but I need a unique sequential ID and to avoid problems with DB's not supporting sequences on non-primary-key columns, I made it this way)
Edit 1: Changed orm.db to session so the example would be clearer
Edit 2: I have this system running on several platforms and it seems that this only happens in mysql on top of Ubuntu (other platform is Oracle on top of RedHat). Also, in some weird way, it happens much more to specific end-users. Regarding mysql, I tried both mysql and mysql+mysqldb as connection strings but both produce this error. Regarding the end users, it makes no sense, and I don't know what to make of it ... Regarding mysql,
Upvotes: 2
Views: 2788
Reputation: 1855
I seem to have found the culprit!
Yesterday I had a situation where clients got the IntegrityError due to wrong timezones, so it got me to think at that direction. One of the fields I was using to identify the models is a Date column (I had no idea it is related, hence I didn't even mention it, sorry ...), however, since I call the actions from a Flex client using AMF, and since actionscript doesn't have a Date object without time, I'm transferring a Date object with zeroed time. I guess that in some situations, I got a different time in those dates which raised an IntegrityError.
I would have expected SA to strip the time from date values in case of Date columns as I think the DB would have, so Model.date_column == datetime_value
should cast the datetime to datetime.date before making the comparison.
As for my solution, I simply make sure that the value is cast to datetime.date() before I query the DB ... So far, yesterday and today was quiet with no complaints. I'll keep and eye and report should anything changes ...
Thank you all for your help. Cheers, Ofir
Upvotes: 0
Reputation: 133849
Indeed your method is prone to integrity errors. What happens is that when you invoke Foo.get(1,2,3) 2 times, you do not flush the session in between. The second time you invoke it, the ORM query fails again - because there is no actual row in the DB yet - and a new object is created, with different identity. Then on commit/flush these 2 clash causing the integrity error. This can be avoided by flushing the DB after each object creation.
Now, things work differently in SQLAlchemy if the primary key is known in merge/ORM get - if the matching primary key is found already loaded in the session, SQLAlchemy realizes that these 2 must be the same object. However no such checks are done on unique indexes. It might be possible to do that also. However it would make race conditions only rarer, as there might be 2 sessions creating the same (bar, baz, qux) triplet at the same time.
TL;DR:
else:
item = cls(bar=bar, baz=baz, qux=qux, **kwargs)
session.add(item)
session.flush()
Upvotes: 2