Drew
Drew

Reputation: 8963

Per-column timestamps in SQLAlchemy

I'm trying to perform a per-field merge between entities in my SQLAlchemy datasource and entities that are pulled in from somewhere else (foreign REST API). What I want to do is something like this:

class Person:
    __tablename__ = "people"
    id = Column(Integer,primary_key=True)
    name = Column(String)
    nameDatestamp = Column(DateTime)
    address = Column(String)
    addressDatestamp = Column(DateTime)

def merge(myPerson, foreignPerson):
   if myPerson.nameDateStamp < foreignPerson.nameDateStamp:
      myPerson.name = foreignPerson.name
      myPerson.nameDateStamp = foreignPerson.nameDateStamp
   if myPerson.addressDatestamp < foreignPerson.addressDateStamp:
      myPerson.addressDatestamp = foreignPerson.addressDateStamp
      myPerson.address = foreignPerson.address

For lots of classes and lots of fields. This seems too verbose to be best-practice.

Is there a best practice for what I am trying to do?

EDIT: I'm looking for

  1. A more concise way to declare that some fields have datestamps
  2. Some way to iterate over the the datestamped fields so that my merge function doesn't have to know which key(s) it is merging
  3. Some way to keep the datestamps up-to-date whenever the value is changed

Upvotes: 1

Views: 897

Answers (1)

Jeff Tratner
Jeff Tratner

Reputation: 17076

Generic Merging

You can easily iterate over (thing, datestamp) through the query object. For example, if you wanted to get address and datestamp you could do:

session.query(Person.address, Person.addressDatestamp).all()

which would return a set of (address, addressDatestamp) tuples. (they are actually namedtuples, but you can just use the index instead). You don't actually need to do this if you have a bunch of attributes you want to update. One way you could do this dynamically is to pass merge a list of attribute tuples and a query that is a tuple of (Person, foreignPerson) and do the following:

attrs = [("address", "addressDatestamp"), ("name", "nameDatestamp")]
person_tuples = # some way to generate (Person, ForeignPerson) tuples
def merge(attrs, person_tuples):
    for person, foreign in person_tuples:
        for attr, date in attrs:
            if getattr(person, date) < getattr(foreign, date):
                setattr(person, attr) = getattr(foreign, attr)
                setattr(person, date) = getattr(foreign, date)

    return person_tuples

This checks each attribute's datestamp, and then stores the attribute if the foreign is more recent (+ also stores the date).

if you attributes are always in the form <attr> and <attr>Datestamp then you could shorten this to:

attrs = ["name", "address"]
def merge(attrs, person_tuples):
    for person, foreign in person_tuples:
        for attr in attrs:
            date = attr + "Datestamp"
            if getattr(person, date) < getattr(foreign, date):
                setattr(person, attr) = getattr(foreign, attr)
                setattr(person, date) = setattr(foreign, date)

If the attribute may sometimes not be present, you can change the getattr call to getattr(object, attr, default) and it won't raise an error.

Dynamic classes

If you want to be able to dynamically generate models with datestamps, you could use a metaclass (a bit more complicated, particularly because it interferes with SQLA's declarative base, etc) or you could create a class factory, like the following:

def datestamped_factory(class_name, attrlist, timestamp="Datestamp", superclass_list=None):
    superclass_list = superclass or (object,)
    cols = dict((attr, Column(String)) for attr in attrlist)
    cols.update(dict((attr + timestamp, Column(DateTime)) for attr in attrlist)
    cols["timestamped_attrs"] = attrlist
    # create a merge specific to the class (so only need to pass person_tuples)
    cols["merge"] = classmethod(lambda cls, person_tuples: merge(cls.timestamped_attrs, person_tuples))
    return type(class_name, superclass_list, cols)

(can just add that to the factory with cols["class_merge"] = classmethod(lambda cls, person_tuples: merge(cls.timestamped_attrs, person_tuples)))

To create your person method, you could just do something like the following:

class Base(sqlalchemy.declarative_base()):
     id = Column(Integer, primary_key=True)

Person = datestamped__factory("Person", ["name", "address"], superclass_list = (Base,))
Person.__tablename__ = "person"

(replace sqlalchemy.declarative_base() with whatever Base class you are using...assuming you're using the ORM).

You could get fancier and write up a metaclass that finds all the columns that are strings and adds datestamps to them + creates appropriate merge and create methods that update timestamps appropriately, but that's probably more fancy than you need to get.

Upvotes: 3

Related Questions