Reputation: 906
I'm building a web application in Python 3 using Flask & SQLAlchemy (via Flask-SQLAlchemy; with either MySQL or SQLite), and I've run into a situation where I'd like to reference a single property on my model class that encapsulates multiple columns in my database. I'm pretty well versed in MySQL, but this is my first real foray into SQLAlchemy beyond the basics. Reading the docs, scouring SO, and searching Google have led me to two possible solutions: Hybrid attributes (docs) or Composite columns (docs).
My question is what are the implications of using each of these, and which of these is the appropriate solution to my situation? I've included example code below that's a snippet of what I'm doing.
Background: I'm developing an application to track & sort photographs, and have a DB table in which I store the metadata for these photos, including when the picture was taken. Since photos are taken in a specific place, the taken date & time have an associated timezone. As SQL has a notoriously love/hate relationship with timezones, I've opted to record when the photo was taken in two columns: a datetime storing the date & time and a string storing the timezone name. (I'd like to sidestep the inevitable debate about how to store timezone aware dates & times in SQL, please.) What I would like is a single parameter on the model class that can I can use to get a proper python datetime object, and that I can also set like any other column.
Here's my table:
class Photo(db.Model):
__tablename__ = 'photos'
id = db.Column(db.Integer, primary_key=True)
...
taken_dt = db.Column(db.datetime, nullable=False)
taken_tz = db.Column(db.String(64), nullable=False)
...
Here's what I have using a hybrid parameter (added to the above class, datetime/pytz code is psuedocode):
@hybrid_parameter
def taken(self):
return datetime.datetime(self.taken_dt, self.taken_tz)
@taken.setter(self, dt):
self.taken_dt = dt
self.taken_tz = dt.tzinfo
From there I'm not exactly sure what else I need in the way of a @taken.expression
or @taken.comparator
, or why I'd choose one over the other.
Here's what I have using a composite column (again, added to the above class, datetime/pytz code is psuedocode):
taken = composite(DateTimeTimeZone._make, taken_dt, taken,tz)
class DateTimeTimeZone(object):
def __init__(self, dt, tz):
self.dt = dt
self.tz = tz
@classmethod
def from_db(cls, dt, tz):
return DateTimeTimeZone(dt, tz)
@classmethod
def from_dt(cls, dt):
return DateTimeTimeZone(dt, dt.tzinfo)
def __composite_values__(self):
return (self.dt, self.tz)
def value(self):
#This is here so I can get the actual datetime.datetime object
return datetime.datetime(self.dt, self.tz)
It would seem that this method has a decent amount of extra overhead, and I can't figure out a way to set it like I would any other column directly from a datetime.datetime object without instantiating the value object first using .from_dt
.
Any guidance on if I'm going down the wrong path here would be welcome. Thanks!
Upvotes: 3
Views: 3798
Reputation: 21532
TL;DR: Look into hooking up an AttributeEvent
to your column and have it check for datetime
instances which have a tz
attribute set and then return a DateTimeTimeZone
object. If you look at the SQLAlchemy docs for Attribute Events you can see that you can tell SQLAlchemy
to listen to an attribute-set event and call your code on that. In there you can do any modification to the value being set as you like. You can't however access other attributes of the class at that time. I haven't tried this in combination with composites yet, so I don't know if this will be called before or after the type-conversion of the composite. You'd have to try.
edit: Its all about what you want to achieve though. The AttributeEvent
can help you with your data consistency, while the hybrid_property
and friends will make querying easier for you. You should use each one for it's intended use-case.
hybrid_attribute
and composite
are two completely different beasts. To understand hybrid_attribute
one first has to understand what a column_property
is and can do.
This one is placed on a mapper and can contain any selectable
. So if you put an concrete sub-select into a column_property
you can access it read-only as if it were a concrete column. The calculation is done on the fly. You can even use it to search for entries. SQLAlchemy
will construct the right select containing your sub-select for you.
class User(Base):
id = Column(Integer, primary_key=True)
first_name = Column(Unicode)
last_name = Column(Unicode)
name = column_property(first_name + ' ' + last_name)
category = column_property(select([CategoryName.name])
.select_from(Category.__table__
.join(CategoryName.__table__))
.where(Category.user_id == id))
db.query(User).filter(User.name == 'John Doe').all()
db.query(User).filter(User.category == 'Paid').all()
As you can see, this can simplify a lot of code, but one has to be careful to think of the performance implications.
A hybrid_attribute
is just like a column_property
but can call a different code-path when you are in an instance context. So you can have the selectable
on the class level but a different implementation on the instance level. With a hybrid_method
you can even parametrize both sides.
This is what enables you to combine multiple concrete columns to a logical single one. You have to write a class for this logical column so that SQLAlchemy
can extract the correct values from there and use it in the selects. This integrates neatly in the query framework and should not impose any additional problems. In my experience the use-cases for composite columns are rather rare. Your use-case seems fine. For modification of values you can always use AttributeEvents
. If you want to have the whole instance available you'd have to have a MapperEvent
called before flush. This certainly works, as I used this to implement a completely transparent Audit Trail tracking system which stored every value changed in every table in a separate set of tables.
Upvotes: 4