Reputation: 3665
I have an app in the works for use internally as a project/task tracker in the company that I'm working for. Playing around with MongoDB atm. I have the following pseudo-schema in mind:
task
_id
name
project
initial_notes
versions
number
versions
version_1
worker
status
date(if submitted)
review_notes(if rejected)
reply_on(if accepted/rejected)
(version_n)(if any)
The problem that I'm having is with versioning the task. I've read a number of possible ways but I'm falling short of understanding them all the way through. I read something that I liked here and really like the way mongoid does it's versioning
Thinking of it better I'd rather have it something like this
task
_id
versions
number_of_versions: 3
current_version
version_no: 3
worker: bob
status: accepted
old_versions
version
version_no: 2
worker: bob
I would like to show the most recent version only when displaying a collection of tasks and I would like to show all versions of a particular task when entering the detailed information page for that particular task. Would this structure work? If yes, what would be some queries needed to run in order to achieve what I need?
Thank you in advance for your time reading this and maybe answering it. status: rejected version version_no: 1 worker: smith status: rejected
Upvotes: 4
Views: 2597
Reputation: 157
I was dealing with the same issue, which is why I created HistoricalCollection:
https://pypi.org/project/historical-collection/
Works just like a normal collection, with the addition of a few additional methods:
patch_one()
patch_many()
find_revisions()
latest()
An example of the usage:
from historical_collection.historical import HistoricalCollection
from pymongo import MongoClient
class Users(HistoricalCollection):
PK_FIELDS = ['username', ] # <<= This is the only requirement
# ...
users = Users(database=db)
users.patch_one({"username": "darth_later", "email": "[email protected]"})
users.patch_one({"username": "darth_later", "email": "[email protected]", "laser_sword_color": "red"})
list(users.revisions({"username": "darth_later"}))
# [{'_id': ObjectId('5d98c3385d8edadaf0bb845b'),
# 'username': 'darth_later',
# 'email': '[email protected]',
# '_revision_metadata': None},
# {'_id': ObjectId('5d98c3385d8edadaf0bb845b'),
# 'username': 'darth_later',
# 'email': '[email protected]',
# '_revision_metadata': None,
# 'laser_sword_color': 'red'}]
Upvotes: 0
Reputation: 26258
You might also consider a model like this:
task
_id
...
old_versions [
{
retired_on
retired_by
...
}
]
This has the advantage that your current data is always at the top level (you don't need to track the current version explicitly, the document is its own current version), and you can easily track history by taking the current version, removing the old_versions
field, and $push
ing it to the old_versions
field in the db.
Since you seemed to want to minimize network IO, this also lets you easily avoid loading the old_versions
when you don't need them:
> db.tasks.find({...}, {old_versions: 0})
You could also get fancier and store a list of old versions only of fields that have changed. This requires more delicate code in your application layer, and may not be necessary if you don't expect many revisions or for these documents to be very large.
Upvotes: 3
Reputation: 22659
Yes, why not. That scheme would work. Also, have you considered something like this:
task
...
versions = [ # MongoDB array
{ version_id
worker
status
date(if submitted)
review_notes(if rejected)
reply_on(if accepted/rejected)
},
{ version_id : ... },
...
Possible version insertion query:
tasks.update( { # a query to locate a particular task},
{ '$push' : { 'versions', { # new version } } } )
Note, that retrieving the last version from the versions array in this case is done by the program, not by Mongo.
Upvotes: 4