dsignr

Reputation: 2354

Ruby on Rails and NoSQL, adding fields

I'm just diving into MongoDB and Mongoid with Rails and I find it awesome. One thing NoSQL helps with is that I can add extra fields to my model whenever I want, without any extra effort:

class Page
  include Mongoid::Document
  include Mongoid::MultiParameterAttributes
  field :title, :type => String
  field :body, :type => String
  field :excerpt, :type => String #Added later
  field :location, :type => String #Added later
  field :published_at, :type => Time

  validates :title, :presence => true
  validates :body, :presence => true
  validates :excerpt, :presence => true
end

And this works perfectly, as it should. But my question (sorry if this is trivial) is about the existing entries, which are blank and have no defined value for the newly added fields. For example, in a sample blog application, after I've published two posts, I decide to add an excerpt and a location field to my model (see the code above). Any blog post published after these fields were added can be required to have a value for the excerpt field. But the posts published before the addition have null values there (understandably), so I cannot validate them. Is there an elegant solution for this?

Thank you.

Upvotes: 4

Views: 2259

Answers (1)

mu is too short

Reputation: 434665

There are three basic options:

  1. Update everything inside MongoDB to include the excerpt.
  2. Use an after_initialize hook to add a default excerpt to existing objects when you pull them out of MongoDB.
  3. Kludge your validation logic to only check for the existence of excerpt on new objects.

(1) requires a (possibly large) time hit when you make the change, but it is a one-time thing and you don't have to worry about it after that. You'd pull every Page out of MongoDB, do page.excerpt = 'some default excerpt', and then save it back to MongoDB. If you have a lot of Pages you'll want to process them in chunks of, say, 100 at a time. If you do this, you'll be able to search on the excerpt without worrying about what you should do with nulls. You can also do this inside MongoDB by sending a JavaScript fragment into MongoDB:

connection.eval(%q{
    db.pages.find({}, { _id: true }).forEach(function(p) {
        db.pages.update(
            { _id: p._id },
            { $set: { excerpt: 'some default excerpt' } }
        );
    });
})
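The Ruby-side version of (1), pulling Pages out in chunks and saving them back, could look roughly like the sketch below. It is a plain-Ruby illustration only: FakePage, DEFAULT_EXCERPT, and backfill_excerpts are hypothetical names, and FakePage is a Struct stand-in whose save does nothing, so no MongoDB connection is needed to run it. With a real model you would iterate over your Page documents instead of an array.

```ruby
# Hypothetical stand-in for a Mongoid document; a real Page would be
# loaded from MongoDB and its save would write back to the database.
FakePage = Struct.new(:title, :excerpt) do
  def save
    true # pretend the write succeeded
  end
end

DEFAULT_EXCERPT = 'some default excerpt'

# Give every page that lacks an excerpt the default one, working through
# the collection batch_size pages at a time.
# Returns the number of pages that were changed.
def backfill_excerpts(pages, batch_size: 100)
  updated = 0
  pages.each_slice(batch_size) do |batch|
    batch.each do |page|
      # Leave pages that already have an excerpt untouched.
      next unless page.excerpt.nil? || page.excerpt.empty?

      page.excerpt = DEFAULT_EXCERPT
      updated += 1 if page.save
    end
  end
  updated
end

pages = [
  FakePage.new('Old post', nil),
  FakePage.new('Older post', ''),
  FakePage.new('New post', 'Already has one')
]
puts backfill_excerpts(pages, batch_size: 2)
```

Skipping pages that already have an excerpt means the pass can be re-run safely if it is interrupted partway through.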

(2) would go something like this:

after_initialize :add_default_excerpt, :unless => :new_record?
#...
private
def add_default_excerpt
  self.excerpt = 'some default excerpt' unless self.excerpt.present?
end

You could move the unless self.excerpt up to the :unless if you didn't mind using a lambda:

after_initialize :add_default_excerpt, :unless => ->(o) { o.new_record? || o.excerpt.present? }
#...
private
def add_default_excerpt
  self.excerpt = 'some default excerpt'
end

This should be pretty quick and easy to set up but there are downsides. First of all, you'd have a bunch of nulls in your MongoDB that you might have to treat specially during searches. Also, you'd be carrying around a bunch of code and logic to deal with old data but this baggage will be used less and less over time. Furthermore, the after_initialize calls do not come for free.

(3) requires you to skip validating the presence of the excerpt for non-new Pages (:if => :new_record? on the validation), or you'd have to find some way to differentiate new objects from old ones while also properly handling edits of both new and old Pages. You could also force people to supply an excerpt when they change a Page and leave your validation as-is; including a :default => '' on your field :excerpt would take care of any nil issues in views and such.
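The new-versus-old distinction in (3) can be sketched in plain Ruby. This is not Mongoid's validation API: PageSketch, missing_fields, and the persisted flag are hypothetical stand-ins that only mimic new_record? and valid? so the logic can be run on its own.

```ruby
# Hypothetical stand-in for a Page; it mimics new_record? and valid?
# but does not use Mongoid or ActiveModel.
class PageSketch
  attr_accessor :title, :body, :excerpt

  def initialize(title: nil, body: nil, excerpt: nil, persisted: false)
    @title = title
    @body = body
    @excerpt = excerpt
    @persisted = persisted
  end

  # True until the page has been stored, like new_record? on a real model.
  def new_record?
    !@persisted
  end

  # Names of the fields that fail their presence check.
  def missing_fields
    missing = []
    missing << :title if blank?(title)
    missing << :body if blank?(body)
    # Only new pages are required to have an excerpt.
    missing << :excerpt if new_record? && blank?(excerpt)
    missing
  end

  def valid?
    missing_fields.empty?
  end

  private

  def blank?(value)
    value.nil? || value.to_s.strip.empty?
  end
end

old_page = PageSketch.new(title: 'Old', body: 'Text', persisted: true)
new_page = PageSketch.new(title: 'New', body: 'Text')
puts old_page.valid? # existing page without an excerpt still passes
puts new_page.valid? # new page without an excerpt is rejected
```

Note that an old Page can still be edited and saved without ever gaining an excerpt, which is the edit-handling problem mentioned in (3).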


I'd go with (1) if possible. If the update would take too long and you wanted the site up and running while you were fixing up MongoDB, you could add a :default => '' while updating and then remove the :default option, restart, and manually patch up any strays that got through.

Upvotes: 4
