zarathustra
zarathustra

Reputation: 2080

Best practice: Which information should I store in my database?

Currently I am developing a small book rating app, where users can rate and comment on books.

Of course I have a book model:

class Book < ActiveRecord::Base
  has_many :ratings
end

and a rating model:

class Rating < ActiveRecord::Base
  belongs_to :book
end

The "overall rating value" of a rating object is calculated by different rating categories (e.g. readability, ... ). Furthermore the overall rating of one book should be calculated by all given ratings.

Now the question I am asking myself: Should I calculate/query the overall rating for every book EVERYTIME someone visits my page or should I add a field to my book model where the overall rating is (periodically) calculated and saved?

EDIT: The "calculation" I would use in this case is a simple average determination.

Example: A Book has about 200 ratings. Every rating is a composition of 10 category ratings. So I want to determine the average of one rating and in the end of all 200 ratings.

Upvotes: 2

Views: 1478

Answers (5)

Richard Peck
Richard Peck

Reputation: 76774

DB

The most efficient (although not conventional) way is to use db-level ALIAS columns, allowing you to calculate the AVG or SUM of the rating with each book call:

#app/models/book.rb
class Book < ActiveRecord::Base
   def reviews_avg category
      cat = category ? "AND `category` = \"#{category}\"" : ""
      sql = "SELECT AVG(`rating`) FROM `reviews` WHERE `book_id` = #{self.id} #{cat})
      results = ActiveRecord::Base.connection.execute(sql)
      results.first.first.to_f
   end
end

This would allow:

@book = Book.find x
@book.reviews_avg               # -> 3.5
@book.reviews_avg "readability" # -> 5

This is the most efficient because it's handled entirely by the DB:

enter image description here


Rails

You should use the average functionality of Rails:

#app/models/book.rb
class Book < ActiveRecord::Base
   has_many :ratings do
      def average category
        if category
          where(category: category).average(:rating)
        else
          average(:rating)
        end
      end
   end
end

The above will give you the ability to call an instance of a @book, and evaluate the average or total for its ratings:

@book = Book.find x
@book.reviews.average               #-> 3.5
@book.reviews.average "readability" #-> 5

--

You could also use a class method / scope on Review:

#app/models.review.rb
class Review < ActiveRecord::Base
   scope :avg, (category) -> { where(category: category).average(:rating) }
end

This would allow you to call:

@book = Book.find x
@book.reviews.avg               #-> 3.5
@book.reviews.avg "readability" #-> 5

Association Extensions

A different way (not tested) would be to use the proxy_association.target object in an ActiveRecord Association Extension.

Whilst not as efficient as a DB-level query, it will give you the ability to perform the activity in memory:

#app/models/book.rb
class Book < ActiveRecord::Base
   has_many :reviews do
     def avg category
       associative_array = proxy_association.target
       associative_array = associative_array.select{|key, hash| hash["category"] == category } if category
       ratings = associative_array.map { |a| a["rating"] }
       ratings.inject(:+) / associative_array.size #-> 35/5 = 7
     end
   end
end

This would allow you to call:

@book = Book.find x
@book.reviews.avg                # -> 3.5
@book.reviews.avg "readability"  # -> 5

Upvotes: 2

Neil Atkinson
Neil Atkinson

Reputation: 774

If the averaging of those ratings is not computationally expensive (i.e. doesn't take a long time), then just calculate it on-the-fly. This is in keeping with the idea of not prematurely optimsing (see http://c2.com/cgi/wiki?PrematureOptimization).

However, if you do want to optimise this calculation then storing it on the book model and updating the calculation on rating writes is the way to go. This is known as "caching" the result. Here is some code that will cache the average rating in the database. (There are other ways of caching).

class Book < ActiveRecord::Base
  has_many :ratings, after_add :update_average_rating

  def update_average_rating
    update_attribute(:average_rating, average_rating)
  end

  def average_rating
    rating_sum / ratings.count
  end

  def rating_sum
    ratings.reduce(0) {|sum, rating|
      sum + rating.value # assuming rating model has a value attribute
    }
  end
end

class Rating < ActiveRecord::Base
  belongs_to :book
end

Note: the above code assumes the presence of an average_rating column on your book table in your database. Remember to add this column with a migration.

Upvotes: 3

max
max

Reputation: 101871

In most cases you can get averages simply by querying the database:

average = book.reviews.average(:rating)

And in most cases its not going to be expensive enough that querying per request is going to be a real problem - and pre-mature optimization might be a waste of time and resources as Neil Atkinson points out.

However when the cost of calculation becomes an issue there are several approaches to consider which depend on the nature of the calculated data.

If the calculated data is something with merits being a resource on its own you would save it in the database. For example reports that are produced on a regular bases (daily, monthly, annual) and which need to be query-able.

Otherwise if the calculated data has a high "churn rate" (many reviews are created daily) you would use caching to avoid the expensive query where possible but stuffing the data into your database may lead to an excessive amount of slow UPDATE queries and tie up your web or worker processes.

There are many caching approaches that compliment each other:

  • etags to leverage client side caching - don't re-render if the response has not changed anyways.
  • fragment caching avoids db queries and re-rendering view chunks for data that has not changed.
  • model caching in Memcached or Redis can be used to avoid slow queries.
  • low level caching can be used to store stuff like averages.

See Caching with Rails: An overview for more details.

Upvotes: 1

nickcen
nickcen

Reputation: 1692

Have you consider to use a cached version of the rating.

rating = Rails.cache.fetch("book_#{id}_rating", expires_in: 5.minutes) do
  do the actual rating calculation here
end

Upvotes: 1

cl_lime
cl_lime

Reputation: 11

There is no need at all to recalculate the average overall rating for every page visit since it only will change when somebody actually rates the book. So just use a field AVG_RATING or something like this and update the value on every given rating.

Upvotes: 1

Related Questions