jl118
jl118

Reputation: 329

Speed up database query using difference between 2 columns: created_at and updated_at

In my Rails project I have a Message model and I have hundreds of thousands of messages in my database. It also has a column "Status" that can be 'queued' or 'delivered'.

When a message is created, its status becomes "queued" and obviously the created_at field is populated. After some time(I won't go into details how), the status of that message will become "delivered".

Now, for hundreds of thousands of messages, I want to group them by their delivery times. In other words, calculate the difference between updated_at and created_at and group them into 0-3 minutes, 3-5 minutes, 5-10 minutes, and over 10 minutes.

The way I currently do it is

delivery_time_data = []
    time_intervals = [{lb: 0.0, ub: 180.0}, {lb: 180.0, ub: 300.0}, {lb: 300.0, ub: 600.0},{lb: 600.0, ub: 31*3600*24}]
    time_intervals.each_with_index do |ti, i|
      @messages = Message.where(account_id: @account.id)
                      .where(created_at: @start_date..@end_date)
                      .where(direction: 'outgoing')
                      .where(status: Message::STATUS_DELIVERED)
                      .where('status_updated_at - created_at >= ?', "#{ti[:lb]} seconds")
                      .where('status_updated_at - created_at < ?', "#{ti[:ub]} seconds")
      if i == time_intervals.count - 1
        delivery_time_data.push([i+1, "Greater than #{ti[:lb]/60.to_i} minutes", @messages.count])
      else
        delivery_time_data.push([i+1, "#{ti[:lb]/60.to_i} minutes to #{ti[:ub]/60.to_i} minutes", @messages.count])
      end

It works. But it's very slow, and when I have ~200000 messages the server potentially can crash.

If I expect messages to be created fairly frequently, is it even a good idea to add index on created_at?

Thanks.

Upvotes: 0

Views: 112

Answers (1)

Marlin Pierce
Marlin Pierce

Reputation: 10079

It may be that you need the right index.

The fields you need to index are:

  • direction
  • status
  • account_id
  • created_at

So add the following index in a migration:

add_index :messages, [:direction, :status, :account_id, :created_at]

Some databases, including postgresql, can index on expressions. For best results add (updated_at - created_at) as your fifth value to index. You will have to create this with SQL instead of the rails migration.

I wouldn't worry about the added time to create records on an indexed table. I just wouldn't worry about it.

Upvotes: 2

Related Questions