AnApprentice

Reputation: 110960

Ruby / Rails - How to aggregate query results in an array?

I have a large data set that I want to clean up for the user. The data set from the DB looks something like this:

ID | project_id | thread_id | action_type | description
 1 | 10         | 30        |  comment    | yada yada yada yada yada
 1 | 10         | 30        |  comment    | xxx
 1 | 10         | 30        |  comment    | yada 313133
 1 | 10         | 33        |  comment    | fdsdfsdfsdfsdfs
 1 | 10         | 33        |  comment    | yada yada yada yada yada
 1 | 10         |           | attachment  | fddgaasddsadasdsadsa
 1 | 10         |           | attachment  | xcvcvxcvxcvxxcvcvxxcv

Right now, when I output the above in my view, it's in the very same order as above. The problem is that it is very repetitive. For example, for project_id 10 and thread_id 30 you see:

10 - 30 - yada yada yada yada yada
10 - 30 - xxxxx
10 - 30 - yada yada yada yada yada

In Ruby, I'd like to learn how to create an array and aggregate descriptions under a project_id and thread_id, so instead the output is:

10 - 30
 - yada yada yada yada yada
 - xxxxx
 - yada yada yada yada yada

How can I get started? This requirement is new for me. Hopefully this can be done in Ruby and not SQL, as the activity feed is likely going to grow in event types and complexity.

Upvotes: 6

Views: 3309

Answers (2)

Harish Shetty

Reputation: 64363

I follow a simple guideline for using the group_by method of Enumerable: the data set being operated on should be small and guaranteed to stay small over time.

Eg:

Fixed data-set:  Zip codes, city names     
Dynamic but small data-set: User's hobbies    
Dynamic but paginated data-set: First page of latest orders.

In my opinion, your activity feed table can grow rapidly over time. Activity.all loads all the activities into memory, so you incur excessive memory and network costs by executing this call. It is never a good idea to call all without conditions and pagination. If you are currently paginating the result set, the current solution will not work when the result set spans multiple pages. You have to use an order clause to get the correct result set.

This is what I would do:

In your controller:

# ORDER BY ensures that the sorting happens in the DB
# pagination and conditions ensure that the data set stays small
# params[:page] is assumed to carry the requested page number
activities = Activity.paginate(:order => "project_id, thread_id", :page => params[:page])
@activity_groups = activities.group_by { |a| "#{a.project_id} - #{a.thread_id}" }

Now, you can use the @activity_groups in your view as suggested by fl00r.
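To see the order-then-paginate-then-group idea without a database, here is a minimal plain-Ruby sketch. The Activity struct, the sample rows, and the page size are all made up for illustration; sort_by and each_slice only stand in for the ORDER BY clause and the paginate call, they are not how will_paginate works internally.

```ruby
# Hypothetical stand-in for the Activity model (not the real ActiveRecord class).
Activity = Struct.new(:project_id, :thread_id, :description)

activities = [
  Activity.new(10, 33, "c"),
  Activity.new(10, 30, "a"),
  Activity.new(10, nil, "e"),
  Activity.new(10, 30, "b"),
  Activity.new(10, 33, "d")
]

# Stand-in for ORDER BY project_id, thread_id.
# nil.to_i is 0, so rows without a thread sort first in this sketch.
ordered = activities.sort_by { |a| [a.project_id, a.thread_id.to_i] }

# Stand-in for pagination: take page 1 with an assumed 3 rows per page.
per_page = 3
page = ordered.each_slice(per_page).first

# Group only the rows on the current page.
groups = page.group_by { |a| "#{a.project_id} - #{a.thread_id}" }

groups.each do |key, members|
  puts key
  members.each { |a| puts " - #{a.description}" }
end
```

Because the rows are sorted before they are sliced, rows of the same project and thread end up next to each other, so a page mostly contains whole groups. A group that straddles a page boundary would still be split across two pages.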

Upvotes: 0

fl00r

Reputation: 83680

Use group_by (http://apidock.com/rails/Enumerable/group_by) in Ruby, or do the grouping right in SQL. In Ruby:

sets = DataSet.all.group_by{ |data| [data.project_id, "-", data.thread_id].join(" ") }

Then you'll get a Hash like this:

{ "10 - 30" => [#DataSet1, #DataSet2 ...], "10 - 33" => [#DataSet7, #DataSet11 ...] }

Which you can iterate over in the view:

<% sets.each do |range, datas| %>
  <p><%= range %>:</p>
  <% datas.each do |data| %>
    <p><%= data.description %></p>
  <% end %>
<% end %>
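The same grouping can be tried outside Rails. This is only a sketch: Row is a hypothetical Struct standing in for the model, and the rows are invented sample data. It groups by an array key instead of a pre-formatted string, so the formatting decision is left to the output step.

```ruby
# Hypothetical stand-in for an ActiveRecord model; only the fields used here.
Row = Struct.new(:project_id, :thread_id, :description)

rows = [
  Row.new(10, 30, "yada yada"),
  Row.new(10, 30, "xxx"),
  Row.new(10, 33, "fdsdf"),
  Row.new(10, nil, "attachment one")
]

# group_by returns a Hash of key => array of rows, in first-seen key order.
grouped = rows.group_by { |row| [row.project_id, row.thread_id] }

# One heading per group, followed by that group's descriptions.
lines = grouped.flat_map do |(project_id, thread_id), members|
  # compact drops a nil thread_id, so attachments get just the project heading.
  heading = [project_id, thread_id].compact.join(" - ")
  [heading] + members.map { |row| " - #{row.description}" }
end

puts lines
```

Keeping the key as an array also makes it easy to look up a single group later, e.g. grouped[[10, 30]].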

Update, using each_with_index:

<% sets.each_with_index do |datas, index| %>
  <p><%= datas[0] %>:</p>
  <% datas[1].each do |data| %>
    <p><%= data.description %></p>
    <%# some stuff with *last* %>
    <%= "This is the last one" if data == datas[1].last %> 
  <% end %>
<% end %>
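Calling each_with_index on a Hash yields a [key, value] pair plus the index, which is why datas[0] and datas[1] work above. A small plain-Ruby sketch with made-up data shows the pair being destructured in the block parameters instead; it also detects the last item by position rather than with ==, which is an alternative to the comparison above in case two items compare equal.

```ruby
# Invented sample data in the same shape as the grouped Hash.
sets = {
  "10 - 30" => ["yada", "xxx"],
  "10 - 33" => ["fdsdf"]
}

output = []

# (range, descriptions) destructures the [key, value] pair; index counts groups.
sets.each_with_index do |(range, descriptions), index|
  output << "#{index}: #{range}"
  descriptions.each_with_index do |description, position|
    # Positional check: true only for the final element of this group.
    last = position == descriptions.size - 1
    output << " - #{description}#{' (last)' if last}"
  end
end

puts output
```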

Upvotes: 11
