Reputation: 2420
My question is somewhat similar to this - Active Record LIMIT within GROUP_BY
I want to limit the ActiveSupport::OrderedHash to a specific count(100) for each site.
The limit can be anything like 100 or 4
For simplicity I have considered it as 4
Session.website_only.during(date_range)
.count(group: [:site_id, :referrer_host],
order: 'count_all DESC',
limit: 4)
The SQL query generated looks like
SELECT COUNT(*) AS count_all, site_id AS site_id,
referrer_host AS referrer_host FROM "sessions"
WHERE "sessions"."created_at" >= '2013-12-09 00:00:00.000000'
AND "sessions"."created_at" <= '2013-12-16 23:59:59.999999' AND
(referrer_host IS NOT NULL)
AND (("sessions"."referrer_host" NOT ILIKE '%google.com%'
AND "sessions"."referrer_host" NOT ILIKE '%yahoo.com%'
AND "sessions"."referrer_host" NOT ILIKE '%bing.com%'
AND "sessions"."referrer_host" NOT ILIKE '%aol.com%'))
AND (("sessions"."referrer_host" NOT ILIKE '%twitter.com%'
AND "sessions"."referrer_host" NOT ILIKE '%facebook.com%'
AND "sessions"."referrer_host" NOT ILIKE '%linkedin.com%'
AND "sessions"."referrer_host" NOT ILIKE '%fb.me%'))
GROUP BY "sessions"."site_id", "sessions"."referrer_host"
ORDER BY count_all DESC LIMIT 4
Updated Question
What I get
A ActiveSupport::Ordered Hash that has counts from all sessions for a site grouped as site_id
and referrer_host
This is an example of actual result, it has the grouped hash but limited on entire set, what I want is the grouping should be limited to 100 only.
{[1, "https"]=>8769, [1, "www.example.com"]=>2359, [1, "www.xyz.com"]=>1935, [1, "www.bayers.com"]=>379,
[2, "www.ruby.com"]=>1322, [2, "www.employment.com"]=>472, [2, "https"]=>424,
[3, "www.rails.com"]=>424, [3, "www.arizona.net"]=>392, [3, "www.murphy.com"]=>390,
[4, "www.associates.com"]=>374, [4, "www.reddit.com"]=>365, [4, "www.razorshape.com"]=>352,
[5, "www.rediff.com"]=>337, [5, "www.tumbleweed.com"]=>327, [5, "www.arizona.com"]=>289,
[6, "https"]=>275, [131, "www.example.com"]=>253, [6, "www.murphy.com"]=>236, [6, "www.associates.com"]=>227}
What I want
Instead of each any number of sets within each group I want to restrict it to 4.
Upvotes: 0
Views: 1868
Reputation: 2864
I don't think there's a way to do it in the database without calculating the values for all of the rows and then filtering. So in that case, I'd rather filter it in ruby, which makes the code clearer for reading. Something like this:
data = {[1, "https"]=>8769, [1, "www.example.com"]=>2359, [1, "www.xyz.com"]=>1935, [1, "www.bayers.com"]=>379,
[2, "www.ruby.com"]=>1322, [2, "www.employment.com"]=>472, [2, "https"]=>424,
[3, "www.rails.com"]=>424, [3, "www.arizona.net"]=>392, [3, "www.murphy.com"]=>390,
[4, "www.associates.com"]=>374, [4, "www.reddit.com"]=>365, [4, "www.razorshape.com"]=>352,
[5, "www.rediff.com"]=>337, [5, "www.tumbleweed.com"]=>327, [5, "www.arizona.com"]=>289,
[6, "https"]=>275, [131, "www.example.com"]=>253, [6, "www.murphy.com"]=>236, [6, "www.associates.com"]=>227}
limit = 4 # or 100
#Create a hash that has arrays on the keys
counts = Hash.new(0)
result = Hash.new
data.each do |k, v|
site = k[0]
if counts[site] < limit
counts[site]+=1
result[k]=v
end
end
puts counts
puts result
The final format of the counts
structure is not exactly the same as that of the data
structure, but it can easily be converted back. The running code can be found in http://rubyfiddle.com/riddles/26fc3/2
Upvotes: 2