JHo

Reputation: 1088

Group and count activity by week in either Ruby, ActiveRecord, or PostgreSQL

I have an activity log that stretches across a few years. I have been asked to calculate weekly engagement for each user for the application. I define engagement as a user doing one or more logged activities in any given week.

How do I group those activities and count them by week for each user? I have read a lot of different posts, and there seems to be debate about whether Ruby methods, SQL, or Arel syntax is best. I have no more than 500 users, so performance matters less than having something succinct.

I have successfully tried this:

user = User.first.activity_logs.group_by { |m| m.created_at.beginning_of_week } 
       # => {Mon, 11 Mar 2013 00:00:00 EDT -04:00=>
             [#<ActivityLog id: 12345, user_id: 429, ... ]}

The only next step I can get to run without an error is:

user.map { |week| week.count }
       # => [2, 2, 2, 2, 2, 2, 2, 2]
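
A note on why every count comes back as 2: iterating a Hash with a single block argument yields each entry as a [key, value] pair, and a pair always has two elements. The sketch below reproduces that in plain Ruby and shows the destructured form that counts the logs instead. The Log struct, the week_start helper, and the sample dates are assumptions made up for illustration; week_start stands in for ActiveSupport's beginning_of_week so the snippet runs without Rails.

```ruby
require "date"

# Hypothetical stand-in for an ActivityLog row; only created_at matters here.
Log = Struct.new(:user_id, :created_at)

# Monday of the week containing `date` (plain-Ruby substitute for
# ActiveSupport's beginning_of_week, which is not loaded in this sketch).
def week_start(date)
  date - ((date.wday + 6) % 7)
end

# Made-up sample data: three logs in one week, one in the next.
logs = [
  Log.new(429, Date.new(2013, 3, 11)),
  Log.new(429, Date.new(2013, 3, 13)),
  Log.new(429, Date.new(2013, 3, 14)),
  Log.new(429, Date.new(2013, 3, 19)),
]

by_week = logs.group_by { |log| week_start(log.created_at) }

# A single block argument receives the whole [key, value] pair,
# so count is the size of the pair, not the number of logs.
pair_sizes = by_week.map { |week| week.count }

# Destructuring the pair counts the logs in each week.
counts = by_week.map { |week, entries| [week.to_s, entries.count] }.to_h
```

The same destructuring applies to the real group_by result on activity_logs, assuming created_at responds to beginning_of_week.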

So it seems like I am making this too complicated. How do I succinctly count the number of activities by week and do that for each user?

I just want something that I can ultimately paste into a spreadsheet (for example, below) to make a heat map or some other chart for managers.

| User          | Week            | Activity|
| ------------- | :-------------: | -------:|
| jho           | 2013-1          | 20      |
| bmo           | 2013-1          | 5       |
| jlo           | 2013-1          | 11      |
| gdo           | 2013-2          | 2       |
| gdo           | 2013-5          | 3       |
| jho           | 2013-6          | 5       |

EDIT

As reference for others:
Rails 3.1
Using PostgreSQL 9.1.4
Here is the table definition from the Rails schema file, followed by some sample rows:

create_table "activity_logs", :force => true do |t|
  t.integer  "user_id"
  t.string   "activity_type"
  t.datetime "created_at"
  t.datetime "updated_at"
end

| id     | user_id  | activity_type    | created_at        | updated_at        |
| ------ | -------- | ---------------- | ----------------- | ----------------- |
| 28257  | 8        | User Signin      | 2013-02-14 1...   | 2013-02-14 1...   |
| 25878  | 7        | Password Res...  | 2013-02-03 1...   | 2013-02-03 1...   |
| 25879  | 7        | User Signin      | 2013-02-03 1...   | 2013-02-03 1...   |
| 25877  | 8        | Password Res...  | 2013-02-03 1...   | 2013-02-03 1...   |
| 19325  | 8        | Created report   | 2012-12-16 0...   | 2012-12-16 0...   |
| 19324  | 9        | Added product    | 2012-12-16 0...   | 2012-12-16 0...   |
| 18702  | 8        | Added event      | 2012-12-15 1...   | 2012-12-15 1...   |
| 18701  | 1        | Birthday Email   | 2012-12-15 0...   | 2012-12-15 0...   |

SOLUTION

Modifying @Erwin Brandstetter's command, I got the desired result like so on the command line:

ActivityLog.find_by_sql("
  SELECT user_id, to_char(created_at, 'YYYY-WW') AS week, count(*) AS activity
  FROM   activity_logs
  GROUP  BY 1, 2
  ORDER  BY 1, 2;")
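
To get from those rows to something that pastes into a spreadsheet, one option is to emit tab-separated text. This is only a sketch: the Row struct is a hypothetical stand-in for the objects find_by_sql returns (assumed to respond to user_id, week, and activity), the to_tsv helper is not part of Rails, and the sample values are made up.

```ruby
# Hypothetical stand-in for the records returned by find_by_sql; the real
# objects are assumed to expose user_id, week and activity the same way.
Row = Struct.new(:user_id, :week, :activity)

# Made-up sample rows, for illustration only.
rows = [
  Row.new(7, "2013-05", 2),
  Row.new(8, "2012-50", 3),
]

# Build tab-separated text with a header line; spreadsheets generally
# split pasted tab-separated text into columns.
def to_tsv(rows)
  header = %w[user week activity].join("\t")
  body = rows.map { |r| [r.user_id, r.week, r.activity].join("\t") }
  ([header] + body).join("\n")
end

puts to_tsv(rows)
```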

Upvotes: 3

Views: 1178

Answers (2)

ideamotor

Reputation: 876

Here it is in PostgreSQL. The trick is that you need to generate your own year-weekofyear value, which I do by extracting the year and the week from the date and concatenating them.

The case expression ensures that '2012-01-01' is counted in week 1 rather than in the 52nd week, which is where the ISO standard would put it. I'm overriding the standard here, so you may need to alter this expression depending on how you define your weeks.

create temp table daily_log(person character varying, activity numeric, 
    dayof date);
insert into daily_log values 
     ('bob'    ,1,'2012-01-01')
    ,('bob'    ,1,'2012-01-02')
    ,('bob'    ,0,'2012-01-14')
    ,('charlie',1,'2012-01-01')
    ,('charlie',1,'2012-01-14');

select person
      ,extract('year' from dayof) || '-' ||
         case when extract('week' from dayof) >= 52
               and extract('month' from dayof) = 1
              then 1
              else extract('week' from dayof)
         end as weekof
      ,sum(activity) as activity_cnt
from daily_log
group by weekof, person
order by person, weekof;

That will get you:

| person        | weekof          | activity_cnt|
| -------------:| :--------------:| -----------:|
| bob           | 2012-1          | 2           |
| bob           | 2012-2          | 0           |
| charlie       | 2012-1          | 1           |
| charlie       | 2012-2          | 1           |

Why I used 2012, I don't know.

Here is what the PostgreSQL manual says about extracting the week (http://www.postgresql.org/docs/9.2/static/functions-datetime.html):

"The number of the week of the year that the day is in. By definition (ISO 8601), the first week of a year contains January 4 of that year. (The ISO-8601 week starts on Monday.) In other words, the first Thursday of a year is in week 1 of that year. (for timestamp values only)

Because of this, it is possible for early January dates to be part of the 52nd or 53rd week of the previous year. For example, 2005-01-01 is part of the 53rd week of year 2004, and 2006-01-01 is part of the 52nd week of year 2005."
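
The same ISO 8601 behaviour can be checked from Ruby, whose standard Date class exposes the ISO week number as cweek and the matching ISO year as cwyear. A minimal sketch, where iso_week_label is a hypothetical helper name and not something from the manual:

```ruby
require "date"

# Label a date with its ISO 8601 week-numbering year and week.
# Date#cwyear and Date#cweek are part of Ruby's standard library.
def iso_week_label(date)
  format("%d-%02d", date.cwyear, date.cweek)
end

# The two dates quoted from the manual, plus the test dates used above.
jan_2005 = iso_week_label(Date.new(2005, 1, 1))  # week 53 of 2004
jan_2006 = iso_week_label(Date.new(2006, 1, 1))  # week 52 of 2005
sunday   = iso_week_label(Date.new(2012, 1, 1))  # still in ISO year 2011
monday   = iso_week_label(Date.new(2012, 1, 2))  # first ISO week of 2012
```

Note that the year has to come from cwyear, not year; pairing the calendar year with the ISO week would label 2012-01-01 as week 52 of 2012.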

Upvotes: 1

Erwin Brandstetter

Reputation: 657882

I borrowed the test table from @ideamotor and simplified it. Type of activity is irrelevant, counting each activity as 1:

CREATE TEMP TABLE log(usr text, day date);
INSERT INTO log VALUES 
  ('bob' , '2012-01-01')
 ,('bob' , '2012-01-02')
 ,('bob' , '2012-01-14')
 ,('susi', '2012-01-01')
 ,('susi', '2012-01-14');

Query (won't get much more succinct than this):

SELECT usr, to_char(day, 'YYYY-WW') AS week, count(*) AS activity
FROM   log
GROUP  BY 1, 2
ORDER  BY 1, 2;

Result:

usr  | week     | activity
-----+----------+---------
bob  | 2012-01  | 2
bob  | 2012-02  | 1
susi | 2012-01  | 1
susi | 2012-02  | 1

to_char() makes this very simple. I quote the manual here:

WW week number of year (1-53) (The first week starts on the first day of the year.)

As an alternative, consider:

IW ISO week number of year (01 - 53; the first Thursday of the new year is in week 1.)
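
To see how the two patterns bucket the same dates, here is a rough Ruby sketch. ww_label is a hypothetical helper that imitates the WW rule quoted above (days 1 to 7 of the calendar year form week 1, whatever the weekday); iw_label uses strftime's %G and %V, which are the ISO year and ISO week. If IW is used in to_char(), it probably wants the ISO year pattern IYYY next to it rather than YYYY, for the reason the last line illustrates.

```ruby
require "date"

# Imitates 'YYYY-WW': week 1 is the first seven days of the calendar year.
def ww_label(date)
  format("%d-%02d", date.year, (date.yday - 1) / 7 + 1)
end

# ISO 8601 label: %G is the week-numbering year, %V the ISO week number.
def iw_label(date)
  date.strftime("%G-%V")
end

new_year = Date.new(2012, 1, 1)   # a Sunday
mid_jan  = Date.new(2012, 1, 14)  # a Saturday

ww_new_year = ww_label(new_year)  # counted in the first week of 2012
ww_mid_jan  = ww_label(mid_jan)   # day 14 falls in the second week
iw_new_year = iw_label(new_year)  # ISO puts this day in the previous year
iw_mid_jan  = iw_label(mid_jan)   # both rules agree here
```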

Upvotes: 2
